electrovir / pdf-text-reader

Dead simple pdf text reader
https://electrovir.github.io/pdf-text-reader/
Creative Commons Zero v1.0 Universal

fix spaceCount calculation #5

Closed (Remi-p closed 2 years ago)

Remi-p commented 2 years ago

Hi! Thank you for this repository (:.

My colleagues and I seem to have run into a bug (cc: @LuisCardosoOliveira).

The error was:

    RangeError: Invalid array length
      at parsePageItems (node_modules/pdf-text-reader/dist/index.js:93:25)
      at parsePage (node_modules/pdf-text-reader/dist/index.js:31:12)
      at async readPdfText (node_modules/pdf-text-reader/dist/index.js:17:20)

After investigating a little, it seems like lastItem.height can be equal to 0.

This leads to spaceCount being equal to Infinity, from this line: https://github.com/electrovir/pdf-text-reader/blob/06c35be5c7532777c9ce619334d34152d0dda424/src/index.ts#L156

Because Array(spaceCount) is called afterwards, the library throws the RangeError shown above: https://github.com/electrovir/pdf-text-reader/blob/06c35be5c7532777c9ce619334d34152d0dda424/src/index.ts#L159
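
For anyone who wants to reproduce the failure mode outside the library, here is a minimal sketch. It assumes spaceCount comes from dividing some gap by lastItem.height; the function names and the exact formula below are hypothetical stand-ins, not the library's code.

```typescript
// Hypothetical stand-in for the library's calculation (assumption: a gap
// divided by the previous item's height). The real formula may differ.
function estimateSpaceCount(gap: number, lastItemHeight: number): number {
    // Dividing a positive number by 0 yields Infinity in JavaScript.
    return Math.floor(gap / lastItemHeight);
}

// Attempts to build padding the way the report describes, via Array(spaceCount),
// and returns either the padding or a description of the error that was thrown.
function tryBuildPadding(spaceCount: number): string {
    try {
        return Array(spaceCount).fill(' ').join('');
    } catch (error) {
        return error instanceof RangeError
            ? `RangeError: ${error.message}`
            : String(error);
    }
}

const brokenCount = estimateSpaceCount(12, 0);
console.log(brokenCount); // Infinity
console.log(tryBuildPadding(brokenCount)); // a RangeError description
console.log(JSON.stringify(tryBuildPadding(estimateSpaceCount(12, 4)))); // three spaces
```

Array() only accepts a non-negative integer length below 2^32, so both Infinity and NaN (which 0 / 0 would produce) are rejected with a RangeError.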


Adding a condition on lastItem.height !== 0 corrects the behavior.
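
A rough sketch of what such a guard could look like is below. The helper names, the fallback of zero spaces, and the extra finite check are our assumptions for illustration; the actual change in the PR may be shaped differently.

```typescript
// Hypothetical guarded version of the calculation. Only the
// `lastItemHeight !== 0` condition comes from this PR's description;
// everything else here is an assumption.
function guardedSpaceCount(gap: number, lastItemHeight: number): number {
    if (lastItemHeight === 0) {
        // Assumed fallback: insert no spaces when the height is unusable.
        return 0;
    }
    const count = Math.floor(gap / lastItemHeight);
    // Extra defensive check (not part of the PR): reject NaN, Infinity,
    // and negative values before they reach an array or string length.
    return Number.isFinite(count) && count > 0 ? count : 0;
}

// Builds the padding string from a count that is known to be safe.
function buildGuardedPadding(gap: number, lastItemHeight: number): string {
    return ' '.repeat(guardedSpaceCount(gap, lastItemHeight));
}

console.log(JSON.stringify(buildGuardedPadding(12, 0))); // ""
console.log(JSON.stringify(buildGuardedPadding(12, 4))); // "   "
```

With the guard in place, a zero-height item simply produces no padding instead of throwing.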


We ran npm run test and it didn't report any errors. Please let us know if this looks OK to you!

electrovir commented 2 years ago

Nice work! Thank you!

electrovir commented 2 years ago

This has been included in version 3.0.2: https://www.npmjs.com/package/pdf-text-reader/v/3.0.2