Hopding / pdf-lib

Create and modify PDF documents in any JavaScript environment
https://pdf-lib.js.org
MIT License

Error loading pdf, "invalid object" #1400

Open Hypnobrew opened 1 year ago

Hypnobrew commented 1 year ago

What were you trying to do?

Hi

Thanks a lot for a great library for working with PDFs. It usually works fine, but for some files the library reports invalid objects while loading the content.

Validating the pdf on a site like this says the file is ok.

How did you attempt to do it?

Nothing fancy, just loading the PDF file like this:

const data = fs.readFileSync('../Assa Abloy - Q3 2022 - Conference Call Deck.pdf', 'utf8');
const buffer = Buffer.from(data);
PDFDocument.load(buffer, { ignoreEncryption: true });
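One thing worth ruling out with the snippet above: passing 'utf8' to fs.readFileSync decodes the file into a string, and bytes that are not valid UTF-8 are replaced during decoding, so Buffer.from(data) may no longer match the file on disk. Whether that explains the warnings for this particular PDF is not confirmed anywhere in this thread. The following is a minimal sketch using only Node built-ins; readPdfBytes and utf8RoundTripIsLossy are hypothetical helper names, not part of pdf-lib.

```javascript
const fs = require('fs');

// Hypothetical helper: read the file as raw bytes by omitting the encoding
// argument, so readFileSync returns a Buffer instead of a decoded string.
function readPdfBytes(filePath) {
  return fs.readFileSync(filePath);
}

// Hypothetical check: does decoding these bytes as UTF-8 and encoding them
// again change them? If so, reading the file with 'utf8' would alter it.
function utf8RoundTripIsLossy(bytes) {
  const original = Buffer.from(bytes);
  const roundTripped = Buffer.from(original.toString('utf8'), 'utf8');
  return !original.equals(roundTripped);
}

// Sample bytes: "%PDF-\n%" followed by four high-bit bytes, the kind of
// binary marker that often follows a PDF header. They are not valid UTF-8.
const sample = Buffer.from([
  0x25, 0x50, 0x44, 0x46, 0x2d, 0x0a, 0x25, 0xe2, 0xe3, 0xcf, 0xd3,
]);
console.log(utf8RoundTripIsLossy(sample)); // true
```

With raw bytes, the load call from the report would become PDFDocument.load(readPdfBytes(path), { ignoreEncryption: true }). If the warnings persist after that change, the encoding is not the cause.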

What actually happened?

Getting errors like these:

116
%%EOF

Trying to parse invalid object: {"line":4,"column":16,"offset":124})
Invalid object ref: 5302 0 R
Trying to parse invalid object: {"line":53,"column":16,"offset":3134})
Invalid object ref: 5283 0 R
Trying to parse invalid object: {"line":9432,"column":16,"offset":716118})
Invalid object ref: 4 0 R
....
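To see what actually sits at the reported positions, the warnings can be turned into a list of offsets and object references and the surrounding bytes printed. This is only a diagnostic sketch: parseInvalidObjectWarnings and peekAt are hypothetical helpers, and it assumes that "offset" is a byte offset into the buffer that was passed to PDFDocument.load.

```javascript
// Hypothetical helper: pair each "Trying to parse invalid object" warning
// with the "Invalid object ref" line that follows it.
function parseInvalidObjectWarnings(logText) {
  const pattern =
    /Trying to parse invalid object: (\{[^}]*\})\)\s*Invalid object ref: (\d+) (\d+) R/g;
  const results = [];
  for (const match of logText.matchAll(pattern)) {
    const position = JSON.parse(match[1]); // { line, column, offset }
    results.push({
      ...position,
      objectNumber: Number(match[2]),
      generationNumber: Number(match[3]),
    });
  }
  return results;
}

// Hypothetical helper: show `length` bytes starting at `offset` as text.
// latin1 maps every byte to exactly one character, so nothing is dropped.
function peekAt(bytes, offset, length = 80) {
  return Buffer.from(bytes).subarray(offset, offset + length).toString('latin1');
}

// The first three warning pairs from the report above.
const log = [
  'Trying to parse invalid object: {"line":4,"column":16,"offset":124})',
  'Invalid object ref: 5302 0 R',
  'Trying to parse invalid object: {"line":53,"column":16,"offset":3134})',
  'Invalid object ref: 5283 0 R',
  'Trying to parse invalid object: {"line":9432,"column":16,"offset":716118})',
  'Invalid object ref: 4 0 R',
].join('\n');

for (const warning of parseInvalidObjectWarnings(log)) {
  console.log(warning.objectNumber, warning.offset);
}
```

Calling peekAt(buffer, warning.offset) on the same buffer given to PDFDocument.load then shows the raw text of each object the parser complained about.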

What did you expect to happen?

Parsing without any errors.

How can we reproduce the issue?

Run a simple example with the file I provided.

Version

1.17.1

What environment are you running pdf-lib in?

Node

Additional Notes

Here is an example file with these problems: Assa Abloy - Q3 2022 - Conference Call Deck.pdf

Hypnobrew commented 1 year ago

If there is something in the pdfs that cannot be handled by pdf-lib, maybe it would be wise to be able to set an option to ignore these objects?

jlindema1 commented 1 year ago

I'm having a very similar problem when just trying to load a PDF into pdf-lib. Here are the console.warn messages and links to the PDFs. The PDFs are fillable forms and work fine in other tools such as pdfescape.com.

(unloadable) example PDF: https://toastmasterscdn.azureedge.net/medias/files/department-documents/speech-contests-documents/speech-contest-certificate-sets/en/510d-speech-contest-certificate_ff.pdf

content size,type: 641288 application/pdf

Invalid object ref: 48 0 R
Trying to parse invalid object: {"line":51,"column":17,"offset":631781})
Invalid object ref: 2 0 R
Trying to parse invalid object: {"line":58,"column":17,"offset":633010})
Invalid object ref: 3 0 R
Trying to parse invalid object: {"line":67,"column":17,"offset":633693})
Invalid object ref: 4 0 R
Trying to parse invalid object: {"line":73,"column":17,"offset":640544})
Invalid object ref: 6 0 R
Trying to parse invalid object: {"line":76,"column":17,"offset":640685})
Invalid object ref: 7 0 R

another (unloadable) PDF: https://toastmasterscdn.azureedge.net/medias/files/department-documents/speech-contests-documents/speech-contest-certificate-sets/en/510c-speech-contest-certificate_ff.pdf

content size,type: 642929 application/pdf

Invalid object ref: 55 0 R
Trying to parse invalid object: {"line":53,"column":17,"offset":633312})
Invalid object ref: 2 0 R
Trying to parse invalid object: {"line":68,"column":17,"offset":634541})
Invalid object ref: 3 0 R
Trying to parse invalid object: {"line":73,"column":17,"offset":635335})
Invalid object ref: 4 0 R
Trying to parse invalid object: {"line":77,"column":17,"offset":642186})
Invalid object ref: 6 0 R
Trying to parse invalid object: {"line":80,"column":17,"offset":642327})
Invalid object ref: 7 0 R

alexyoung23jj commented 3 months ago

Anyone find a solution here?

andy008 commented 1 week ago

I'm also having this same issue:-

Code:-

const existingPdfBytes = await fs.readFile('./src/assets/RebateDeclaration-Template-2.pdf');
console.log('PDF Loaded...');
const pdfDoc = await PDFDocument.load(existingPdfBytes, { ignoreEncryption: true });

Fails:-

Trying to parse invalid object: {"line":35,"column":17,"offset":12393})
Invalid object ref: 343 0 R
Trying to parse invalid object: {"line":75,"column":17,"offset":74743})
Invalid object ref: 357 0 R
Trying to parse invalid object: {"line":94,"column":17,"offset":84101})
Invalid object ref: 8 0 R
Trying to parse invalid object: {"line":142,"column":17,"offset":98347})
Invalid object ref: 50 0 R
Trying to parse invalid object: {"line":173,"column":17,"offset":101484})
Invalid object ref: 51 0 R
Trying to parse invalid object: {"line":176,"column":17,"offset":101640})
Invalid object ref: 52 0 R
Trying to parse invalid object: {"line":185,"column":17,"offset":102199})
Invalid object ref: 53 0 R
Trying to parse invalid object: {"line":198,"column":17,"offset":104265})
Invalid object ref: 54 0 R
Trying to parse invalid object: {"line":208,"column":17,"offset":112587})
Invalid object ref: 62 0 R
Trying to parse invalid object: {"line":213,"column":17,"offset":112921})
Invalid object ref: 63 0 R