Open Hypnobrew opened 1 year ago
If there is something in the pdfs that cannot be handled by pdf-lib, maybe it would be wise to be able to set an option to ignore these objects?
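For what it's worth, one way to see exactly what the parser skipped is to collect the warnings during the load instead of letting them scroll past in the console. Below is a minimal sketch; `loadCollectingWarnings` is a hypothetical helper (not part of pdf-lib), and it assumes the library reports these problems through `console.warn`, as the logs in this thread suggest.

```javascript
// Hypothetical helper, not part of pdf-lib: runs any async loader while
// temporarily capturing everything written through console.warn.
async function loadCollectingWarnings(load, bytes) {
  const warnings = [];
  const originalWarn = console.warn;
  // Replace console.warn for the duration of the load only.
  console.warn = (...args) => {
    warnings.push(args.map(String).join(' '));
  };
  try {
    const doc = await load(bytes);
    return { doc, warnings };
  } finally {
    // Always restore the original, even if the loader throws.
    console.warn = originalWarn;
  }
}
```

With pdf-lib the loader argument would be something like `(b) => PDFDocument.load(b, { throwOnInvalidObject: false })`. As far as I can tell `throwOnInvalidObject` is an existing load option, and leaving it off means invalid objects are warned about rather than thrown. Note that swapping `console.warn` is global, so this is not safe if several loads run concurrently.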
I'm having a very similar problem just trying to load a PDF into pdf-lib. Here are the console.warn messages and the links to the PDFs. The PDFs are fillable forms and work fine in other tools such as pdfescape.com.
(unloadable) example PDF: https://toastmasterscdn.azureedge.net/medias/files/department-documents/speech-contests-documents/speech-contest-certificate-sets/en/510d-speech-contest-certificate_ff.pdf
content size,type: 641288 application/pdf
Invalid object ref: 48 0 R
Trying to parse invalid object: {"line":51,"column":17,"offset":631781})
Invalid object ref: 2 0 R
Trying to parse invalid object: {"line":58,"column":17,"offset":633010})
Invalid object ref: 3 0 R
Trying to parse invalid object: {"line":67,"column":17,"offset":633693})
Invalid object ref: 4 0 R
Trying to parse invalid object: {"line":73,"column":17,"offset":640544})
Invalid object ref: 6 0 R
Trying to parse invalid object: {"line":76,"column":17,"offset":640685})
Invalid object ref: 7 0 R
another (unloadable) PDF: https://toastmasterscdn.azureedge.net/medias/files/department-documents/speech-contests-documents/speech-contest-certificate-sets/en/510c-speech-contest-certificate_ff.pdf
content size,type: 642929 application/pdf
Invalid object ref: 55 0 R
Trying to parse invalid object: {"line":53,"column":17,"offset":633312})
Invalid object ref: 2 0 R
Trying to parse invalid object: {"line":68,"column":17,"offset":634541})
Invalid object ref: 3 0 R
Trying to parse invalid object: {"line":73,"column":17,"offset":635335})
Invalid object ref: 4 0 R
Trying to parse invalid object: {"line":77,"column":17,"offset":642186})
Invalid object ref: 6 0 R
Trying to parse invalid object: {"line":80,"column":17,"offset":642327})
Invalid object ref: 7 0 R
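To see what the parser is choking on, it can help to look at the raw bytes around one of the reported offsets. Here is a minimal sketch; `excerptAt` is a hypothetical helper, and it assumes the `offset` in the warning is an absolute byte offset into the file, which may not hold for objects stored inside compressed object streams.

```javascript
// Hypothetical helper: return the bytes around an offset as printable text.
// Assumption: `offset` is an absolute byte position in the PDF file.
function excerptAt(bytes, offset, radius = 40) {
  const start = Math.max(0, offset - radius);
  const end = Math.min(bytes.length, offset + radius);
  // latin1 maps every byte to exactly one character, so positions line up.
  return Buffer.from(bytes.subarray(start, end))
    .toString('latin1')
    // Replace non-printable bytes (e.g. compressed stream data) with dots.
    .replace(/[^\x20-\x7e\r\n]/g, '.');
}
```

For the first file above, something like `excerptAt(pdfBytes, 631781)` on the downloaded bytes should show whether the text near that position looks like a normal `N 0 obj` header or something the parser does not expect.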
Anyone find a solution here?
I'm also having this same issue:-
Code:-
const existingPdfBytes = await fs.readFile('./src/assets/RebateDeclaration-Template-2.pdf');
console.log('PDF Loaded...');
const pdfDoc = await PDFDocument.load(existingPdfBytes, { ignoreEncryption: true });
Fails:-
Trying to parse invalid object: {"line":35,"column":17,"offset":12393})
Invalid object ref: 343 0 R
Trying to parse invalid object: {"line":75,"column":17,"offset":74743})
Invalid object ref: 357 0 R
Trying to parse invalid object: {"line":94,"column":17,"offset":84101})
Invalid object ref: 8 0 R
Trying to parse invalid object: {"line":142,"column":17,"offset":98347})
Invalid object ref: 50 0 R
Trying to parse invalid object: {"line":173,"column":17,"offset":101484})
Invalid object ref: 51 0 R
Trying to parse invalid object: {"line":176,"column":17,"offset":101640})
Invalid object ref: 52 0 R
Trying to parse invalid object: {"line":185,"column":17,"offset":102199})
Invalid object ref: 53 0 R
Trying to parse invalid object: {"line":198,"column":17,"offset":104265})
Invalid object ref: 54 0 R
Trying to parse invalid object: {"line":208,"column":17,"offset":112587})
Invalid object ref: 62 0 R
Trying to parse invalid object: {"line":213,"column":17,"offset":112921})
Invalid object ref: 63 0 R
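With this many warnings it may be easier to work from structured data than from the raw text. The sketch below turns the output into one record per object. `parseWarnings` is a hypothetical helper, and it assumes each "Trying to parse invalid object" message belongs to the "Invalid object ref" message that follows it, which matches the order in the log above.

```javascript
// Hypothetical helper: turn the warning text into structured records.
// Assumption: a "Trying to parse invalid object" message is followed by
// the "Invalid object ref" message for the same object.
function parseWarnings(text) {
  const pattern =
    /Trying to parse invalid object: (\{[^}]*\})\)?|Invalid object ref: (\d+) (\d+) R/g;
  const records = [];
  let pending = null; // position from the most recent "Trying to parse" message
  for (const match of text.matchAll(pattern)) {
    if (match[1] !== undefined) {
      pending = JSON.parse(match[1]);
    } else {
      records.push({
        objectNumber: Number(match[2]),
        generation: Number(match[3]),
        position: pending, // null if the position message was not captured
      });
      pending = null;
    }
  }
  return records;
}
```

Because the pattern is matched globally, this works whether the messages arrive one per line or run together on a single line. Sorting the records by `position.offset` gives a quick picture of which part of the file is affected.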
What were you trying to do?
Hi
Thanks a lot for a great library for working with PDFs. It usually works fine, but sometimes I run into trouble: the library reports invalid objects when loading the content.
Validating the pdf on a site like this says the file is ok.
How did you attempt to do it?
Nothing fancy, just basic loading the pdf file like:
What actually happened?
Getting errors like these:
What did you expect to happen?
Parsing without any errors.
How can we reproduce the issue?
Run a simple example with the file I provided.
Version
1.17.1
What environment are you running pdf-lib in?
Node
Additional Notes
Here is an example file with these problems: Assa Abloy - Q3 2022 - Conference Call Deck.pdf