Open schester44 opened 2 years ago
Facing the same issue. My PDF is around 60 to 70 pages.
Same issue here. Is there any time frame for when this will be looked into or fixed?
We are experiencing the same issue. Any news on this? Glad to help any way I can.
Same issue, any chance to fix this soon?
no fix
We are also experiencing this issue with a specific PDF.
I also stumbled over this.
In my case, the reason was that the /Pages dict doesn't have /Type set to /Pages. That caused the PDF parser to instantiate the object as a plain PDFDict instead of a PDFPageTree.
I was successful with the following workaround:
import { PDFDocument, PDFName, PDFPageTree } from 'pdf-lib'

const pdfDoc = await PDFDocument.load(bytes)
// Find reference to the page tree
const pagesRef = pdfDoc.catalog.get(PDFName.of('Pages'))
// Get the page tree. This is a PDFDict.
const oldPageTree = pdfDoc.context.indirectObjects.get(pagesRef)
// Create a PDFPageTree with the same content.
const newPageTree = new PDFPageTree(oldPageTree.dict, oldPageTree.context)
// Set the correct `Type`.
newPageTree.dict.set(PDFName.of('Type'), PDFName.of('Pages'))
// Replace the PDFDict with the PDFPageTree in the document.
pdfDoc.context.indirectObjects.set(pagesRef, newPageTree)
// Save fixed document
...
In my case the PDFDocument.catalog property was initialised with a PDFDict instead of a PDFCatalog. So here is my workaround for the bug:
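import { PDFCatalog, PDFDict, PDFDocument } from 'pdf-lib';

const doc = await PDFDocument.load(bytes, { ignoreEncryption: true });
if (!(doc.catalog instanceof PDFCatalog) && ((doc.catalog as any) instanceof PDFDict)) {
  // Re-wrap the plain dict as a PDFCatalog so that catalog.Pages() is available.
  (doc as any).catalog = PDFCatalog.fromMapWithContext(doc.catalog, doc.context);
}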
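To tell which of the two cases above applies to a given file, a small check along these lines might help. This is only a sketch: `diagnosePageTree` and its return strings are hypothetical names, not part of pdf-lib, and it is duck-typed (it only looks for the `Pages` and `traverse` methods) so that it runs without importing the library.

```javascript
// Hypothetical helper, not a pdf-lib API. Duck-typed on purpose:
// it only checks whether the expected methods are present.
function diagnosePageTree(doc) {
  const catalog = doc.catalog;
  // A PDFCatalog exposes Pages(); a plain PDFDict does not.
  if (!catalog || typeof catalog.Pages !== 'function') {
    return 'catalog-is-plain-dict';
  }
  let pages;
  try {
    pages = catalog.Pages();
  } catch (err) {
    // Assumption: the lookup can throw if /Pages is missing or has an unexpected type.
    return 'pages-lookup-failed';
  }
  // A PDFPageTree exposes traverse(); a plain PDFDict does not,
  // which is what produces "traverse is not a function".
  if (!pages || typeof pages.traverse !== 'function') {
    return 'pages-is-plain-dict';
  }
  return 'ok';
}

// Intended usage (assumes pdf-lib is installed and `bytes` holds the PDF):
//   const doc = await PDFDocument.load(bytes);
//   console.log(diagnosePageTree(doc));
```

If it reports 'pages-is-plain-dict', the /Type workaround further up is the one to try; 'catalog-is-plain-dict' points to the catalog workaround instead.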
For me it wasn't working due to Catalog pointing to the wrong object. I did this to manually point Catalog to a PDFPageTree:
let pdfPageTree;
// Scan all indirect objects and take the first PDFPageTree found.
for (const [, obj] of pdfDoc.context.indirectObjects.entries()) {
  if (obj instanceof pdfLib.PDFPageTree) {
    pdfPageTree = obj;
    break;
  }
}
// Rebuild the catalog so that it points at that page tree.
pdfDoc.catalog = pdfLib.PDFCatalog.withContextAndPages(pdfDoc.context, pdfPageTree);
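One caveat with taking the first PDFPageTree found: in a nested page tree, that object may be an intermediate node instead of the root. Per the PDF spec, the root page tree node is the one without a /Parent entry, so a slightly stricter search could look like the sketch below. `findRootPageTree` is a hypothetical helper, and it assumes page tree objects expose `traverse()` and a `Parent()` accessor that returns undefined when /Parent is absent; it is duck-typed so it runs without importing pdf-lib.

```javascript
// Hypothetical helper, not a pdf-lib API.
// `entries` is any iterable of [ref, object] pairs,
// e.g. pdfDoc.context.indirectObjects.entries().
function findRootPageTree(entries) {
  for (const [ref, obj] of entries) {
    // Assumption: page tree nodes have traverse() and Parent() methods.
    const looksLikePageTree =
      obj != null &&
      typeof obj.traverse === 'function' &&
      typeof obj.Parent === 'function';
    // The root node is the only page tree node without a /Parent.
    if (looksLikePageTree && obj.Parent() === undefined) {
      return { ref, pageTree: obj };
    }
  }
  // No parentless page tree node was found.
  return undefined;
}
```

The returned `pageTree` could then be passed to PDFCatalog.withContextAndPages in place of the first match. If a damaged file contains more than one parentless page tree, this still returns only the first one, so it is worth checking the page count afterwards.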
What were you trying to do?
I am trying to load a 90 page PDF into the lib
How did you attempt to do it?
Here is a simple reproduction of the issue
What actually happened?
I am getting the
TypeError: _this.catalog.Pages(...).traverse is not a function
error anytime I call any APIs that require traversing the pages. This includes getPageCount, save, etc.
What did you expect to happen?
I expected these functions to work as expected.
How can we reproduce the issue?
Run the above code snippet using node
Version
1.17.1
What environment are you running pdf-lib in?
Node
Checklist
Additional Notes
Above is the code snippet for reproducing the issue. The document is a somewhat sensitive PDF, so I'd prefer not to attach it here publicly. I can attach the PDF via a DM or email.
Some more context:
This is a 90 page document (3.8MB). Opening it in Acrobat causes an error in Acrobat. Not sure if it's related, but I suspect it could be.
Here's the fun part: re-exporting this file and opening it with pdf-lib works as expected, so Acrobat is doing something that fixes the issue. I'm just not sure what, and unfortunately re-exporting through Acrobat isn't an option given the task.
Here to see if anyone knows what may be going on and how to potentially fix this issue. Thanks!