WPCleaner closed this issue 5 years ago
Hmm, strange. Could you test with the current master version? Just guessing, but the following change in Pdf2Dom might help:
In PDFBoxTree.java, at line 402, try changing
if (formResources != null)
to
if (formResources != null && formResources != resources)
This is just a guess, so I don't want to commit it to the repo, but it's worth trying.
As I'm using a class derived from PDFBoxTree for my own purposes (converting PDF into something usable with Angular, not directly HTML), I ran the following tests:
Copy/paste PDFBoxTree.updateFontTable() and PDFBoxTree.processFontResources() into my own class: unsurprisingly, the problem is still present, but it now occurs in processFontResources() in my class.
Modify processFontResources() in my class according to your suggestion: same problem, I still end up with a StackOverflowError.
I then tried adding a third parameter to processFontResources(), a Set<PDResources>, to remember which resources have already been processed and to exit processFontResources() if the current resource is among them: same problem, I still end up with a StackOverflowError.
I then searched the source code and realized that PDFormXObject.getResources() creates a new PDResources each time it is called, so my Set<PDResources> can't detect circular references. So I changed the Set to a Set<COSDictionary> and checked on resources.getCOSObject(): it worked!
So I finally tried your suggestion again, this time accounting for the fact that the actual PDResources object will be different: it also worked!
So replacing line 402 in PDFBoxTree.java with the following line should work:
if (formResources != null && formResources != resources && formResources.getCOSObject() != resources.getCOSObject())
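For anyone who hits a similar loop, here is a minimal, library-free sketch of the Set<COSDictionary> idea from the tests above. It is not the Pdf2Dom code: Dict, Res and visit() are hypothetical stand-ins (Dict plays the role of the shared COSDictionary, Res the role of a PDResources wrapper that is re-created on every access). The point it tries to show is that the "already processed" check has to key on the shared underlying object, since the wrappers are never the same instance twice. It also suggests why a visited set may be more robust than the one-line guard: comparing only with the direct parent catches a resource that refers to itself, but not a longer cycle.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

public class ResourceCycleDemo {

    /** Hypothetical stand-in for the shared underlying dictionary. */
    static final class Dict {
        final String name;
        final List<Dict> children = new ArrayList<>();
        Dict(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for a wrapper that is re-created on every access. */
    static final class Res {
        private final Dict dict;
        Res(Dict dict) { this.dict = dict; }
        Dict getDict() { return dict; }

        /** Returns fresh wrappers on each call, so wrapper identity cannot reveal a cycle. */
        List<Res> getChildren() {
            List<Res> out = new ArrayList<>();
            for (Dict d : dict.children) {
                out.add(new Res(d));
            }
            return out;
        }
    }

    /** Visits each distinct underlying dictionary once; returns names in visiting order. */
    static List<String> visit(Res root) {
        // Identity-based set: two entries are equal only if they are the same object.
        Set<Dict> seen = Collections.newSetFromMap(new IdentityHashMap<Dict, Boolean>());
        List<String> order = new ArrayList<>();
        visit(root, seen, order);
        return order;
    }

    private static void visit(Res res, Set<Dict> seen, List<String> order) {
        // Key on the shared dictionary, not the wrapper: add() returns false on a revisit.
        if (!seen.add(res.getDict())) {
            return;
        }
        order.add(res.getDict().name);
        for (Res child : res.getChildren()) {
            visit(child, seen, order);
        }
    }

    public static void main(String[] args) {
        Dict page = new Dict("page");
        Dict form = new Dict("form");
        page.children.add(form);
        form.children.add(form); // direct self-reference
        form.children.add(page); // longer cycle back to the parent
        System.out.println(visit(new Res(page))); // prints [page, form]
    }
}
```

Swapping the Set<Dict> for a Set<Res> in this sketch would bring the unbounded recursion back, which matches what I saw with Set<PDResources>.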
Great, thanks for proposing the solution. It seems reasonable; I have committed it to master.
Hello,
For one particular PDF file, I get an infinite recursive call in PDFBoxTree.processFontResources(), resulting in a StackOverflowError. I have several dozen PDF files that I want to convert, but only this one triggers the problem. Unfortunately, I can't share the PDF that causes it, as it is confidential...
It's happening with the latest release, 1.7.
The stack trace I get is the following:
The code I use is the following: