UglyToad / PdfPig

Read and extract text and other content from PDFs in C# (port of PDFBox)
https://github.com/UglyToad/PdfPig/wiki
Apache License 2.0

PdfDocument.Open cannot finish execution, and the memory continues to increase. #519

Closed. zhangbaodan closed this 1 year ago

zhangbaodan commented 1 year ago

When I open the file "bug_544880.pdf" with PdfDocument.Open, the call keeps running and never finishes, and memory usage keeps increasing. Can anyone help me solve this problem? The UglyToad.PdfPig version is 0.1.6. The file in question is: bug_544880.pdf

fnatzke commented 1 year ago

G'day @zhangbaodan

This is an infinite loop while parsing tokens.

In \UglyToad.PdfPig\Parser\\**CatalogFactory.cs**

after

var current = toProcess.Dequeue();

add

if (current.reference.GetHashCode() == current.parentReference.GetHashCode()) { continue; }
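To illustrate why that check stops the loop, here is a minimal, self-contained sketch of a queue-based tree walk with the same guard. It is not PdfPig's code: `Ref`, `PageTreeWalker`, `Walk` and the `kids` map are hypothetical names made up for this example, and the page tree is assumed to be a simple "reference to child references" dictionary. The sketch compares references with `Equals` rather than by hash code, since two different values can share a hash code, and it adds a visited set on the assumption that a cycle might also span more than one node.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an indirect reference: (object number, generation).
using Ref = System.ValueTuple<int, int>;

public static class PageTreeWalker
{
    // Breadth-first walk over a tree given as "reference -> child references".
    // Returns the references in visit order; each one is visited at most once.
    public static List<Ref> Walk(Ref root, IReadOnlyDictionary<Ref, Ref[]> kids)
    {
        var visited = new HashSet<Ref>();
        var order = new List<Ref>();
        var toProcess = new Queue<(Ref reference, Ref? parentReference)>();
        toProcess.Enqueue((root, null));

        while (toProcess.Count > 0)
        {
            var current = toProcess.Dequeue();

            // Guard from the thread: skip a node that lists itself as its own child.
            if (current.parentReference.HasValue
                && current.reference.Equals(current.parentReference.Value))
            {
                continue;
            }

            // Extra guard (an assumption of this sketch): skip anything already
            // seen, which also covers cycles longer than one step.
            if (!visited.Add(current.reference))
            {
                continue;
            }

            order.Add(current.reference);

            if (kids.TryGetValue(current.reference, out var children))
            {
                foreach (var child in children)
                {
                    toProcess.Enqueue((child, current.reference));
                }
            }
        }

        return order;
    }
}
```

Without either guard, a node that references itself would be enqueued again every time it is dequeued, so the queue never empties and memory grows, which matches the symptom reported above.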

zhangbaodan commented 1 year ago

@fnatzke Thank you very much. It seems to be working. I hope the community will put this change together and publish it soon.

fnatzke commented 1 year ago

Thanks @zhangbaodan. I've checked in a revised fix that handles your PDF as well as some others.

EliotJones commented 1 year ago

I believe @fnatzke has fixed this in the latest nightly build.