UglyToad / PdfPig

Read and extract text and other content from PDFs in C# (port of PDFBox)
https://github.com/UglyToad/PdfPig/wiki
Apache License 2.0

Could not find the startxref within the last 2048 characters. #816

Open last-Programmer opened 3 months ago

last-Programmer commented 3 months ago

I am trying to open a PDF generated by Crystal Reports, and it fails with this error:

Could not find the startxref within the last 2048 characters.

We open a number of PDFs generated by Crystal Reports and they all work fine. With one specific report, however, we get this error.

Unfortunately, we are not able to get hold of the PDF, since it is produced as a step in a long process in a production app.

The call stack shows this:

at UglyToad.PdfPig.Parser.FileStructure.FileTrailerParser.GetStartXrefPosition(IInputBytes bytes, Int32 offsetFromEnd)
at UglyToad.PdfPig.Parser.FileStructure.FileTrailerParser.GetFirstCrossReferenceOffset(IInputBytes bytes, ISeekableTokenScanner scanner, Boolean isLenientParsing)
at UglyToad.PdfPig.Parser.PdfDocumentFactory.OpenDocument(IInputBytes inputBytes, ISeekableTokenScanner scanner, InternalParsingOptions parsingOptions)
at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(IInputBytes inputBytes, ParsingOptions options)

Is there a way to fix this issue?

Thank you very much in advance.

BobLd commented 3 months ago

@last-Programmer can you confirm you used the latest nightly build, 0.1.9-alpha-20240402-f6292?

last-Programmer commented 3 months ago

@BobLd I will try the alpha version and update you. Thanks.

hoverwars commented 2 weeks ago

@BobLd Hello, I'm experiencing the same issue with a PDF that seems to be auto-generated. I'm using the latest alpha build: 0.1.9-alpha-20240702-65c64.

UglyToad.PdfPig.Core.PdfDocumentFormatException: 'Could not find the startxref within the last 2048 characters.'

Is there a way to bypass this error?
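
When the file itself cannot be shared, it may help to check where the trailer markers actually sit in it. The sketch below is a standalone diagnostic in Python, not part of PdfPig and not a reproduction of its internal search. It reports the offset of the last `startxref` keyword, whether that offset falls inside the final 2048 bytes, and how many bytes follow the last `%%EOF`. The names `inspect_trailer` and `inspect_file` are hypothetical, and treating the "2048 characters" in the error message as 2048 bytes is an assumption.

```python
from pathlib import Path


def inspect_trailer(data: bytes, window: int = 2048) -> dict:
    """Report where the last 'startxref' and '%%EOF' markers sit in PDF bytes.

    `window` mirrors the 2048 from the error message, interpreted here
    as a byte count (an assumption, not PdfPig's documented behaviour).
    """
    tail_start = max(0, len(data) - window)
    startxref_pos = data.rfind(b"startxref")  # -1 when the keyword is absent
    eof_pos = data.rfind(b"%%EOF")            # -1 when the marker is absent

    return {
        "size": len(data),
        "startxref_offset": startxref_pos,
        # True only if the keyword exists and starts inside the tail window.
        "startxref_in_window": startxref_pos != -1 and startxref_pos >= tail_start,
        # Bytes after the final %%EOF; a large value suggests trailing padding
        # or appended data. None when no %%EOF marker exists at all.
        "bytes_after_eof": (
            len(data) - (eof_pos + len(b"%%EOF")) if eof_pos != -1 else None
        ),
    }


def inspect_file(path: str, window: int = 2048) -> dict:
    """Convenience wrapper: read a file from disk and inspect its trailer."""
    return inspect_trailer(Path(path).read_bytes(), window)
```

If `startxref_in_window` is false while `startxref_offset` is not -1, the keyword exists but lies further from the end of the file than the window, for example because of trailing padding after `%%EOF`. If `startxref_offset` is -1, the file has no such keyword at all, which may point to truncation. Either result would be useful to report here, since it can be produced without uploading the document.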