smalot / pdfparser

PdfParser, a standalone PHP library, provides various tools to extract data from a PDF file.
GNU Lesser General Public License v3.0

Check for wrong line-endings when getting xref #635

Closed. GreyWyvern closed this 10 months ago

GreyWyvern commented 10 months ago

If we don't find the xref keyword at the specified offset, replace Windows-style \r\n line endings with Unix-style \n and try again. If the retry succeeds, keep the converted line endings and proceed as normal. Otherwise, continue on to the decodeXrefStream() method.
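As a rough illustration of the fallback described above, here is a standalone sketch. It is not the code from this PR: the function name `findXrefData()` and its return convention are hypothetical, and it assumes the stored offset was computed against data with \n line endings.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of smalot/pdfparser.
 *
 * Returns the data in which "xref" sits exactly at $offset, or null
 * when the caller should fall back to decoding an xref stream.
 */
function findXrefData(string $pdfData, int $offset): ?string
{
    // First attempt: use the data exactly as it was read.
    if (substr($pdfData, $offset, 4) === 'xref') {
        return $pdfData;
    }

    // Second attempt: assume the offset was written for \n line endings
    // and the file was later converted to \r\n, shifting every position.
    $unixData = str_replace("\r\n", "\n", $pdfData);
    if (substr($unixData, $offset, 4) === 'xref') {
        // Keep the converted data so later offsets line up as well.
        return $unixData;
    }

    // Neither variant has "xref" at the offset.
    return null;
}

// Usage: the offset is taken from the \n version, then applied to a \r\n copy.
$unix    = "%PDF-1.4\n1 0 obj\nendobj\nxref\n0 1\n";
$offset  = strpos($unix, 'xref');
$windows = str_replace("\n", "\r\n", $unix);

$data = findXrefData($windows, $offset);
if ($data === null) {
    // In the library, this is where decodeXrefStream() would take over.
}
```

Returning the converted data rather than a boolean reflects the "edit the line-endings and proceed" step: once the retry matches, everything downstream works on the converted copy.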

Fixes parsing of the existing test suite file /samples/bugs/Issue95_ANSI.pdf, whose unit test would normally be skipped because of the @group linux-only flag. This PR also removes that flag, since all assertions in the testDecodeText() function now pass in any environment.

GreyWyvern commented 10 months ago

I would like to use Issue95_ANSI.pdf to create a few new assertions in testDecodeText() for my other PR #634. So if we can get this one merged first, that would be much appreciated! :)

GreyWyvern commented 10 months ago

Yes, all good. Thanks!