empira / PDFsharp-1.5

A .NET library for processing PDF
MIT License

Read and correct corrupt PDFs with incorrect stream lengths #95

Open bcallaghan-et opened 5 years ago

bcallaghan-et commented 5 years ago

If a stream's dictionary specifies a length that does not match the actual length of the stream data, PDFsharp would previously read past the "endstream" marker and fail to parse the rest of the file. This change allows PDFsharp to read the stream properly even if the declared length is incorrect. The correct length is then written back into the stream's dictionary. Many of these changes come from @MLaukala, who originally wrote the stream-reading logic.
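For readers unfamiliar with the failure mode, here is a minimal sketch of the recovery idea in Python. It is not the code in this PR and not PDFsharp's API; `read_stream` and its parameters are hypothetical names chosen for illustration. The sketch assumes the declared length is trusted only when "endstream" directly follows it (after an optional end-of-line), and otherwise the data is scanned for the first "endstream" marker.

```python
import re
from typing import Tuple


# Hypothetical helper for illustration only; not part of PDFsharp.
def read_stream(data: bytes, start: int, declared_length: int) -> Tuple[bytes, int]:
    """Return (stream_bytes, actual_length) for stream data beginning at `start`."""
    end = start + declared_length
    # Trust the declared length only if "endstream" follows it,
    # optionally preceded by an end-of-line (CR, LF, or CRLF).
    if (
        declared_length >= 0
        and end <= len(data)
        and re.match(rb"\r?\n?endstream", data[end:end + 11])
    ):
        return data[start:end], declared_length

    # Declared length is wrong: fall back to scanning for the marker.
    marker = data.find(b"endstream", start)
    if marker == -1:
        raise ValueError("no endstream marker found")

    # Drop the single end-of-line that precedes "endstream", if present.
    end = marker
    if end > start and data[end - 1:end] == b"\n":
        end -= 1
    if end > start and data[end - 1:end] == b"\r":
        end -= 1
    return data[start:end], end - start
```

The second element of the returned tuple is the value a caller would write back into the stream dictionary's `/Length` entry. One caveat of this simple scan: if the stream's binary data happens to contain the bytes "endstream", the first match is taken, so a production implementation may need extra checks.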