empira / PDFsharp-1.5

A .NET library for processing PDF
MIT License

Incorrect value of PDF boolean object #53

Open uzair08inator opened 6 years ago

uzair08inator commented 6 years ago

I am writing a PDF file using PDFsharp. For some reason, the value of a boolean object is written as 'False' instead of 'false' (note the uppercase 'F').

As a result, when I read the file again, I get the following error in PDFsharp: "Unexpected token 'False' in PDF stream. The file may be corrupted. If you think this is a bug in PDFsharp, please send us your PDF file."

PDFSharp version: Assembly PdfSharp.dll, v1.50.4740.0

TH-Soft commented 6 years ago

If you think there is a bug in PDFsharp then please use the IssueSubmissionTemplate to make the issue replicable.
http://www.pdfsharp.net/wiki/IssueSubmissions.ashx

Thanks.

uzair08inator commented 6 years ago

To reproduce the error, the following code is all that is needed:

PdfDocument pdfDoc = PdfSharp.Pdf.IO.PdfReader.Open("file.pdf");
pdfDoc.Save("fileCopy.pdf");
pdfDoc = PdfSharp.Pdf.IO.PdfReader.Open("fileCopy.pdf");

On line 3 I get the error: "Unexpected token 'False' in PDF stream. The file may be corrupted. If you think this is a bug in PDFsharp, please send us your PDF file."

All you need is the PDF sample that I am using. Unfortunately, I can't share it with you because it is a client file.

Is there any other way?

TH-Soft commented 6 years ago

Without "file.pdf" I cannot replicate the issue. There are many PDF files that do not show this problem.

Please let us know when you find a non-confidential PDF file that allows us to replicate the issue.

REDECODE commented 4 years ago

Same for me (but with "True" instead of "False"). Even if I modify nothing in the PDF and just open and save it, the resulting PDF loads in Google Chrome but not in Acrobat Reader (error).

As @uzair08inator says, if I open the resulting PDF in Notepad++ I can see a row with "True". If I change it to "true" and save, I can then open the PDF in Acrobat Reader correctly.

I cannot share the PDF, but I can show you some rows with "True" (the 4th line) instead of the correct "true":

endobj 62 0 obj True endobj 63 0 obj << /AP << /N 297 0 R

/F 4 /MK <<

/P 25 0 R /Parent 9 0 R
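The manual Notepad++ check described above could be automated roughly as follows. This is only a sketch: the class `CapitalizedBooleanScanner` and its methods are hypothetical names, not part of PDFsharp, and the pattern is a plain text search, so it may report false positives when "True" or "False" happens to appear inside string literals or stream data.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical diagnostic helper; not part of PDFsharp.
public static class CapitalizedBooleanScanner
{
    // Matches a standalone True/False token. Names such as /True are skipped
    // because the lookbehind rejects a preceding slash, letter, or digit.
    private static readonly Regex Pattern =
        new Regex(@"(?<![A-Za-z0-9/])(?:True|False)(?![A-Za-z0-9])");

    // Returns the character offsets of every capitalized boolean token.
    public static List<int> FindOffsets(string pdfText)
    {
        var offsets = new List<int>();
        foreach (Match m in Pattern.Matches(pdfText))
            offsets.Add(m.Index);
        return offsets;
    }

    // ISO-8859-1 maps each byte to exactly one char, so the reported
    // offsets are also byte positions in the file.
    public static List<int> FindOffsetsInFile(string path)
    {
        string text = Encoding.GetEncoding("ISO-8859-1")
                              .GetString(File.ReadAllBytes(path));
        return FindOffsets(text);
    }
}
```

Calling `CapitalizedBooleanScanner.FindOffsetsInFile("fileCopy.pdf")` on a saved copy would list the positions to inspect by hand.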

REDECODE commented 4 years ago

OK, I solved it by modifying line 105 of PdfWriter.cs to make it similar to line 115.

Replace:

WriteRaw(value ? bool.TrueString : bool.FalseString);

with:

WriteRaw(value ? "true" : "false");
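For context on why this change matters: in .NET, `bool.TrueString` and `bool.FalseString` are the capitalized strings "True" and "False", while the PDF boolean keywords are lowercase `true` and `false`. A minimal sketch of the idea, using a hypothetical helper `PdfBoolean.ToPdfKeyword` that is not part of PDFsharp:

```csharp
using System;

// Hypothetical helper for illustration; not part of PDFsharp.
public static class PdfBoolean
{
    // PDF boolean keywords are lowercase, so the literals are spelled out
    // instead of relying on .NET's capitalized bool.TrueString/FalseString.
    public static string ToPdfKeyword(bool value)
    {
        return value ? "true" : "false";
    }
}
```

Here `PdfBoolean.ToPdfKeyword(true)` returns "true", whereas `bool.TrueString` and `true.ToString()` both give "True".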

jwarner00 commented 1 year ago

This reproduces with the attached PDF file (f1095c--2021.pdf), also available here: https://www.irs.gov/pub/irs-prior/f1095c--2021.pdf

A full fix would add cases to Lexer.cs for both casings of true and false (line 360), and would also modify the four places where the code references bool.TrueString / bool.FalseString.
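The reader-side half of that fix might look something like the sketch below. It is a standalone illustration and does not reproduce the actual code in Lexer.cs; the class `BooleanToken` and its `TryParse` method are hypothetical names. It accepts the lowercase keywords plus the capitalized variants that affected files contain.

```csharp
using System;

// Hypothetical standalone sketch; it does not reproduce PDFsharp's Lexer.cs.
public static class BooleanToken
{
    // Returns true when the token is a boolean keyword in either casing,
    // and stores the parsed value in 'value'.
    public static bool TryParse(string token, out bool value)
    {
        switch (token)
        {
            case "true":
            case "True":   // capitalized form written by the affected writer code
                value = true;
                return true;
            case "false":
            case "False":  // capitalized form written by the affected writer code
                value = false;
                return true;
            default:
                value = false;
                return false;
        }
    }
}
```

Accepting only these four spellings keeps the reader tolerant of files that were already written with the wrong casing, without treating arbitrary variants such as "TRUE" as valid.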