empira / PDFsharp-1.5

A .NET library for processing PDF
MIT License

Unexpected character '0x00ab', need contact for issue submission template #147

Closed M4nju closed 1 year ago

M4nju commented 3 years ago

Hello,

I am running into a bug when reading the content of a PDF document. Because the document is confidential, I cannot post it here, but I could send it to the developers by e-mail (as mentioned in many posts on the forum). Sadly, I could not find a contact e-mail address anywhere, so maybe you can help me here. I created the submission template described at http://www.pdfsharp.net/wiki/IssueSubmissions.ashx and would be able to send it.

Actual issue: Calling ContentReader.ReadContent on the first page of the document throws the following error: "Unexpected character '0x00ab' in content stream. The stream may be corrupted, or the feature is not implemented."

Adobe can open the file normally. As soon as I make changes to the file in Adobe, it works with PDFsharp again. I know something inside the PDF file may be corrupted, but it would be nice to find a solution for it within PDFsharp.

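    using System;
    using System.IO;
    using PdfSharp.Pdf;
    using PdfSharp.Pdf.Content;
    using PdfSharp.Pdf.Content.Objects;

    // Build the path to the PDF next to the executable.
    string filePath = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "Invoice.pdf");

    // Open the PDF file.
    using (PdfDocument document = PdfSharp.Pdf.IO.PdfReader.Open(filePath))
    {
        // The exception is thrown here.
        CSequence contentSequence = ContentReader.ReadContent(document.Pages[0]);
    }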

Any help or recommendations would be nice, thanks.

ThomasHoevel commented 3 years ago

If the file works after opening it in Adobe Reader and using File / Save as, then the file is most likely corrupted. PDFsharp is not perfect at fixing corrupted files.
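To see whether only the first page is affected, a minimal sketch like the one below may help. It is not part of PDFsharp: `PageProbe`, `FindFailures`, and the `readPage` delegate are hypothetical names used only for illustration. The helper calls a caller-supplied action once per page index and records every index where that action throws, together with the exception message.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not a PDFsharp type.
public static class PageProbe
{
    // Runs readPage(i) for i = 0 .. pageCount - 1 and collects the
    // indexes (with exception messages) of the pages that fail.
    // Assumption: with PDFsharp, readPage would be something like
    //   i => ContentReader.ReadContent(document.Pages[i])
    public static List<KeyValuePair<int, string>> FindFailures(
        int pageCount, Action<int> readPage)
    {
        var failures = new List<KeyValuePair<int, string>>();
        for (int i = 0; i < pageCount; i++)
        {
            try
            {
                readPage(i);
            }
            catch (Exception ex)
            {
                // Keep going so every unreadable page is reported.
                failures.Add(new KeyValuePair<int, string>(i, ex.Message));
            }
        }
        return failures;
    }
}
```

With the code from the original post, this could be called inside the `using` block as `PageProbe.FindFailures(document.Pages.Count, i => ContentReader.ReadContent(document.Pages[i]))`. It only narrows down which pages fail; it does not repair the content stream.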

M4nju commented 3 years ago

That's what I read on the forum. I thought more corrupted files were needed to make PDFsharp more "bulletproof". But well, thanks anyway :)