Closed rklec closed 1 month ago
Hi @rklec it's going to be complicated to help you without the document...
Can you try with the latest version of PdfPig (pre-release 1.9.0, available via NuGet packages)?
I'm running into this issue as well with the attached document (ErcotFacts.pdf). If I set SkipMissingFonts to true, the above exception gets thrown. When that option is not specified, I get the following exception instead:
at UglyToad.PdfPig.Util.DictionaryTokenExtensions.GetNameOrDefault(DictionaryToken dictionaryToken, NameToken name)
at UglyToad.PdfPig.PdfFonts.Parser.Handlers.Type0FontHandler.ParseDescendant(DictionaryToken dictionary)
at UglyToad.PdfPig.PdfFonts.Parser.Handlers.Type0FontHandler.Generate(DictionaryToken dictionary)
at UglyToad.PdfPig.PdfFonts.FontFactory.Get(DictionaryToken dictionary)
at UglyToad.PdfPig.Content.ResourceStore.LoadFontDictionary(DictionaryToken fontDictionary)
at UglyToad.PdfPig.Content.ResourceStore.LoadResourceDictionary(DictionaryToken resourceDictionary)
at UglyToad.PdfPig.Content.BasePageFactory`1.Create(Int32 number, DictionaryToken dictionary, PageTreeMembers pageTreeMembers, NamedDestinations namedDestinations)
at UglyToad.PdfPig.Content.Pages.GetPage[TPage](IPageFactory`1 pageFactory, Int32 pageNumber, NamedDestinations namedDestinations, ParsingOptions parsingOptions)
at UglyToad.PdfPig.Content.Pages.GetPage(Int32 pageNumber, NamedDestinations namedDestinations, ParsingOptions parsingOptions)
at UglyToad.PdfPig.PdfDocument.GetPage(Int32 pageNumber)
at UglyToad.PdfPig.PdfDocument.<GetPages>d__34.MoveNext()
at System.Collections.Generic.LargeArrayBuilder`1.AddRange(IEnumerable`1 items)
at System.Collections.Generic.EnumerableHelpers.ToArray[T](IEnumerable`1 source)
at System.Linq.SystemCore_EnumerableDebugView`1.get_Items()
Any help with a fix for this would be greatly appreciated!
Surprisingly, the linked ErcotFacts.pdf does not throw for me. (Encoded and decoded in a mail, though.)
Hi @rklec, I should have clarified: the exception I'm seeing occurs when calling the GetPages() method.
For example:
using PdfDocument? document = PdfDocument.Open( stream );
if ( document is null )
{
    _logger.LogWarning( "Failed to open PDF document" );
    return result;
}
foreach ( var pg in document.GetPages() )
{
    _logger.LogInformation( "Processing page {PageNumber}", pg.Number );
}
Thanks for sharing the document. I've created a PR that fixes the issue when SkipMissingFonts = true.
Much appreciated @BobLd
STR
Call PdfDocument.Open(pdfBytes) with some PDF file. As it contains sensitive data, I unfortunately cannot attach it here, and I was unable to create a minimal example. Some hints: it is much like this, and I tried to reproduce it with this example, but it does not work:
Thus, I only attach this image, because the issue is not reproducible with the PDF I've created.
What happens
Apparently, this is the line of failure: https://github.com/UglyToad/PdfPig/blob/a99c0d25bfe76e4e7a919a42c52c99022ac769d3/src/UglyToad.PdfPig/PdfExtensions.cs#L24
What should happen
At least a PdfDocumentFormatException, if you consider the file invalid. However, IMHO, the file is valid and can be opened with both Adobe Acrobat Reader and Firefox. Thus, actually parsing it would be good.
Also, when opening it with Adobe Acrobat Reader and re-saving it, it can be parsed!
System
PdfPig 0.1.8, reproducible on Windows 10
Internal reference: 2118