UglyToad / PdfPig

Read and extract text and other content from PDFs in C# (port of PDFBox)
https://github.com/UglyToad/PdfPig/wiki
Apache License 2.0

Expected name as dictionary key, instead got: #454

Closed pclancysc closed 1 year ago

pclancysc commented 2 years ago

Looks like I may have triggered a bug in the parsing of a specific PDF I need to process.

The exception is UglyToad.PdfPig.Core.PdfDocumentFormatException

The message is below (there is no "key" after the colon; it is an empty string):

Expected name as dictionary key, instead got:

Stack trace:

   at UglyToad.PdfPig.Tokenization.DictionaryTokenizer.ConvertToDictionary(List`1 tokens)
   at UglyToad.PdfPig.Tokenization.DictionaryTokenizer.TryTokenize(Byte currentByte, IInputBytes inputBytes, IToken& token)
   at UglyToad.PdfPig.Tokenization.Scanner.CoreTokenScanner.MoveNext()
   at UglyToad.PdfPig.Tokenization.DictionaryTokenizer.TryTokenize(Byte currentByte, IInputBytes inputBytes, IToken& token)
   at UglyToad.PdfPig.Tokenization.Scanner.CoreTokenScanner.MoveNext()
   at UglyToad.PdfPig.PdfFonts.Parser.CMapParser.Parse(IInputBytes inputBytes)
   at UglyToad.PdfPig.PdfFonts.Cmap.CMapCache.Parse(IInputBytes bytes)
   at UglyToad.PdfPig.PdfFonts.Parser.Handlers.Type0FontHandler.Generate(DictionaryToken dictionary)
   at UglyToad.PdfPig.PdfFonts.FontFactory.Get(DictionaryToken dictionary)
   at UglyToad.PdfPig.Content.ResourceStore.LoadFontDictionary(DictionaryToken fontDictionary)
   at UglyToad.PdfPig.Content.ResourceStore.LoadResourceDictionary(DictionaryToken resourceDictionary)
   at UglyToad.PdfPig.Parser.PageFactory.Create(Int32 number, DictionaryToken dictionary, PageTreeMembers pageTreeMembers, Boolean clipPaths)
   at UglyToad.PdfPig.Content.Pages.GetPage(Int32 pageNumber, Boolean clipPaths)
   at UglyToad.PdfPig.PdfDocument.GetPage(Int32 pageNumber)
   at UglyToad.PdfPig.PdfDocument.<GetPages>d__32.MoveNext()
   at Program.<<Main>$>d__0.MoveNext()

Any help is much appreciated; I can provide the PDF on request.

EliotJones commented 2 years ago

Hi @pclancysc, if you're able to share the file, please email it to the address mentioned in this comment and I can take a look: https://github.com/UglyToad/PdfPig/issues/334#issuecomment-859037365