Closed securedoccheck closed 2 years ago
Hi Andy,
Are you able to share a file that causes this problem, please? I'm assuming there's some token that we don't support properly yet, but it will be file-specific and therefore hard to diagnose without the source file.
If not, you can open the PDF in Notepad++ or similar and find the contents of a line like:
<</Root 22 0 R/Info 1 0 R/Encrypt 55 0 R/ID[<462216FABF3B2DFEA6DBF82A292C4BDB><CECB22F1C4A0E419EC8F65D520C078DE>]/Size 56>>
The most important parts here are /Root and /Encrypt XX YY R. Then you need to find the corresponding line containing XX YY obj (in the previous example, 55 0 obj) and let me know the content of the bit between the double angle brackets << ... >>:
55 0 obj<</Filter/Standard/U(tºln¼éú8äÄŸVË?YrA¢AÍÙªÎï^Aˆõ̳¢`<Åùa½âæø‡)/O(ípCeœ:qÂ*w¸>Ê÷žXýMcqOÏ[„¡jA!Ë>R2DÐ W4Y\)Ô\) )/P -1028/Length 256/R 6/EncryptMetadata true/UE(ãŸp”ÓÕÝã5y\)”¶u‡éŽ=Þ2èêÛæ"Ÿ†)/OE(ÿƒ„pïãwCƒeQŸ`µEyq¯÷A§Uß,[Û£¥)/Perms(ÚKÇ©;OWŽQHXåPI,)/CF<</StdCF<</Length 32/AuthEvent/DocOpen/CFM/AESV3>>>>/StmF/StdCF/StrF/StdCF/V 5>>
endobj
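The manual lookup described above can also be scripted. The following is a minimal Python sketch, not part of PdfPig, that searches the raw bytes for an /Encrypt XX YY R reference and returns the body of the matching XX YY obj. The function name find_encrypt_dictionary is illustrative, and the sketch assumes the encryption object is stored uncompressed and that the text endobj does not occur inside its binary strings.

```python
import re
from typing import Optional


def find_encrypt_dictionary(pdf_bytes: bytes) -> Optional[bytes]:
    """Return the raw body of the object referenced by /Encrypt, or None."""
    # Look for "/Encrypt XX YY R" anywhere in the file
    # (classic trailer or cross-reference stream dictionary).
    ref = re.search(rb"/Encrypt\s+(\d+)\s+(\d+)\s+R", pdf_bytes)
    if ref is None:
        return None
    obj_num, gen_num = ref.group(1), ref.group(2)

    # Find the matching "XX YY obj" header. The lookbehind stops
    # "155 0 obj" from matching when we are looking for "55 0 obj".
    header = re.search(
        rb"(?<!\d)" + obj_num + rb"\s+" + gen_num + rb"\s+obj", pdf_bytes
    )
    if header is None:
        return None

    # Assumption: the object ends at the next "endobj" keyword.
    end = pdf_bytes.find(b"endobj", header.end())
    if end == -1:
        return None
    return pdf_bytes[header.end():end].strip()


if __name__ == "__main__":
    sample = (
        b"trailer\n<</Root 22 0 R/Encrypt 55 0 R/Size 56>>\n"
        b"55 0 obj<</Filter/Standard/V 5>>\nendobj\n"
    )
    print(find_encrypt_dictionary(sample))
```

Reading the file in binary mode (for example with open(path, "rb")) avoids the encoding damage that a text editor can introduce when copying the /O and /U strings.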
Hi Eliot,
Below are the lines from the PDF. I'll also try to attach the file within a day. It contains some PII, so it cannot be shared as it is.
trailer << /Size 18 /Info 2 0 R /Root 17 0 R /ID[ <30383134353431332d303046462d343645462d423846462d453046383732454241373438> <30383134353431332d303046462d343645462d423846462d453046383732454241373438> ] /Encrypt 1 0 R
Second part:
1 0 obj
<< /Filter /Standard
/V 1
/Length 40
/R 2
/O <2055c756c72e1ad702608e8196acad447ad32d17cff583235f6dd15fed7dab67>
/U
Second example:
<</DecodeParms<</Columns 3/Predictor 12>>/Encrypt 8 0 R/Filter/FlateDecode/ID[<43383843373044462D464130422D343836322D424546422D313645383545414139334442><849BD7D0EF733C4583465A492531B962>]/Info 6 0 R/Length 37/Root 9 0 R/Size 7/Type/XRef/W[1 2 0]>>stream
hÞbb``bœû†‰ßŽ‰¡—‰ñ;cpÍ
:™
endstream
endobj
startxref
116
%%EOF
Second part:
8 0 obj <</Filter/Standard/Length 40/O( UÇVÇ.×`Ž–¬DzÓ-Ïõƒ#_mÑ_í}«g)/P 4294967292/R 2/U(¼ÑYÍ~7…œž¾pðþƒ8ÐÿkñB²J¦ºqÈ ‰¬)/V 1>> endobj
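One detail worth noting in the object above is the /P value, 4294967292, which is larger than the maximum signed 32-bit integer. That would be consistent with the System.Decimal.ToInt32 frame in the reported stack trace, although this is a reading of the pasted data rather than a confirmed diagnosis. The Python sketch below shows how such a value maps onto a signed 32-bit integer; the helper name to_signed_32 is illustrative and is not a PdfPig API.

```python
INT32_MAX = 2**31 - 1


def to_signed_32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


if __name__ == "__main__":
    p = 4294967292  # the /P value from the object above
    print(p > INT32_MAX)    # True: too large for a signed 32-bit integer
    print(to_signed_32(p))  # -4
```

The first example in this thread stores its permissions as /P -1028, which is already in signed form and passes through the helper unchanged.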
Attaching the sample file Sample_file.pdf
@securedoccheck thanks for providing that, can you give this version a go and see if it resolves the problem? https://www.nuget.org/packages/PdfPig/0.1.6-alpha-20220111-41bfa
@EliotJones, thanks for the fix. I tested it and it has resolved the original issue.
While validating a few other files for testing purposes, I ran into two other issues.
at UglyToad.PdfPig.Parser.PdfDocumentFactory.ParseTrailer(CrossReferenceTable crossReferenceTable, Boolean isLenientParsing, IPdfTokenScanner pdfTokenScanner, EncryptionDictionary& encryptionDictionary)
at UglyToad.PdfPig.Parser.PdfDocumentFactory.OpenDocument(IInputBytes inputBytes, ISeekableTokenScanner scanner, ILog log, Boolean isLenientParsing, IReadOnlyList`1 passwords, Boolean clipPaths)
at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(IInputBytes inputBytes, ParsingOptions options)
at UglyToad.PdfPig.PdfDocument.Open(Byte[] fileBytes, ParsingOptions options)
Unrecognized encryption token in trailer: null.
Expected name as dictionary key, instead got: {ì%,©¹êjXåvFVÕXðiÙrGÍbyç3Þò
at UglyToad.PdfPig.Tokenization.DictionaryTokenizer.ConvertToDictionary(List`1 tokens)
at UglyToad.PdfPig.Tokenization.DictionaryTokenizer.TryTokenize(Byte currentByte, IInputBytes inputBytes, IToken& token)
at UglyToad.PdfPig.Tokenization.Scanner.CoreTokenScanner.MoveNext()
at UglyToad.PdfPig.Tokenization.ArrayTokenizer.TryTokenize(Byte currentByte, IInputBytes inputBytes, IToken& token)
at UglyToad.PdfPig.Tokenization.Scanner.CoreTokenScanner.MoveNext()
at UglyToad.PdfPig.Tokenization.ArrayTokenizer.TryTokenize(Byte currentByte, IInputBytes inputBytes, IToken& token)
at UglyToad.PdfPig.Tokenization.Scanner.CoreTokenScanner.MoveNext()
at UglyToad.PdfPig.Tokenization.ArrayTokenizer.TryTokenize(Byte currentByte, IInputBytes inputBytes, IToken& token)
at UglyToad.PdfPig.Tokenization.Scanner.CoreTokenScanner.MoveNext()
at UglyToad.PdfPig.Parser.FileStructure.FileHeaderParser.Parse(ISeekableTokenScanner scanner, Boolean isLenientParsing, ILog log)
at UglyToad.PdfPig.Parser.PdfDocumentFactory.OpenDocument(IInputBytes inputBytes, ISeekableTokenScanner scanner, ILog log, Boolean isLenientParsing, IReadOnlyList`1 passwords, Boolean clipPaths)
at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(IInputBytes inputBytes, ParsingOptions options)
at UglyToad.PdfPig.PdfDocument.Open(Byte[] fileBytes, ParsingOptions options)
For the first error, below is the data/stream from the PDF:
/Type /XRef
/Root 8 0 R
/Prev 116
/Length 84
/Size 35
/W [1 3 2]
/Index [0 1 6 1 8 2 25 10]
/ID [
stream
ÿÿ ‘J ‘ô ¢& ’Ô ¡÷ ¢
¡µ £~ £— ¢ “ ¤
ž½
endstream
endobj
startxref
695997
%%EOF
For the second error:
trailer
<</Info 11 0 R/ID [
It looks like there is no /Encrypt entry or <</DecodeParms section within the PDF.
@securedoccheck there should be a new NuGet package shipping at midnight UTC with the fix for the first of the 2 issues. For the second, it looks like the error is at the start of the file. Can you copy a few lines from the start of the file? They will probably look like:
%PDF-1.6
%âãÏÓ
7 0 obj
<</Linearized 1/L 7259/O 10/E 2883/N 1/T 6916/H [ 504 132]>>
endobj
13 0 obj
Please copy down as far as the first dictionary (<< ... >>) occurrence, since that looks to be where the problem is.
¬í ur [Ljava.lang.Object;ÎXŸs)l xp sr java.util.HashMapÚÁÃ`Ñ F loadFactorI thresholdxp?@ w t ETagt %W/"1ae56-SgR20izELlBssSALl/6suyfyfqw"t Access-Control-Allow-Credentialst truet Connectiont keep-alivet Content-Lengtht 110166t Access-Control-Allow-Headerst .Origin, X-Requested-With, Content-Type, Acceptt Datet Wed, 05 Jan 2022 10:50:21 GMTt X-Powered-Byt Expresst Content-Typet application/pdfxur [B¬óøTà xp ®V%PDF-1.4 %âãÏÓ
I see what you mean; it may be an issue with the way the PDF was created. I was able to read the document once I removed the extra content before %PDF-1.4.
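For anyone hitting the same problem, the workaround of removing the content before the header can be sketched in a few lines of Python. This is a standalone illustration, not part of PdfPig; the function name strip_before_pdf_header is hypothetical, and it assumes the first occurrence of %PDF- in the data is the real file header.

```python
def strip_before_pdf_header(data: bytes) -> bytes:
    """Drop any bytes that precede the first '%PDF-' marker."""
    marker = b"%PDF-"
    start = data.find(marker)
    if start == -1:
        raise ValueError("no %PDF- header found")
    return data[start:]


if __name__ == "__main__":
    # A made-up wrapper prefix followed by a PDF header.
    wrapped = b"\xac\xed\x00\x05ur junk application/pdf%PDF-1.4\n%\xe2\xe3\xcf\xd3\n"
    cleaned = strip_before_pdf_header(wrapped)
    print(cleaned[:8])  # b'%PDF-1.4'
```

The cleaned bytes can then be saved to a new file or handed to the PDF reader in place of the original. Fixing whatever step wraps the PDF in extra data is the more robust solution.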
Thanks Eliot. Really appreciate the help!
Closing this since I think it was resolved, let me know if you encounter any issues.
Hi,
Thanks for providing us with this great tool.
I'm currently using v0.1.5 and am having an issue when we try to open certain PDF files. Most of the PDFs work fine, but a small number throw the stack trace below.
at System.Decimal.ToInt32(Decimal d)
at UglyToad.PdfPig.Tokens.NumericToken.get_Int()
at UglyToad.PdfPig.Encryption.EncryptionDictionaryFactory.Read(DictionaryToken encryptionDictionary, IPdfTokenScanner tokenScanner)
at UglyToad.PdfPig.Parser.PdfDocumentFactory.ParseTrailer(CrossReferenceTable crossReferenceTable, Boolean isLenientParsing, IPdfTokenScanner pdfTokenScanner, EncryptionDictionary& encryptionDictionary)
at UglyToad.PdfPig.Parser.PdfDocumentFactory.OpenDocument(IInputBytes inputBytes, ISeekableTokenScanner scanner, ILog log, Boolean isLenientParsing, IReadOnlyList`1 passwords, Boolean clipPaths)
at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open(IInputBytes inputBytes, ParsingOptions options)
at UglyToad.PdfPig.PdfDocument.Open(Byte[] fileBytes, ParsingOptions options)
The above error happens on the line of code below, which tries to open the file from bytes or a Stream.
PdfDocument document = PdfDocument.Open(verifyFile.fileByte, parOpt);
Could you please help look into this? I really appreciate your help!
Thanks, Andy