Open eoinof opened 3 years ago
Hi @eoinof, thanks for sharing the error and a fix. The encoding variable does contain a list of character names. Could you read section 5.5.5 of the PDF Reference and see if you can figure out what type of font you have, and whether and how your PDF deviates from the reference?
If you don't have the time, could you close this issue? There is not much we can do without the actual PDF sample.
Hi @pietermarsman
Yes, I'm just completing a Python 3 update on a codebase, but I plan to figure out the cause once that is done.
Eoin
Bug report
I'm seeing a crash in the latest release of pdfminer.six (20200726) with certain PDF files. Unfortunately, for privacy reasons, I can't share these.
The crash occurs because the 'encoding' variable in pdffont.PDFSimpleFont.__init__ is a list, as opposed to either a dict or a string.
This is the value of 'encoding' that triggers the crash
Stacktrace:
I understand that the root cause of the issue is an incorrectly generated encoding in the spec variable, but for our purposes simply ignoring the list value is a satisfactory, if inelegant, solution. I'll update this thread once we have some time to spend understanding the root cause.
I've included an example monkey patch in case anyone else needs to resolve the issue without much effort.
Example Monkey Patch
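For anyone who wants a starting point, the "ignore the list value" approach described above could be sketched roughly as follows. This is not the patch from the report, only an illustration under assumptions: `sanitize_spec`, `patch_font_init` and `FakeSimpleFont` are hypothetical names, the stand-in class only imitates the dict-or-name handling described in this thread, and the `(descriptor, widths, spec)` constructor signature is assumed rather than checked against the 20200726 release.

```python
import functools


def sanitize_spec(spec, resolve=lambda obj: obj):
    """Return spec unchanged unless its resolved 'Encoding' entry is a list.

    In that case return a shallow copy without 'Encoding', so the caller
    falls back to whatever default encoding it normally uses.
    `resolve` is a hook for dereferencing indirect objects (identity here).
    """
    if isinstance(resolve(spec.get("Encoding")), list):
        cleaned = dict(spec)
        del cleaned["Encoding"]
        return cleaned
    return spec


def patch_font_init(font_cls, resolve=lambda obj: obj):
    """Monkey patch font_cls.__init__ so list-valued encodings are ignored."""
    original_init = font_cls.__init__

    @functools.wraps(original_init)
    def patched_init(self, descriptor, widths, spec):
        # Assumed signature: (descriptor, widths, spec).
        original_init(self, descriptor, widths, sanitize_spec(spec, resolve))

    font_cls.__init__ = patched_init


class FakeSimpleFont:
    """Hypothetical stand-in that only accepts a dict or a name as encoding."""

    def __init__(self, descriptor, widths, spec):
        encoding = spec.get("Encoding", "StandardEncoding")
        if isinstance(encoding, dict):
            self.encoding_name = encoding.get("BaseEncoding", "StandardEncoding")
        elif isinstance(encoding, str):
            self.encoding_name = encoding
        else:
            # Imitates the crash: any other type is rejected.
            raise TypeError("unsupported encoding: %r" % (encoding,))


patch_font_init(FakeSimpleFont)
font = FakeSimpleFont({}, [], {"Encoding": ["space", "a", "b"]})
print(font.encoding_name)
```

With pdfminer.six itself, the equivalent call would presumably be `patch_font_init(pdfminer.pdffont.PDFSimpleFont, pdfminer.pdftypes.resolve1)`, run once before parsing any documents. That usage is untested here, and dropping the encoding means the affected fonts may decode some characters incorrectly.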