Thanks for raising this issue, @154192. The immediate issue appears to be that the fontnames for some of the PDF's characters are being read as bytes (e.g., b'RGJSAP+\xcb\xce\xcc\xe5') instead of strings. I'm not yet sure whether this is an issue with the PDF itself or with pdfminer.six, the library pdfplumber uses as its PDF parser. I hope to take a closer look soon.
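In the meantime, here is a rough workaround sketch for anyone hitting the same thing. It is not what pdfplumber or pdfminer.six does internally; `normalize_fontname` is a hypothetical helper, and the list of candidate encodings (UTF-8, then GBK, then Latin-1) is only my guess based on the sample bytes above.

```python
from typing import Union

# Candidate encodings are an assumption, not something the PDF declares.
# latin-1 comes last because it can decode any byte sequence.
CANDIDATE_ENCODINGS = ("utf-8", "gbk", "latin-1")


def normalize_fontname(fontname: Union[str, bytes]) -> str:
    """Return the font name as a str, decoding it if it arrived as bytes.

    Hypothetical helper for illustration only.
    """
    if isinstance(fontname, str):
        return fontname
    for encoding in CANDIDATE_ENCODINGS:
        try:
            return fontname.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Not reachable while latin-1 is in the list; kept as a safety net.
    return fontname.decode("latin-1", errors="replace")


raw = b"RGJSAP+\xcb\xce\xcc\xe5"
print(normalize_fontname(raw))
```

With pdfplumber, something like this could be applied to the "fontname" value of each entry in `page.chars` before further processing, assuming that is where the bytes show up for you.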
This should now be fixed in v0.9.0, but let me know if it's still not working for you.
300218_2011.pdf