metachris / pdfx

Extract text, metadata and references (pdf, url, doi, arxiv) from PDF. Optionally download all referenced PDFs.
http://www.metachris.com/pdfx
Apache License 2.0

PDF fails to open if special character in path #23

Closed: oliviercailloux closed this issue 3 years ago

oliviercailloux commented 7 years ago

$ pdfx Prés/presentation.pdf
Traceback (most recent call last):
  File "/usr/local/bin/pdfx", line 9, in <module>
    load_entry_point('pdfx==1.3.0', 'console_scripts', 'pdfx')()
  File "build/bdist.linux-x86_64/egg/pdfx/cli.py", line 149, in main
  File "build/bdist.linux-x86_64/egg/pdfx/__init__.py", line 99, in __init__
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2: ordinal not in range(128)

But the PDF opens fine when run from inside the directory:

$ cd Prés
$ pdfx presentation.pdf
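
For readers hitting the same message: the error text is consistent with the path being handled as a byte string and decoded with the ASCII codec, which is likely what happens under Python 2 when bytes and unicode are mixed. The "é" in "Prés" is encoded in UTF-8 as the two bytes 0xc3 0xa9, and 0xc3 sits at index 2, matching "position 2" in the traceback. The sketch below reproduces the message in Python 3. It is an illustration only, not taken from the pdfx source; `decode_path` is a hypothetical helper, and the assumption that the shell passes the path as UTF-8 is mine.

```python
# Hypothetical reproduction of the error message; not pdfx code.
# "Pr\u00e9s" is "Prés"; we assume the shell hands it over as UTF-8 bytes.
path_bytes = "Pr\u00e9s/presentation.pdf".encode("utf-8")


def decode_path(raw, encoding):
    """Decode a byte-string path; return (text, error_message)."""
    try:
        return raw.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, str(exc)


# Decoding as ASCII fails on the first byte of the encoded "é".
ascii_text, ascii_error = decode_path(path_bytes, "ascii")
# Decoding with the encoding that was actually used succeeds.
utf8_text, utf8_error = decode_path(path_bytes, "utf-8")

print(ascii_error)
print(utf8_text)
```

This would also explain why running from inside the directory works: the argument `presentation.pdf` is then pure ASCII, so an ASCII decode has nothing to trip over.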

metachris commented 3 years ago

This should be fixed now. Please let me know if the problem persists with v1.4.1.