metachris / pdfx

Extract text, metadata and references (pdf, url, doi, arxiv) from PDF. Optionally download all referenced PDFs.
http://www.metachris.com/pdfx
Apache License 2.0
1.05k stars, 115 forks

Replace pdfminer dependency with pdfminer.six #28

Closed marsam closed 3 years ago

marsam commented 6 years ago

pdfminer.six is a well-maintained fork of pdfminer that supports both Python 2 and Python 3.
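
To illustrate the kind of change being requested, here is a minimal sketch of swapping the dependency name in a list of requirement strings. The helper `swap_pdfminer` and the sample requirement list are hypothetical and are not taken from pdfx; the actual change in the project may look different.

```python
import re

# Hypothetical helper for illustration only; not part of pdfx.
# Matches a requirement that starts with the bare name "pdfminer",
# but not "pdfminer.six" or other packages such as "pdfminer3k".
_PDFMINER = re.compile(r"^pdfminer(?![.\w-])")


def swap_pdfminer(requirements):
    """Return a new list with `pdfminer` replaced by `pdfminer.six`.

    Any version specifier after the name (e.g. "==1.0") is left untouched.
    """
    return [_PDFMINER.sub("pdfminer.six", req.strip()) for req in requirements]


# Illustrative requirement list (assumed, not the project's real one).
old_requirements = ["pdfminer", "example-package"]
new_requirements = swap_pdfminer(old_requirements)
print(new_requirements)
```

Entries that already name `pdfminer.six` are left unchanged, so the helper can be applied more than once without side effects.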

metachris commented 3 years ago

Thanks!