metachris / pdfx

Extract text, metadata and references (pdf, url, doi, arxiv) from PDF. Optionally download all referenced PDFs.
http://www.metachris.com/pdfx
Apache License 2.0

Detects pdf URLs that end with parameters (e.g. ?dl=1 on dropbox) #34

Closed by daviddekoning 3 years ago

daviddekoning commented 5 years ago

Hi Chris,

This is a small change so that PDF links are still identified correctly when a set of query parameters follows the ".pdf" extension (e.g. ?dl=1 on Dropbox).
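As a rough illustration of the idea (not the actual patch in this pull request), one way to ignore trailing parameters is to parse the URL and check only its path component. The function name `looks_like_pdf_url` is hypothetical and is not part of pdfx; only the standard library's `urllib.parse.urlparse` is used.

```python
from urllib.parse import urlparse


def looks_like_pdf_url(url: str) -> bool:
    """Return True if the URL path ends in .pdf, ignoring query and fragment.

    Hypothetical helper for illustration only; pdfx's own detection
    logic may differ.
    """
    # urlparse splits off the query ("?dl=1") and fragment ("#page=2"),
    # so only the path itself is tested for the extension.
    path = urlparse(url).path
    return path.lower().endswith(".pdf")


# Example usage
print(looks_like_pdf_url("https://www.dropbox.com/s/abc/paper.pdf?dl=1"))
print(looks_like_pdf_url("https://example.com/page.html?file=x.pdf"))
```

Checking the parsed path rather than the raw string also avoids false positives when ".pdf" appears only inside a query parameter, as in the second URL.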

cheers, d

metachris commented 3 years ago

Thanks 👍