
metachris / pdfx

Extract text, metadata and references (pdf, url, doi, arxiv) from PDF. Optionally download all referenced PDFs.

http://www.metachris.com/pdfx

Apache License 2.0


Work with targets other than PDF (e.g. HTML, text, etc.) #7

Closed metachris closed 8 years ago

metachris commented 8 years ago

At least think about extracting PDFs from websites etc.

ghost-hacked commented 8 years ago

Could you explain further what you want done?

metachris commented 8 years ago

I was thinking that if you point pdfx at a website instead of a PDF, it should also try to extract links and PDFs from that page.
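To make the idea concrete, here is a minimal sketch of what extracting links and PDF references from an HTML page might look like. It is not pdfx's actual implementation: the names `LinkCollector` and `extract_links` are hypothetical, only the Python standard library is used, and treating any URL whose path ends in `.pdf` as a PDF is a simplifying assumption (a real check would probably also look at the Content-Type of the response).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Hypothetical helper: collect absolute URLs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return (all_links, pdf_links) found in an HTML string."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    # Assumption: a link is a PDF if its URL path ends in ".pdf".
    pdf_links = [
        url for url in collector.links
        if urlparse(url).path.lower().endswith(".pdf")
    ]
    return collector.links, pdf_links


sample = '<a href="/papers/a.pdf">A</a> <a href="https://example.org/about">About</a>'
links, pdfs = extract_links(sample, "https://example.org/")
```

In this sketch the HTML is passed in as a string, so fetching the page and downloading the referenced PDFs are left out on purpose.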