metachris / pdfx

Extract text, metadata and references (pdf, url, doi, arxiv) from PDF. Optionally download all referenced PDFs.
http://www.metachris.com/pdfx
Apache License 2.0

JSON Output for Check Links Subcommand #47

Open ohsh6o opened 3 years ago

ohsh6o commented 3 years ago

Hello, and thank you for developing such a useful tool. I would like to evaluate it as part of a work project (see GSA/fedramp-automation#130). To move forward, I would like to be able to combine the --check-links and --json options, so that pdfx checks every referenced link and reports the result for each detected hyperlink in JSON.
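To make the request concrete, here is a rough sketch of the kind of per-link report I have in mind. It does not use pdfx internals: `check_url`, `link_report`, and the field names `url`, `status`, `ok`, and `error` are all hypothetical, and the list of URLs is assumed to come from whatever pdfx already extracts from the PDF.

```python
import json
import urllib.error
import urllib.request


def check_url(url, timeout=10):
    """Send a HEAD request and return (status_code, error_message).

    Hypothetical helper: status_code is None when no HTTP response
    was received (DNS failure, timeout, malformed URL, ...).
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, None
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404.
        return exc.code, str(exc.reason)
    except (urllib.error.URLError, ValueError, OSError) as exc:
        # No usable HTTP response at all.
        return None, str(exc)


def link_report(urls, checker=check_url):
    """Check each URL and return the results as a JSON string."""
    results = []
    for url in urls:
        status, error = checker(url)
        results.append(
            {
                "url": url,
                "status": status,
                "ok": status is not None and 200 <= status < 400,
                "error": error,
            }
        )
    return json.dumps({"links": results}, indent=2)
```

Calling `print(link_report(urls))` on the extracted references would then give one JSON object per hyperlink. The `checker` argument is only there so that the report format can be exercised without network access.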

If this project is still maintained, I can submit a pull request. If not, I can evaluate the feasibility of implementing this in a fork.