ad-freiburg / pdfact

A basic tool that extracts the structure from the PDF files of scientific articles.
Apache License 2.0
68 stars, 11 forks

Expose an API for pdf parsing #17

Open avvertix opened 1 month ago

avvertix commented 1 month ago

Hi all, thanks for the project. Pdfact is a great step forward in extracting text from PDFs. Are you planning to accept contributions, like exposing it over a web-based API?

dnlbauer commented 1 month ago

Hi @avvertix, I'm not one of the people working on pdfact, but I wrote a Flask wrapper app that exposes pdfact as a service. Maybe https://github.com/dnlbauer/pdfact-service is what you need? I have been using it in production for some time now and it hasn't let me down yet. Check it out!
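
For readers wondering what such a wrapper roughly involves, here is a minimal sketch. It is not taken from pdfact-service, and it uses only the Python standard library (a plain WSGI callable) rather than Flask. It assumes pdfact can be run as `java -jar pdfact.jar <pdf-file>` and prints the extracted text to stdout when no output file is given. The JAR path and the names `build_command`, `extract`, and `app` are hypothetical.

```python
import os
import subprocess
import tempfile

# Assumption: path to a locally built pdfact JAR; the real file name may differ.
PDFACT_JAR = "pdfact.jar"


def build_command(pdf_path, jar_path=PDFACT_JAR):
    """Return the argument list for one pdfact run on pdf_path."""
    return ["java", "-jar", jar_path, pdf_path]


def extract(pdf_bytes, run=subprocess.run, jar_path=PDFACT_JAR):
    """Write the uploaded bytes to a temp file, run pdfact, return its stdout."""
    fd, path = tempfile.mkstemp(suffix=".pdf")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(pdf_bytes)
        # check=True raises CalledProcessError if pdfact exits with an error.
        result = run(build_command(path, jar_path), capture_output=True, check=True)
        return result.stdout
    finally:
        # Always remove the temporary upload, even if pdfact failed.
        os.remove(path)


def app(environ, start_response):
    """WSGI entry point: POST a PDF as the request body, get the text back."""
    if environ["REQUEST_METHOD"] != "POST":
        start_response("405 Method Not Allowed", [("Allow", "POST")])
        return [b""]
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = extract(environ["wsgi.input"].read(length))
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [body]
```

To try it locally, `wsgiref.simple_server.make_server("", 8080, app).serve_forever()` serves the callable on port 8080. A real deployment would also need upload size limits, timeouts, and error responses for failed pdfact runs.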

avvertix commented 1 month ago

Great, @dnlbauer, thanks for the pointer.