nosia-ai / nosia

Nosia is a platform that allows you to run an AI model on your own data. It is designed to be easy to install and use.
https://nosia.ai
MIT License

Improve PDF parsing #21

Open cbldev opened 6 days ago

cbldev commented 6 days ago

Actual behavior

I noticed that on some complex PDFs, especially those with tables, pdftotext produces better results than the pdf-reader gem.

pdftotext: https://www.xpdfreader.com/pdftotext-man.html

Issue in Langchainrb: https://github.com/patterns-ai-core/langchainrb/issues/682

Expected behavior

Good results when parsing complex PDFs, including those with tables.
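
A minimal sketch of what shelling out to pdftotext from Ruby could look like, to make the comparison easier to try. It assumes the `pdftotext` binary is installed and on the `PATH`. The helper names `pdftotext_command` and `extract_pdf_text` are hypothetical and not part of Nosia or Langchainrb. The `-layout` flag asks pdftotext to keep the physical layout of the page, which may be what helps with tables, and `-` as the output file sends the text to stdout.

```ruby
require "open3"

# Hypothetical helper: builds the argument list for pdftotext.
# -layout keeps the physical page layout, -enc UTF-8 sets the output
# encoding, and "-" as the output file writes the text to stdout.
def pdftotext_command(pdf_path, layout: true)
  cmd = ["pdftotext"]
  cmd << "-layout" if layout
  cmd + ["-enc", "UTF-8", pdf_path, "-"]
end

# Hypothetical helper: runs pdftotext and returns the extracted text.
# Raises Errno::ENOENT if the binary is missing, or a RuntimeError
# if pdftotext exits with a non-zero status.
def extract_pdf_text(pdf_path, layout: true)
  stdout, stderr, status = Open3.capture3(*pdftotext_command(pdf_path, layout: layout))
  raise "pdftotext failed: #{stderr.strip}" unless status.success?

  stdout
end
```

Passing the arguments as an array rather than a single string avoids shell interpolation of the file path. Calling `extract_pdf_text("report.pdf")` on a sample file (the file name is only an example) and comparing its output with the pdf-reader output would show whether the difference holds for a given document.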