Open cbldev opened 5 months ago
Docling seems to be a better option
Actual behavior
I noticed that on some complex PDFs with tables, pdftotext produces better results than the pdf-reader gem.
pdftotext: https://www.xpdfreader.com/pdftotext-man.html
Issue in Langchainrb: https://github.com/patterns-ai-core/langchainrb/issues/682
Expected behavior
Good results on complex PDF parsing.
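
For illustration, here is a minimal sketch of how a Ruby caller might prefer pdftotext and fall back to pdf-reader when the binary is unavailable. This is not how Langchainrb is implemented; the helper names (`pdftotext_command`, `extract_with_pdftotext`, `extract_with_pdf_reader`, `extract_text`) are hypothetical. It assumes the `pdftotext` binary is on the PATH and uses its `-layout` option, which keeps the physical layout of the page and may help with tables.

```ruby
require "open3"

# Build the argument list for pdftotext (hypothetical helper).
# "-layout" asks pdftotext to preserve the physical page layout;
# a trailing "-" sends the extracted text to stdout.
def pdftotext_command(pdf_path, layout: true)
  cmd = ["pdftotext"]
  cmd << "-layout" if layout
  cmd + [pdf_path, "-"]
end

# Returns the extracted text, or nil when pdftotext is not installed
# or exits with a non-zero status (hypothetical helper).
def extract_with_pdftotext(pdf_path)
  stdout, _stderr, status = Open3.capture3(*pdftotext_command(pdf_path))
  status.success? ? stdout : nil
rescue Errno::ENOENT
  # Raised when the pdftotext binary cannot be found.
  nil
end

# Fallback that uses the pdf-reader gem (loaded lazily so the gem is
# only required when the fallback is actually needed).
def extract_with_pdf_reader(pdf_path)
  require "pdf/reader"
  PDF::Reader.new(pdf_path).pages.map(&:text).join("\n")
end

# Prefer pdftotext, fall back to pdf-reader.
def extract_text(pdf_path)
  extract_with_pdftotext(pdf_path) || extract_with_pdf_reader(pdf_path)
end
```

Calling `extract_text("report.pdf")` (the file name is only an example) would then return the pdftotext output when the tool is installed, and the pdf-reader output otherwise.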