Future-House / paper-qa

High accuracy RAG for answering questions from scientific documents with citations
Apache License 2.0
5.45k stars 517 forks

Fixing crash in `chunk_text` for empty file #389

Closed: jamesbraza closed this 4 days ago

jamesbraza commented 4 days ago

It looks like `chunk_pdf` handles empty PDFs by raising `ImpossibleParsingError`, but `chunk_text` doesn't have the same logic. This PR ports that logic to `chunk_text` and expands the tests to cover this edge case.
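For readers unfamiliar with the pattern, here is a minimal sketch of the kind of guard described above. It is not the code from this PR: the `ImpossibleParsingError` class below is a local stand-in for the one in paper-qa, and the `chunk_text` signature, its parameters, and the choice to treat whitespace-only input as empty are all assumptions made for illustration.

```python
class ImpossibleParsingError(Exception):
    """Local stand-in for the exception named in this PR (assumption:
    the real class is defined inside paper-qa and may differ)."""


def chunk_text(text: str, chunk_chars: int = 100, overlap: int = 10) -> list[str]:
    """Hypothetical, simplified character chunker.

    This is NOT paper-qa's actual signature; it only illustrates failing
    fast on empty input instead of crashing later in the chunking logic.
    """
    if chunk_chars <= overlap:
        raise ValueError("chunk_chars must be larger than overlap")
    # Assumption: whitespace-only input is treated the same as empty input.
    if not text.strip():
        raise ImpossibleParsingError(
            "Text was empty or whitespace-only, so there is nothing to chunk."
        )
    # Slide a window of chunk_chars characters, advancing by (chunk_chars - overlap).
    step = chunk_chars - overlap
    return [text[i : i + chunk_chars] for i in range(0, len(text), step)]


if __name__ == "__main__":
    try:
        chunk_text("")
    except ImpossibleParsingError as exc:
        print(f"Skipping unparseable document: {exc}")
```

Raising a dedicated exception lets callers distinguish "this document has no usable content" from a genuine bug, which is presumably why the same error type is reused across both chunkers.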