JSv4 / OpenContracts

Mass document analytics platform based on LlamaIndex, Pgvector, React and Django.
https://JSv4.github.io/OpenContracts/
Apache License 2.0

Dynamically Apply OCR, Improve PDF Utilities and Tests #167

Closed JSv4 closed 1 month ago

JSv4 commented 1 month ago

Improve PDF utility tests

Added a check to the nlm-ingestor parsing step so that OCR is applied dynamically, only when a PDF has NOT already been OCRed.
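A minimal sketch of what such a check could look like, not the code merged in this PR. It assumes the per-page text has already been extracted (for example with a PDF text extractor), treats a PDF as "already OCRed" when the extracted text holds at least `min_chars` non-whitespace characters, and assumes the parser accepts `renderFormat` and `applyOcr` query parameters. The names `has_text_layer`, `build_parse_params`, and the `min_chars` threshold are hypothetical.

```python
from typing import Dict, Iterable


def has_text_layer(page_texts: Iterable[str], min_chars: int = 20) -> bool:
    """Return True if the extracted page text looks like a real text layer.

    Assumption: a scanned, non-OCRed PDF yields little or no extractable text.
    """
    # Count non-whitespace characters across all pages.
    total = sum(len("".join(text.split())) for text in page_texts)
    return total >= min_chars


def build_parse_params(page_texts: Iterable[str]) -> Dict[str, str]:
    """Build parser query params, requesting OCR only when no text layer exists.

    Assumption: the parameter names below match what the parsing service expects.
    """
    params = {"renderFormat": "all"}
    if not has_text_layer(page_texts):
        # No usable text layer found, so ask the parser to run OCR.
        params["applyOcr"] = "yes"
    return params
```

With this shape, a PDF that already carries a text layer is sent to the parser without the OCR flag, which avoids re-running OCR on documents that do not need it.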

codecov[bot] commented 1 month ago

Codecov Report

Attention: Patch coverage is 80.95238% with 4 lines in your changes missing coverage. Please review.

Project coverage is 71.44%. Comparing base (6f53378) to head (d972338). Report is 5 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| opencontractserver/utils/pdf.py | 78.94% | 4 Missing :warning: |
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #167      +/-   ##
==========================================
+ Coverage   70.44%   71.44%   +1.00%
==========================================
  Files          59       59
  Lines        2700     2718      +18
==========================================
+ Hits         1902     1942      +40
+ Misses        798      776      -22
```
