freelawproject / doctor

A microservice for document conversion at scale
https://free.law/projects/doctor
BSD 2-Clause "Simplified" License

feat(text-extract): Add fix and test for recap-> op #193

Closed: flooie closed this 5 months ago

flooie commented 5 months ago

Margin stripping fails when maps or addendums are horizontal.

This adds a fix and tests for these documents.

sentry-io[bot] commented 5 months ago

🔍 Existing Issues For Review

Your pull request is modifying functions with the following pre-existing issues:

📄 File: doctor/lib/text_extraction.py

Function: get_page_text
Unhandled Issue: ValueError: Bounding box (0, 93.17647058823529, 792, 931.7647058823529) is not fully within parent page bound... ...
Event Count: 3
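
For readers unfamiliar with this class of error, here is a minimal sketch of one way to keep a margin-stripping crop box inside the page. It is not the fix from this pull request. The helper `clamp_bbox` is hypothetical, and the page size (a landscape US Letter page, 792 x 612 points) is an assumption; the report above truncates the parent page bounds. Only the requested bounding box is taken from the ValueError.

```python
from typing import Tuple

# (x0, top, x1, bottom) in PDF points, origin at the top-left of the page.
BBox = Tuple[float, float, float, float]


def clamp_bbox(bbox: BBox, page_bbox: BBox) -> BBox:
    """Hypothetical helper: intersect a crop box with the page box."""
    x0, top, x1, bottom = bbox
    px0, ptop, px1, pbottom = page_bbox
    clamped = (
        max(x0, px0),
        max(top, ptop),
        min(x1, px1),
        min(bottom, pbottom),
    )
    # An empty intersection means the crop box misses the page entirely.
    if clamped[0] >= clamped[2] or clamped[1] >= clamped[3]:
        raise ValueError("crop box does not overlap the page")
    return clamped


# Assumed page bounds: landscape US Letter. Not stated in the report.
page_bbox = (0, 0, 792, 612)
# Crop box copied from the ValueError message.
requested = (0, 93.17647058823529, 792, 931.7647058823529)

safe = clamp_bbox(requested, page_bbox)
print(safe)
```

Under the assumed page size, the requested bottom edge (about 931.8) lies below the page's bottom edge (612), so the clamped box ends at the page boundary. Clamping only avoids the exception; whether the resulting margins are right for horizontal pages is a separate question that the tests in this pull request would need to cover.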
