kartikbhtt7 closed 4 months ago
Several updates were made across the `OCR/florence` module, focusing on keyword checking and the handling of the distance cutoff. Specifically, the default `distance_cutoff` was changed from 1 to 0 in `api.py` and `request.py`. Additionally, the `check_keywords` method was enhanced to calculate and return detailed Levenshtein distance metrics, and the `run_ocr` method was improved to handle failed data extraction more gracefully.
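To make the effect of `distance_cutoff` concrete, here is a minimal sketch of keyword checking by Levenshtein distance. It is not the code from `OCR/florence/model.py`: the function bodies, the word-level comparison, and the `min_distance` / `matched` keys in the report are all assumptions chosen for illustration. Only the names `check_keywords` and `distance_cutoff` come from the summary above.

```python
from __future__ import annotations


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep when equal)
            ))
        previous = current
    return previous[-1]


def check_keywords(text: str, keywords: list[str], distance_cutoff: int = 0) -> dict:
    """Hypothetical sketch: report the closest word in `text` for each keyword.

    The report layout is an assumption, not the project's actual return value.
    """
    words = text.lower().split()
    report = {}
    for keyword in keywords:
        target = keyword.lower()
        # With no words to compare against, fall back to the keyword length.
        best = min((levenshtein(target, w) for w in words), default=len(target))
        report[keyword] = {
            "min_distance": best,
            "matched": best <= distance_cutoff,
        }
    return report
```

Under these assumptions, a cutoff of 0 accepts only exact word matches, while the previous default of 1 would also accept a keyword that is one edit away from an OCR token (for example `totl` for `total`).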
| File | Change Summary |
|---|---|
| `OCR/florence/api.py` | Changed `distance_cutoff` value from 1 to 0 in the `embed` function. |
| `OCR/florence/model.py` | Enhanced `check_keywords` to calculate Levenshtein distance; modified `run_ocr` for better error handling. |
| `OCR/florence/request.py` | Changed the `distance_cutoff` default parameter from 1 to 0 in the `ModelRequest` class `__init__` method. |
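The `run_ocr` change is described only at a high level, so the following is a hedged sketch of one way to return a structured result when data extraction is enabled but yields nothing. The function name `run_ocr_sketch`, the `recognize` and `extract` callables, and the `status` values are hypothetical and are not taken from the pull request.

```python
from typing import Any, Callable, Optional


def run_ocr_sketch(
    image: Any,
    recognize: Callable[[Any], str],
    extract: Optional[Callable[[str], Optional[dict]]] = None,
    extract_data: bool = False,
) -> dict:
    """Hypothetical sketch: always return the same result shape.

    `recognize` turns an image into text; `extract` pulls structured
    fields out of that text. Both are stand-ins for the real model calls.
    """
    text = recognize(image)
    result = {"text": text, "data": None, "status": "ok"}

    if not extract_data or extract is None:
        return result  # extraction not requested: text only

    try:
        data = extract(text)
    except Exception as exc:  # keep the OCR text even if extraction breaks
        result["status"] = "extraction_error"
        result["error"] = str(exc)
        return result

    if not data:
        # Extraction ran but found nothing: report it instead of raising.
        result["data"] = {}
        result["status"] = "no_data_found"
        return result

    result["data"] = data
    return result
```

Returning the same dictionary shape in every branch means a caller can check `status` without wrapping each call in its own exception handling.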
In Florence's code, a shift so fine,
`Distance_cutoff` turned to zero, a new design.
Keywords checked with care anew,
Levenshtein danced in text review.
OCR's path now clear and bright,
A rabbit's joy in code's delight!
Summary by CodeRabbit
New Features
The `run_ocr` method now handles cases where data extraction is enabled but no data is found, returning structured results.
Improvements