OCR-D / ocrd_tesserocr

Run tesseract with the tesserocr bindings with @OCR-D's interfaces
MIT License

Recognize with padding #126

Closed bertsky closed 4 years ago

bertsky commented 4 years ago

This has been on our to-do list for a long time. I don't know why we did not just do it.

This is for cases where the line polygon is quite close to the foreground and there is no AlternativeImage with some padding around the margins (as ocrd-cis-ocropy-dewarp would yield). AFAICT Tesseract needs some margin (I'd say at least 5px) for good recognition.
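To illustrate the idea, here is a minimal sketch of padding a cropped line image with background pixels before handing it to the recognizer. It is not the code from this PR: the function name `pad_line_image`, the list-of-rows grayscale representation, and the white background value of 255 are assumptions made for illustration, and the 5px default only mirrors the estimate above.

```python
def pad_line_image(pixels, padding=5, background=255):
    """Return a copy of a grayscale image with a uniform border added.

    `pixels` is a non-empty list of equally long rows of integer
    gray values (hypothetical representation, chosen to avoid
    third-party dependencies in this sketch).
    """
    if padding < 0:
        raise ValueError("padding must be non-negative")
    if not pixels or not pixels[0]:
        raise ValueError("image must have at least one pixel")

    new_width = len(pixels[0]) + 2 * padding
    side = [background] * padding

    # Blank rows above the line.
    padded = [[background] * new_width for _ in range(padding)]
    # Original rows, extended to the left and right.
    for row in pixels:
        padded.append(side + list(row) + side)
    # Blank rows below the line.
    padded.extend([background] * new_width for _ in range(padding))
    return padded
```

With this approach, any coordinates reported for the padded image (for example word or glyph boxes) would have to be shifted back by `padding` in both directions before being written into the line's coordinate system.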

codecov[bot] commented 4 years ago

Codecov Report

Merging #126 into master will decrease coverage by 0.83%. The diff coverage is 10.34%.

@@            Coverage Diff             @@
##           master     #126      +/-   ##
==========================================
- Coverage   37.09%   36.26%   -0.84%     
==========================================
  Files           9        9              
  Lines         965      990      +25     
  Branches      214      218       +4     
==========================================
+ Hits          358      359       +1     
- Misses        543      565      +22     
- Partials       64       66       +2     
Impacted Files | Coverage Δ
ocrd_tesserocr/recognize.py | 46.60% <10.34%> (-3.76%) ↓

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 8aa901c...a760af4.

kba commented 4 years ago

Looks good! Shall I release 0.8.3?

No need, you are owner of the project on PyPI, so release as you see fit :)

bertsky commented 4 years ago

> No need, you are owner of the project on PyPI, so release as you see fit :)

Damn, now I got the order confused (PyPI release before merge/GH release).