Closed Jerry2001Qu closed 3 years ago
Code looks great, good job!
Please also add a short unit test to test_all.py and a short section to the README, maybe # OCR,
to document the feature.
Output of python test_all.py quality ; python test_all.py benchmark
before changes:
quality:
en bad: 115/400
pl bad: 23/40
tr bad: 10/27
ru bad: 1/10
uk bad: 6/12
es bad: 6/18
pt bad: 7/13
cs bad: 44/170
benchmarks:
english sentences 0.179s bad: 0/3
english sentences fast 0.001s bad: 1/3
spanish words 0.692s bad: 6/18
after changes:
quality:
en bad: 115/400
pl bad: 23/40
tr bad: 10/27
ru bad: 1/10
uk bad: 6/12
es bad: 6/18
pt bad: 7/13
cs bad: 44/170
benchmarks:
english sentences 0.179s bad: 0/3
english sentences fast 0.001s bad: 1/3
spanish words 0.708s bad: 6/18
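so nothing got broken :) The quality numbers are identical, and the only difference is the spanish words benchmark time (0.692s vs 0.708s).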
Awesome, I've added a section to the README and a unit test.
Great, thanks for contributing!
Add an option to perform only replacements (one character substituted for another). Useful for cleaning OCR output, as most OCR mistakes are replacements.
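For readers unfamiliar with the idea, here is a minimal sketch of what "only perform replacements" could look like in a frequency-based spell checker. It is not the code from this PR: the function names `replacement_candidates` and `correct`, the `vocabulary` dict, and the lowercase ASCII alphabet are all hypothetical and chosen only for illustration.

```python
import string


def replacement_candidates(word, alphabet=string.ascii_lowercase):
    """Return every string that differs from `word` by exactly one substituted character.

    Hypothetical helper: no insertions, deletions, or transpositions are generated.
    """
    candidates = set()
    for i, original in enumerate(word):
        for letter in alphabet:
            if letter != original:
                candidates.add(word[:i] + letter + word[i + 1:])
    return candidates


def correct(word, vocabulary):
    """Return `word` if it is known, else the most frequent known word one replacement away.

    `vocabulary` is assumed to map words to frequency counts.
    """
    if word in vocabulary:
        return word
    known = [c for c in replacement_candidates(word) if c in vocabulary]
    if not known:
        return word  # nothing reachable by a single replacement, leave unchanged
    return max(known, key=lambda c: vocabulary[c])


# Toy vocabulary with made-up frequencies.
vocabulary = {"clean": 10, "clear": 5, "lean": 3}
print(correct("c1ean", vocabulary))  # OCR-style error: "1" read instead of "l"
print(correct("clea", vocabulary))   # would need an insertion, so it is left alone
```

Restricting the candidates this way keeps the search space small (one candidate set per position) and avoids "fixing" a word by adding or dropping characters, which is rarely what an OCR error needs.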