filyp / autocorrect

Spelling corrector in Python
GNU Lesser General Public License v3.0

add only_replacements #21

Closed: Jerry2001Qu closed this 3 years ago

Jerry2001Qu commented 3 years ago

Add an option to perform only replacements (single-character substitutions). This is useful for cleaning OCR output, since most OCR mistakes are replacements.
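
To illustrate the idea, here is a minimal, self-contained sketch of a replacements-only corrector. It is not the code from this PR: the function names `replacement_edits` and `correct_replacements_only`, the toy vocabulary, and the lowercase ASCII alphabet are all assumptions made for the example. It only shows why restricting candidates to substitutions suits OCR-style errors, where a character is misread but none are dropped or inserted.

```python
import string


def replacement_edits(word, alphabet=string.ascii_lowercase):
    """Return every string that differs from `word` by exactly one substituted character."""
    return {
        word[:i] + c + word[i + 1:]
        for i in range(len(word))
        for c in alphabet
        if c != word[i]
    }


def correct_replacements_only(word, vocabulary):
    """Hypothetical helper: pick the most frequent known word one substitution away.

    `vocabulary` maps known words to frequencies. Insertions, deletions and
    transpositions are deliberately not considered.
    """
    if word in vocabulary:
        return word
    candidates = [c for c in replacement_edits(word) if c in vocabulary]
    if not candidates:
        return word  # nothing within one substitution; leave the word untouched
    return max(candidates, key=vocabulary.get)


# Toy vocabulary, assumed for illustration only.
vocabulary = {"clean": 10, "clear": 5, "lean": 3}

print(correct_replacements_only("c1ean", vocabulary))  # OCR misread 'l' as '1'
print(correct_replacements_only("clea", vocabulary))   # would need an insertion, so unchanged
```

Because the candidate set contains only same-length strings, it is much smaller than a full edit-distance-1 set, and a word that would need an insertion or deletion is left as-is rather than "corrected" to something unrelated.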

filyp commented 3 years ago

Code looks great, good job! Please also add a short unit test in test_all.py and a short section in the README, maybe titled # OCR, to document the feature.

filyp commented 3 years ago

Benchmarks

Output of: python test_all.py quality ; python test_all.py benchmark

before changes:

quality:
en  bad: 115/400
pl  bad: 23/40
tr  bad: 10/27
ru  bad: 1/10
uk  bad: 6/12
es  bad: 6/18
pt  bad: 7/13
cs  bad: 44/170

benchmarks:
english sentences        0.179s    bad: 0/3
english sentences fast   0.001s    bad: 1/3
spanish words            0.692s    bad: 6/18

after changes:

quality:
en  bad: 115/400
pl  bad: 23/40
tr  bad: 10/27
ru  bad: 1/10
uk  bad: 6/12
es  bad: 6/18
pt  bad: 7/13
cs  bad: 44/170

benchmarks:
english sentences        0.179s    bad: 0/3
english sentences fast   0.001s    bad: 1/3
spanish words            0.708s    bad: 6/18

Quality scores are identical and timings are essentially unchanged, so nothing got broken :)
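
As a quick sanity check on the timings above, the relative change per benchmark can be computed as follows. This is only a sketch: the helper name `relative_change` is made up for the example, and the numbers are copied from the before/after output in this comment. The only benchmark that moves is "spanish words", by roughly 2.3%, which could plausibly be run-to-run noise given that each timing comes from a single run.

```python
def relative_change(before, after):
    """Return the fractional change from `before` to `after` (0.05 means 5% slower)."""
    return (after - before) / before


# (before, after) timings in seconds, copied from the benchmark output above.
timings = {
    "english sentences": (0.179, 0.179),
    "english sentences fast": (0.001, 0.001),
    "spanish words": (0.692, 0.708),
}

for name, (before, after) in timings.items():
    print(f"{name}: {relative_change(before, after):+.1%}")
```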

Jerry2001Qu commented 3 years ago

Awesome, I've added a section to the README and a unit test.

filyp commented 3 years ago

Great, thanks for contributing!