divvun / libdivvun

lib for running gramcheck and other pipelines + cli; modules for CG→spelling, CG→feedback, tagging blanks
https://giellalt.github.io/proof/gramcheck/GrammarCheckerDocumentation.html
GNU General Public License v3.0

Autocorrect mode for divvun-checker #37

Open snomos opened 4 years ago

snomos commented 4 years ago

For automatised tests where we only care about whether we got the expected output, it would be nice to have an autocorrect mode. That is, the tool should take the input text, find possible errors, apply the best/first suggestion in each case, and print out the corrected text. The output should be identical to the input text except for the corrections made.

One could consider a sub-option for choosing between single-pass and multipass correction, i.e. whether to keep looking for errors after the first pass until no more errors are found. The default should be single-pass.

unhammer commented 4 years ago

git fetch
git checkout 37-autocorrect
make -j4
src/divvun-suggest --autocorrect test/suggest/generator.hfstol test/suggest/errors.xml < test/suggest/input.badjel.cg

gives sáddejuvvot báhpirat interneahta bokte.

(The JSON output was {"errs":[["badjel",33,39,"lex-bokte-not-badjel","boasttut sátni",["bokte"],"\"bokte\" iige \"badjel\""]],"text":"sáddejuvvot báhpirat interneahta badjel.\n"}.) Something like that?
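
For illustration, the step from the JSON above to the autocorrected line can be sketched in Python. This is only a sketch, not how the `37-autocorrect` branch does it: the function name `apply_first_suggestions` is made up, and it assumes that each entry in `errs` has the shape [form, start, end, type, title, suggestions, description] and that start/end are character offsets into `text` with an exclusive end, which is what the example above suggests (`badjel` occupies positions 33 to 39).

```python
import json


def apply_first_suggestions(checker_json: str) -> str:
    """Return the text with the first suggestion applied for each error.

    Assumed (not confirmed) layout of each entry in "errs":
    [form, start, end, type, title, suggestions, description],
    with start/end as character offsets into "text", end exclusive.
    """
    result = json.loads(checker_json)
    text = result["text"]
    # Work from right to left so that offsets of earlier errors stay valid
    # after a replacement changes the length of the text.
    limit = len(text)
    for err in sorted(result["errs"], key=lambda e: e[1], reverse=True):
        start, end, suggestions = err[1], err[2], err[5]
        # Skip errors without suggestions, and errors that overlap a span
        # that has already been replaced.
        if not suggestions or end > limit:
            continue
        text = text[:start] + suggestions[0] + text[end:]
        limit = start
    return text


# The JSON quoted in the comment above, as a raw string.
example = r'{"errs":[["badjel",33,39,"lex-bokte-not-badjel","boasttut sátni",["bokte"],"\"bokte\" iige \"badjel\""]],"text":"sáddejuvvot báhpirat interneahta badjel.\n"}'

print(apply_first_suggestions(example), end="")
```

Everything outside the replaced spans is left untouched, so the output differs from the input only in the corrections made.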

snomos commented 4 years ago

Yes, looks good.

unhammer commented 2 years ago

Can this one be closed, or should we do a multipass / "fixed-point" option too?

snomos commented 2 years ago

It would be nice to have a multipass option too, so that the output would be "the most correct text according to the grammar checker". It should be easy to implement: just loop over the input until there are no more changes.
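
A minimal Python sketch of such a loop, assuming a single-pass corrector is available as a function from text to text. The names `correct_until_stable`, `correct_once` and `max_passes` are hypothetical and not part of libdivvun; the pass limit is an extra safeguard in case two rules keep undoing each other's corrections.

```python
from typing import Callable


def correct_until_stable(
    text: str,
    correct_once: Callable[[str], str],
    max_passes: int = 10,
) -> str:
    """Rerun a single-pass corrector until the text stops changing.

    correct_once is a stand-in for one autocorrect pass of the checker.
    max_passes bounds the loop so that oscillating corrections cannot
    make it run forever.
    """
    for _ in range(max_passes):
        corrected = correct_once(text)
        if corrected == text:
            # Fixed point reached: the last pass made no changes.
            break
        text = corrected
    return text


def toy_pass(text: str) -> str:
    """Toy single-pass corrector: fixes only the first "teh" it finds."""
    return text.replace("teh", "the", 1)


print(correct_until_stable("teh cat saw teh dog", toy_pass))
```

With the toy corrector, each pass fixes one error, so the loop needs three passes here: two that change the text and one that confirms nothing is left to change.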