Open andreahorbach opened 6 years ago
Do we really do that in Escrito itself? We might first need a stand-alone spelling tool that can be parametrized in those ways, which is then used here.
We have normalization code that uses the Escrito readers, corrects spelling mistakes, and writes the output, which is then used as new input in a core Escrito process. So normalization is not a direct part of the Escrito pipeline, but Escrito provides ways of spellchecking the data.
OK, my suggestion would then be to allow Escrito to use a spell checker and keep most of the parametrization in the spell checker itself. Escrito only needs to decide where the spell checker is applied.
so that one can choose how the best replacement candidate for a misspelling is selected
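To make the suggestion concrete, here is a rough sketch of how that split could look. It is written in Python purely for illustration and is not Escrito code: `SpellChecker`, `SelectionStrategy`, `closest_by_edit_distance`, `most_frequent` and `normalize_answers` are all hypothetical names, and the dictionary lookup is a naive stand-in for a real spelling tool. The idea is that the checker owns the parameters (dictionary, maximum distance, how the best candidate is picked), while the caller only decides where the checker is applied.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Sequence

# Hypothetical type: a strategy picks one replacement from the candidates.
SelectionStrategy = Callable[[str, Sequence[str]], str]


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def closest_by_edit_distance(word: str, candidates: Sequence[str]) -> str:
    """Pick the candidate with the smallest distance; ties break alphabetically."""
    return min(candidates, key=lambda c: (edit_distance(word, c), c))


def most_frequent(frequencies: Dict[str, int]) -> SelectionStrategy:
    """Build a strategy that prefers the candidate with the highest frequency."""
    def select(word: str, candidates: Sequence[str]) -> str:
        return max(candidates, key=lambda c: frequencies.get(c, 0))
    return select


@dataclass
class SpellChecker:
    """Assumed stand-alone checker: all parametrization lives here."""
    dictionary: FrozenSet[str]
    select: SelectionStrategy
    max_distance: int = 2

    def correct(self, word: str) -> str:
        lower = word.lower()
        if lower in self.dictionary:
            return word  # known word, leave untouched
        candidates = sorted(
            w for w in self.dictionary
            if edit_distance(lower, w) <= self.max_distance
        )
        if not candidates:
            return word  # nothing close enough, leave untouched
        return self.select(lower, candidates)  # replacement is lowercase


def normalize_answers(answers: Sequence[str], checker: SpellChecker) -> List[str]:
    """Caller side: only decides where the checker runs (here: every token)."""
    return [
        " ".join(checker.correct(token) for token in answer.split())
        for answer in answers
    ]
```

With this shape, swapping the candidate selection means passing a different `select` function to the checker, and the normalization step that calls it does not change.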