giellalt / giella-core

Build tools and build support files as well as developer support tools for the GiellaLT repositories.
https://giellalt.uit.no
GNU General Public License v3.0

Speller error model built from typos list #32

Open snomos opened 1 year ago

snomos commented 1 year ago

Just an idea:

Given a large typos list, one could imagine building an error model from it, with a simple Levenshtein edit-distance-1 model on top.
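A rough sketch of one reading of this idea, in Python rather than as the weighted transducers a real speller would use. It assumes the typos list is plain text with one `typo<TAB>correction` pair per line and `#` comment lines, and that a set of valid word forms (`lexicon`) is available; the names `parse_typos`, `edits1` and `suggest` are made up for illustration, and the default alphabet is ASCII only.

```python
from string import ascii_lowercase


def parse_typos(lines):
    """Map each typo to its set of corrections.

    Assumes 'typo<TAB>correction' lines; blank lines, '#' comments and
    lines without a tab are skipped.
    """
    model = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "\t" not in line:
            continue
        typo, correction = line.split("\t", 1)
        model.setdefault(typo, set()).add(correction)
    return model


def edits1(word, alphabet=ascii_lowercase):
    """All strings at Levenshtein distance exactly 1 from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    inserts = {a + c + b for a, b in splits for c in alphabet}
    replaces = {a + c + b[1:] for a, b in splits if b for c in alphabet}
    return (deletes | inserts | replaces) - {word}


def suggest(word, typo_model, lexicon):
    """Known typos first, then fall back to edit distance 1."""
    if word in lexicon:
        return [word]
    listed = typo_model.get(word)
    if listed:
        return sorted(listed)
    return sorted(edits1(word) & lexicon)
```

For example, with `model = parse_typos(["teh\tthe"])` and `lexicon = {"the", "cat"}`, `suggest("teh", model, lexicon)` is answered from the typos list, while `suggest("cxt", model, lexicon)` falls through to the edit-distance step. Whether "on top" should mean a fallback like this or a composition of the two models is left open here.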

Needs to be tested for:

If we find that it works well given a typos list of X entries, we could build the error model automatically whenever the typos file has at least X entries.
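The size check itself could be as small as the sketch below. It assumes the same `typo<TAB>correction` line format; `count_entries`, `should_build_error_model` and the `min_entries` parameter (standing in for X, which is still to be determined) are hypothetical names, not part of the existing build system.

```python
def count_entries(lines):
    """Count usable typo entries: non-comment lines containing a tab."""
    return sum(
        1
        for line in lines
        if line.strip() and not line.lstrip().startswith("#") and "\t" in line
    )


def should_build_error_model(lines, min_entries):
    """True if the typos list is large enough (min_entries plays the role of X)."""
    return count_entries(lines) >= min_entries
```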

Main benefit: since we already collect typos, it would be an easy way to build an error model that would correct most typos without us having to do any work.