Open jaumeortola opened 4 years ago
Another example.
Should we implement this change, @udomai? Obviously, with sufficient testing.
Sounds good! Let's do it! We'll need a prohibited.txt to be able to exclude certain pairings if they are most likely to be a typo.
We'll need to keep an eye on how this impacts rules concerning hyphenated verb forms like "faisons-le", right?
> verb forms like "faisons-le",
My proposal is for English, not French.
In French it is already done this way: <token>Paris</token><token>-</token><token>London</token> and <token>faisons</token><token>-le</token>.
There are two methods to resolve this problem:
We don't provide good spelling suggestions for words joined by a hyphen (e.g. Paris-Lonton). This is related to word tokenization. In other languages, a hyphenated word that is not in the dictionary is split into separate tokens, and each word gets its own suggestions.
I'm not sure if a change in tokenization like this will affect other issues.
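To make the idea concrete, here is a rough sketch of per-part suggestions for a hyphenated word, combined with a prohibited list that rejects pairings that are most likely typos. This is not LanguageTool's code: the dictionary, the prohibited entries, and the function names `is_accepted` and `suggest` are all made up for illustration, and Python's `difflib` stands in for the real speller.

```python
import difflib
from itertools import product

# Toy data, assumed for illustration only.
DICTIONARY = {"Paris", "London", "Berlin", "well", "know", "known"}
# Hypothetical entries in the spirit of a prohibited.txt file:
# pairings whose parts are valid words but which are most likely typos.
PROHIBITED = {"well-know"}


def is_accepted(word):
    """Accept a word if it is known, or if every hyphen-separated part is known."""
    if word in PROHIBITED:
        return False
    if word in DICTIONARY:
        return True
    parts = word.split("-")
    return len(parts) > 1 and all(part in DICTIONARY for part in parts)


def suggest(word, max_per_part=3):
    """Suggest corrections by fixing each hyphen-separated part on its own."""
    candidates = []
    for part in word.split("-"):
        if part in DICTIONARY:
            candidates.append([part])  # correct parts are kept unchanged
        else:
            candidates.append(
                difflib.get_close_matches(part, sorted(DICTIONARY), n=max_per_part)
            )
    # Recombine the parts; prohibited pairings are never offered.
    combined = ("-".join(combo) for combo in product(*candidates))
    return [s for s in combined if s not in PROHIBITED]


print(suggest("Paris-Lonton"))
```

With this toy dictionary, only the misspelled part "Lonton" gets replaced, while "well-know" is rejected even though both of its parts are valid words. Whether splitting at the tokenizer level has side effects on other rules is not something this sketch can show.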