TinoDidriksen / spellers

Front-ends and packaging scripts for spellers. Git read-only mirror.
GNU General Public License v3.0

Duplicate suggestions when the input has initial upper case and a suggestion can be both a name and a regular noun #15

Open snomos opened 8 years ago

snomos commented 8 years ago

I have run a comparison of the voikko-based speller with the hfst-ospell-office speller. The result is visible here (first 7 diffs, available for a month):

https://www.diffnow.com/?report=5m6dx

As can be seen in the seventh diff, the same suggestion is given twice. The speller produces two suggestions that differ underlyingly only in the case of their first letter, so they start out as distinct entries. Because the input has initial upper case, both suggestions are then given initial upper case as well, which makes them identical in the final list.

There needs to be a check for uniqueness within the final suggestion list: if two identical suggestions are found, only the first/best one should be returned.
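
The requested check could look roughly like the sketch below. It is not taken from hfst-ospell-office or from the voikko-based speller; the function names `apply_initial_upper` and `unique_suggestions` are hypothetical, the example words are made up, and it assumes the suggestion list arrives ordered best-first so that keeping the first occurrence also keeps the best one.

```python
def apply_initial_upper(word: str) -> str:
    """Uppercase only the first character and leave the rest untouched."""
    return word[:1].upper() + word[1:]


def unique_suggestions(suggestions, input_word):
    """Recase suggestions to match the input, then drop later duplicates.

    Assumption: `suggestions` is ordered best-first, so the first
    occurrence of each surface form is the one worth keeping.
    """
    recase = input_word[:1].isupper()
    seen = set()
    result = []
    for suggestion in suggestions:
        surface = apply_initial_upper(suggestion) if recase else suggestion
        if surface not in seen:  # uniqueness is checked AFTER recasing
            seen.add(surface)
            result.append(surface)
    return result


# Hypothetical data: "Berg" (name) and "berg" (noun) collapse to one entry
# once the initial upper case of the input is applied to both.
print(unique_suggestions(["Berg", "berg", "borg"], "Bergg"))
# → ['Berg', 'Borg']
```

The important detail is that the comparison happens on the final, recased strings. Deduplicating before the case adjustment would not catch this bug, because the two suggestions are still different at that point.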