buda-base / autocomplete-prototype

prototype for an autocomplete service for BDRC
MIT License

Adjust Levenshtein distance according to string length #3

Open roopeux opened 4 months ago

roopeux commented 4 months ago

The matches are too fuzzy with short strings.

For example, the suggestions for the random string "chod " are

chod nyid nam mkha'i klong mdzod las/_bka' srung gnod sbyin btsan rgod
chen po'i bskang gsol 'dod don lhun grub/

The second suggestion should not appear, because it is at edit distance 2.
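As a rough check of that distance, here is a minimal sketch of the standard dynamic-programming Levenshtein distance. It assumes the query is compared against the equally long prefix of each suggestion; that prefix comparison and the function name `levenshtein` are illustrative assumptions, not taken from the prototype's code.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the Levenshtein (edit) distance between a and b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


query = "chod "
# Assumption: compare the query with the prefix of the same length.
first = "chod nyid nam mkha'i klong mdzod las/"[:len(query)]
second = "chen po'i bskang gsol 'dod don lhun grub/"[:len(query)]
print(levenshtein(query, first))   # 0
print(levenshtein(query, second))  # 2
```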

I think the edit distances should be something like this, but this has to be tested carefully when the system is more mature:

Query length 0-2 -> edit distance 0
Query length 3-8 -> edit distance 1
Query length 9+ -> edit distance 2
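The length-based thresholds proposed above could be sketched as a small helper like the one below. The function name `max_edit_distance` is hypothetical, and counting the raw query length (including any trailing space) is an assumption; the cut-off values are the ones suggested in this comment and still need testing.

```python
def max_edit_distance(query: str) -> int:
    """Return the maximum edit distance to allow for a query.

    Thresholds follow the proposal in this issue:
    length 0-2 -> 0, length 3-8 -> 1, length 9+ -> 2.
    """
    # Assumption: the raw length is used, trailing whitespace included.
    length = len(query)
    if length <= 2:
        return 0
    if length <= 8:
        return 1
    return 2


print(max_edit_distance("chod "))  # 1
```

With these thresholds, the five-character query "chod " would allow at most one edit, so a suggestion at edit distance 2 would be filtered out.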

eroux commented 3 months ago

The problem here is the HTML file. If you look at the query that the server receives when you type "chod" with some sort of spellchecking going on, the HTML page sends "cho" to the server, hence the kind of results you're getting. It should be better with Nicolas' latest commit.