Closed by snomos 8 years ago
Adding more weight did not really change anything. Is there anything that can be done to the case handling algorithm to improve suggestions in these cases?
Given suggestion vs expected suggestion:
$ echo Kant-RV-irgi | hfst-lookup -q fstbased/analyser-fstspeller-gt-norm.hfst
Kant-RV-irgi Kant+N+Prop+Sem/Sur+Cmp-#RV+N+ACR+Cmp-#irgi+N+Sg+Nom 517,031250
Kant-RV-irgi Kant+N+Prop+Sem/Sur+Cmp-#RV+N+ACR+Cmp-#irgi+N+Sg+Nom 10517,031250
$ echo kánturvirgi | hfst-lookup -q fstbased/analyser-fstspeller-gt-norm.hfst
kánturvirgi kantuvra+N+Cmp#virgi+N+Sg+Nom 15,031221
Note the weight differences.
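To make the comparison concrete, here is a minimal Python sketch that parses the three output lines quoted in this thread and compares the best weight per input. It assumes whitespace-separated fields as pasted here and a comma as the decimal separator; the helper name `parse_lookup_line` is made up for illustration and is not part of HFST. Lower weights are preferred in the tropical-weight transducers HFST uses.

```python
def parse_lookup_line(line):
    """Split one lookup output line into (surface, analysis, weight)."""
    surface, analysis, weight = line.split()
    # The pasted output uses a decimal comma, e.g. "517,031250".
    return surface, analysis, float(weight.replace(",", "."))


# Lines copied from the output quoted in this thread.
lines = [
    "Kant-RV-irgi Kant+N+Prop+Sem/Sur+Cmp-#RV+N+ACR+Cmp-#irgi+N+Sg+Nom 517,031250",
    "Kant-RV-irgi Kant+N+Prop+Sem/Sur+Cmp-#RV+N+ACR+Cmp-#irgi+N+Sg+Nom 10517,031250",
    "kánturvirgi kantuvra+N+Cmp#virgi+N+Sg+Nom 15,031221",
]

# Keep the lowest (best) weight seen for each surface form.
best = {}
for surface, analysis, weight in map(parse_lookup_line, lines):
    if surface not in best or weight < best[surface]:
        best[surface] = weight

gap = best["Kant-RV-irgi"] - best["kánturvirgi"]
print(f"given: {best['Kant-RV-irgi']:.6f}")
print(f"expected: {best['kánturvirgi']:.6f}")
print(f"difference: {gap:.6f}")
```

This only compares the analyser weights shown above; the final suggestion ranking also depends on the error model and the case handling, which are not modelled here.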
Fixed in hfst-ospell r4554. It actually worked as expected if you just asked for suggestions right away, but the caching of non-suggestion lookups broke it.
Example (none of the suggestions are reasonable corrections):
Here is the same input with initial cap (second and third suggestions are reasonable corrections):
And the same input with all lowercase (all suggestions are reasonable corrections):
It might be that this can all be corrected in the fst by giving higher weights to certain types of compounds. For now, one of the uppercase-only suggestions is analysed as follows:
Giving higher weight to +ACR tags should help improve the suggestions. I'll try this first.
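As a rough illustration of the idea (not the actual change, which would be made in the fst sources), the sketch below adds a penalty for each `+ACR` tag in an analysis string. The penalty value `ACR_PENALTY` and the function `penalised_weight` are hypothetical and chosen only for this example.

```python
ACR_PENALTY = 1000.0  # hypothetical extra weight per +ACR tag


def penalised_weight(analysis, weight, penalty=ACR_PENALTY):
    """Return the weight plus a penalty for every ACR tag in the analysis."""
    acr_count = analysis.split("+").count("ACR")
    return weight + acr_count * penalty


# Analyses and weights taken from the lookup output quoted in this thread.
acronym_compound = "Kant+N+Prop+Sem/Sur+Cmp-#RV+N+ACR+Cmp-#irgi+N+Sg+Nom"
plain_compound = "kantuvra+N+Cmp#virgi+N+Sg+Nom"

acr_weight = penalised_weight(acronym_compound, 517.03125)
plain_weight = penalised_weight(plain_compound, 15.031221)

print(acr_weight)    # acronym compound is pushed further down
print(plain_weight)  # analysis without +ACR is unchanged
```

Since lower weights win, this pushes acronym compounds further down the list while leaving ordinary compounds untouched.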