tiff closed 5 years ago
Also see #725. Daniel improved the suggestion mechanism in summer 2017 (in response to my request back then), and this was a huge step forward. The gist of the solution was to take the suggestions for compounds from a static, finite but large list of words that "make sense" to humans. It worked out well. But if the most likely suggestion is not in the list ("Dampfschifffahrtskapitän" with three f's in this case), some algorithm is used that builds the compounds for suggestions on the fly, and it fails miserably most of the time. For words that are unknown but correct or almost correct, we often suggest utter nonsense. For example, I got the suggestions "Aluminiumwitwenkabel, Aluminiumkatzenkabel" for "Aluminiumlitzenkabel" today. These suggestions are of course valid compounds, but what do widows or cats have to do with aluminum cables? They are semantically weird.
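To make the "curated list first, generated compounds only as a fallback" idea concrete, here is a minimal sketch. It is not LanguageTool's actual code: the function names, the `max_distance` threshold, and the `fallback` callback are all assumptions made for illustration, and the real word list is far larger.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def suggest(misspelled, curated_compounds, fallback, max_distance=2):
    """Prefer close matches from a curated compound list (hypothetical helper).

    Only if the curated list has no close match do we call `fallback`,
    which stands in for the on-the-fly compound builder.
    """
    scored = [(edit_distance(misspelled, w), w) for w in curated_compounds]
    hits = [w for d, w in sorted(scored) if d <= max_distance]
    return hits if hits else fallback(misspelled)


curated = ["Dampfschifffahrtskapitän", "Aluminiumlitzenkabel"]
print(suggest("Dampfschiffahrtskapitän", curated, lambda w: []))
```

With this shape, the quality of suggestions depends almost entirely on what is in the curated list, which matches the behavior described above: words in the list get sensible suggestions, and everything else falls through to the weaker generator.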
I did some analysis: `Dampf, schiff, ahrts, kapitän` is one of several splits, but `ahrts` doesn't get the suggestion `fahrts` (in `CompoundAwareHunspellRule#getCandidates()`), as we use the standard suggestion algorithm, and that's not prepared to work well with in-compound words with the infix-s. Maybe we can find a hack to improve this.
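One possible way to handle the infix-s, sketched here under stated assumptions: strip the linking "s" from the part, look for known words that are one inserted letter away from the remaining stem, and re-attach the "s". This is not the fix that was actually applied; `suggest_for_compound_part`, `is_one_insertion_away`, and the tiny `known_words` set are hypothetical names and data used only to illustrate the idea.

```python
def is_one_insertion_away(short, long):
    """True if deleting exactly one character from `long` yields `short`."""
    if len(long) != len(short) + 1:
        return False
    return any(long[:i] + long[i + 1:] == short for i in range(len(long)))


def suggest_for_compound_part(part, known_words, infix="s"):
    """Suggest corrections for one part of a compound split (hypothetical helper).

    If the part ends in the linking element (e.g. "ahrts"), the element is
    removed before matching against the dictionary and added back afterwards.
    """
    stem, suffix = part, ""
    if part.endswith(infix) and len(part) > len(infix):
        stem, suffix = part[:-len(infix)], infix
    return [word + suffix
            for word in sorted(known_words)
            if is_one_insertion_away(stem, word)]


known_words = {"dampf", "schiff", "fahrt", "kapitän"}
parts = ["Dampf", "schiff", "ahrts", "kapitän"]
fixed = suggest_for_compound_part(parts[2], known_words)
print(fixed)
print("".join(parts[:2] + fixed[:1] + parts[3:]))
```

The sketch only covers a missing letter and a single linking element; a real implementation would also have to deal with other edits, other linking elements, and parts that legitimately end in "s".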
Fixed with a rather specific hack (but not just hard-coding this word).
Not really common but I just wanted to show the capabilities of LanguageTool to someone and demoed it by using this word...
Correct word would be "Dampfschifffahrtskapitän"