Closed krasi0 closed 4 years ago
You could do this, but then you have to make sure the dictionary contains ALL valid compound words, in both first-second and first_second form. Otherwise, any compound word that you check or correct but that is missing from the dictionary will not be recognized as a correct word, and you will get strange correction suggestions.
In the current code, the idea is to always split compounds, both in the dictionary and in the input terms, and to spell-correct the parts separately.
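To make the split-then-correct idea concrete, here is a minimal Python sketch. It is not SymSpell's code: the toy `DICTIONARY`, the helper names, and the use of `difflib.get_close_matches` as a stand-in for SymSpell's lookup are all assumptions made for illustration. The pattern is an assumed Python approximation of a tokenizer that treats `-` and `_` as separators.

```python
import difflib
import re

# Hypothetical toy dictionary that stores only the parts of compounds.
DICTIONARY = ["decision", "making", "in", "vitro"]


def split_parts(text):
    # Word characters except "_", plus apostrophes.
    # "-" and "_" are not matched, so they act as separators.
    return re.findall(r"(?:[^\W_]|['’])+", text.lower())


def correct_part(part):
    # Known parts pass through unchanged.
    if part in DICTIONARY:
        return part
    # Stand-in for a real spelling lookup: closest dictionary entry, if any.
    matches = difflib.get_close_matches(part, DICTIONARY, n=1, cutoff=0.8)
    return matches[0] if matches else part


def correct_text(text):
    # Split compounds first, then correct each part on its own.
    return [correct_part(p) for p in split_parts(text)]
```

With this approach a misspelled compound such as `decison-makng` can still be corrected part by part, even though the compound itself never appears in the dictionary.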
Shouldn't the regex at https://github.com/wolfgarbe/SymSpell/blob/13bdc134573a14cf05bc06cb7817a7ce7b9a9af4/SymSpell/SymSpell.cs#L727 be changed from @"['’\w-[]]+" to @"['’\w-]+" so that compound words (which have been added to the dictionary) like decision-making or in_vitro stay together? Are there any drawbacks to that change?
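The effect of the change can be tried out with a rough Python sketch. Python's `re` has no .NET-style character class subtraction, so `CURRENT` below is only an assumed equivalent of the existing pattern (word characters other than `_`, plus apostrophes), which is what the question implies. The helper names and the sample dictionary are hypothetical.

```python
import re

# Assumed Python equivalent of the current pattern:
# word characters without "_", plus apostrophes.
CURRENT = re.compile(r"(?:[^\W_]|['’])+")

# The proposed pattern: "-" and "_" become part of a token.
PROPOSED = re.compile(r"['’\w-]+")


def tokenize(pattern, text):
    # Lowercase first, then collect every match of the pattern.
    return pattern.findall(text.lower())


def unknown_tokens(tokens, dictionary):
    # Tokens that a dictionary-based checker would treat as misspelled.
    return [t for t in tokens if t not in dictionary]


# Hypothetical dictionary: "decision-making" was added, "in_vitro" was not.
sample_dictionary = {"decision", "making", "in", "vitro", "decision-making"}
text = "Decision-making in_vitro"

split = tokenize(CURRENT, text)    # ['decision', 'making', 'in', 'vitro']
joined = tokenize(PROPOSED, text)  # ['decision-making', 'in_vitro']
missing = unknown_tokens(joined, sample_dictionary)  # ['in_vitro']
```

Two drawbacks show up in this sketch. Any compound that is kept together but missing from the dictionary (here `in_vitro`) is flagged even though both parts are valid. The proposed pattern also matches a free-standing hyphen, so `tokenize(PROPOSED, "a - b")` yields a token that consists of `-` alone.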