Open user202729 opened 3 years ago
Perhaps this issue isn't very common, because briefs use uncommon sounds and Plover has automatic suffix folding. However, it is a problem in some cases (member -> memes, thinks -> this, bringing -> brig, minute -> mince).
Most of the time, orthographic rules can handle the suffix (hand/*L, continue/ous), but sometimes they can't (rid/le, second/ry, element/ly).
At other times, the result conflicts with another word (help → helper, hepper).
For some reason, neither "briefed" nor "bereaved" is in the frequency list. Sorting by stem frequency as a secondary criterion works. (implemented)
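A minimal sketch of that tie-breaking idea, for illustration only. The function name `rank_candidates`, the `stem_of` mapping and the frequency numbers are all hypothetical and are not taken from the actual implementation; the point is only that a word missing from the frequency list falls back to the frequency of its stem.

```python
def rank_candidates(candidates, frequency, stem_of):
    """Order candidate words for one outline, most preferred first.

    Primary key: the word's own frequency (0 if it is not in the list).
    Secondary key: the frequency of its stem (hypothetical `stem_of` map).
    """
    def sort_key(word):
        stem = stem_of.get(word, word)
        return (frequency.get(word, 0), frequency.get(stem, 0))

    return sorted(candidates, key=sort_key, reverse=True)


# Made-up numbers: neither inflected form is in the frequency list,
# so the stems decide the order.
frequency = {"brief": 500, "bereave": 20}
stem_of = {"briefed": "brief", "bereaved": "bereave"}

print(rank_candidates(["bereaved", "briefed"], frequency, stem_of))
# ['briefed', 'bereaved']
```

Using a tuple as the sort key keeps the word's own frequency dominant, so the stem frequency only matters when the primary values tie.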
Extension: fix/fiction, fixes/fictions.
Or sometimes the rule (/i/ - y -> EU) is applied, but (/i/ - i -> AOE) is used for the plural form, which is inconsistent (gypsy, gypsies).
[ie] -> [EU] is not good in cases like griff/grief.
Some words are in the lemmatization file but not in the dictionary file (griefs), but they can simply be added to the dictionary.
With the new disambiguation feature, it may become worse. (WRAOEUT -> wright, WRAOEUGT -> wrighting) (currently, out-of-order suffixes are only supported by Plover's combining suffix keys)
Or KAR -> car, KAR/AES -> carr's. (with KARZ being car's)
Similarly, with the new full-briefs (completely-compatible) dictionary added, currently *EPLT maps to "element", but *EPLTS maps to "empts". (fixed)
Plover's current behavior prefers non-suffix matches to suffix matches, and within each of those prefers the longest match.
This means that if (A, B, A/B) are in the dictionary, then A/B-S translates to (A/B) + -S; however, if B-S is also present, then it translates to A + (B-S).
This behavior is probably not very desirable. Besides, the current n-gram handler should be able to process such cases.
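The lookup order described above can be illustrated with a toy model. This is a sketch of the rule as stated in this issue, not Plover's actual code: the function `translate_tail`, the tuple-keyed dictionary and the `"B-S"` stroke notation (base stroke plus a trailing `-S`) are assumptions made for the example.

```python
def translate_tail(strokes, dictionary, suffix_keys=("-S",)):
    """Return (strokes_consumed, text) for the end of `strokes`.

    Pass 1 tries exact matches, longest first.
    Pass 2 strips a folded suffix key from the last stroke and
    retries, again longest first.
    """
    # Pass 1: non-suffix lookups.
    for n in range(len(strokes), 0, -1):
        tail = tuple(strokes[-n:])
        if tail in dictionary:
            return n, dictionary[tail]
    # Pass 2: suffix lookups.
    for n in range(len(strokes), 0, -1):
        tail = tuple(strokes[-n:])
        for key in suffix_keys:
            if tail[-1].endswith(key):
                base = tail[:-1] + (tail[-1][: -len(key)],)
                if base in dictionary:
                    return n, dictionary[base] + " + " + key
    return 0, None


entries = {("A",): "a", ("B",): "b", ("A", "B"): "ab"}
print(translate_tail(("A", "B-S"), entries))
# (2, 'ab + -S')   i.e. (A/B) + -S

entries_with_bs = dict(entries)
entries_with_bs[("B-S",)] = "bs"
print(translate_tail(("A", "B-S"), entries_with_bs))
# (1, 'bs')        i.e. A is left alone and B-S matches on its own
```

Adding a single entry for B-S changes how the preceding stroke is grouped, which is the inconsistency noted above.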
While separate suffix strokes are not a problem, if BREEF is "brief" then BREEFD should be "briefed".
Fortunately, there are only 3~4 combining suffixes (4 if -L -> ly is counted).
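One way to find such cases is to fold each combining suffix key into every outline and check whether the folded outline is already taken by an unrelated word. The sketch below is illustrative only: the key set `"SDG"`, the helper names, the `lemma_of` mapping and the KAT entries are assumptions, and the stroke parsing is simplified (it expects strokes written in steno order with valid right-hand keys).

```python
RIGHT_ORDER = "FRPBLGTSDZ"   # right-hand keys in steno order
CENTER = "AO*EU-"            # vowels, star, or explicit hyphen
SUFFIX_KEYS = "SDG"          # assumed set of combining suffix keys


def fold_key(stroke, key):
    """Add right-hand `key` to `stroke` in steno order.

    Returns None if the stroke already contains that right-hand key.
    """
    idx = max(stroke.rfind(c) for c in CENTER)
    if idx == -1:
        # Only left-hand keys: a hyphen is needed before the right side.
        left, right = stroke + "-", ""
    else:
        left, right = stroke[: idx + 1], stroke[idx + 1:]
    if key in right:
        return None
    return left + "".join(sorted(right + key, key=RIGHT_ORDER.index))


def find_fold_conflicts(dictionary, lemma_of):
    """List (outline, word, folded_outline, other_word) collisions."""
    conflicts = []
    for outline, word in dictionary.items():
        *head, last = outline.split("/")
        for key in SUFFIX_KEYS:
            folded_last = fold_key(last, key)
            if folded_last is None:
                continue
            folded = "/".join(head + [folded_last])
            other = dictionary.get(folded)
            # Not a conflict if the folded entry is an inflection of `word`.
            if other is not None and lemma_of.get(other, other) != word:
                conflicts.append((outline, word, folded, other))
    return conflicts


sample = {"*EPLT": "element", "*EPLTS": "empts", "KAT": "cat", "KATS": "cats"}
lemmas = {"cats": "cat"}

print(fold_key("WRAOEUT", "G"))
# WRAOEUGT
print(find_fold_conflicts(sample, lemmas))
# [('*EPLT', 'element', '*EPLTS', 'empts')]
```

Because there are so few combining suffix keys, this check is a small constant number of lookups per dictionary entry.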