keymanapp / keyman

Keyman cross platform input methods system running on Android, iOS, Linux, macOS, Windows and mobile and desktop web
https://keyman.com/

bug(common/models): corrections partially broken #4743

Closed by jahorton 3 years ago

jahorton commented 3 years ago

Describe the bug: A spot of aliasing with "keep" suggestions is causing unexpected side effects within the context-tracking part of the lm-layer. The result is, unfortunately, quite nasty; certain correction types (like transpositions!) are completely broken. (Rule of thumb: for a correction to survive, the 'graph edge' leading to it must have been computed in a previous pass, before the aliasing was able to affect the results.)

I've traced the issue to these two blocks:

https://github.com/keymanapp/keyman/blob/6c9fe2c9fe3cedfc9917c59867a4e83a8d300c88/common/predictive-text/worker/model-compositor.ts#L342-L346

In the block above, one of the suggestions was actually using the same Transform instance as the keystroke that triggered the prediction. Modifying that instance thus modified the keystroke's Transform... which naturally led to nasty side-effects.

https://github.com/keymanapp/keyman/blob/6c9fe2c9fe3cedfc9917c59867a4e83a8d300c88/common/predictive-text/worker/model-compositor.ts#L292-L300

The block above is the culprit.
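
For readers who want to see the aliasing pattern in isolation, here is a minimal sketch. It is not the code from `model-compositor.ts`: the `Transform` and `Suggestion` shapes are simplified stand-ins, and `makeKeepSuggestionAliased`, `makeKeepSuggestionCopied`, and `appendWordbreak` are hypothetical names invented for illustration. It only shows how reusing one object for both the keystroke and a suggestion lets a later in-place edit leak back into the keystroke, and how a shallow copy avoids that.

```typescript
// Simplified stand-in for a Transform; the field set is an assumption for illustration.
interface Transform {
  insert: string;
  deleteLeft: number;
}

// Simplified stand-in for a suggestion built from a keystroke.
interface Suggestion {
  transform: Transform;
  displayAs: string;
}

// Buggy pattern (hypothetical): the "keep" suggestion reuses the keystroke's own object.
function makeKeepSuggestionAliased(keystroke: Transform): Suggestion {
  return { transform: keystroke, displayAs: keystroke.insert };
}

// Safer pattern (hypothetical): copy the Transform so later edits stay local.
function makeKeepSuggestionCopied(keystroke: Transform): Suggestion {
  return { transform: { ...keystroke }, displayAs: keystroke.insert };
}

// Hypothetical post-processing step that edits a suggestion's transform in place.
function appendWordbreak(suggestion: Suggestion): void {
  suggestion.transform.insert += " ";
}

const aliasedKeystroke: Transform = { insert: "o", deleteLeft: 0 };
appendWordbreak(makeKeepSuggestionAliased(aliasedKeystroke));
// The keystroke itself was changed, because both names refer to one object.

const copiedKeystroke: Transform = { insert: "o", deleteLeft: 0 };
appendWordbreak(makeKeepSuggestionCopied(copiedKeystroke));
// The keystroke is untouched, because the suggestion owns its own copy.
```

A shallow copy is enough here only because every field in this stand-in type is a primitive; a type with nested objects would need a deeper copy.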

To Reproduce: Seriously... just type a keystroke whenever a lexical model is active. Breakpoints near the beginning and at the end of the method will provide sufficient comparison.

mcdurdin commented 3 years ago

> To Reproduce Seriously... just type a keystroke whenever a lexical model is active. A breakpoint near the beginning of the method and at the end of the method will provide sufficient comparison.

Could you give an example with a specific model, expected and actual results please? This is a little too abstract right now.

jahorton commented 3 years ago

One quick and easy one: with English active, type "bxo".

An obvious expected correction: "box".

Actual results: sometimes "BDO" shows up; no other suggestions whatsoever appear.

I designed the thing to do transpositions (reordering the 'x' and 'o'), so that's really concerning to me. I could understand if the transposition were overridden by higher probability suggestions, but not... just not appearing.

jahorton commented 3 years ago

After further investigation, I can see that a few effects are contributing to this:

  1. Right now, the graph's edges only construct search paths that have at least one lexical entry with a matching prefix. 'bx' isn't exactly a viable word-starting letter combination in English.
  2. That said... the search should at least be attempting 'b~x~o' (deleting the 'x'), as that should be relatively low-cost to perform. It's often not attempting this... due to overly zealous thresholding, it seems. This one should be easily addressable at this time.
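
To make the two edit types in this comment concrete, here is a toy sketch of single-edit correction over a tiny word list. It is not the lm-layer's search: the lexicon, the uniform cost of 1 per edit, and the names `singleEditCandidates` and `corrections` are all assumptions made for illustration. It enumerates adjacent transpositions and single deletions of the input, then keeps only candidates that are in the lexicon and within a cost threshold.

```typescript
// Toy lexicon (assumption); a real lexical model is far larger and weighted.
const lexicon = new Set<string>(["box", "bo", "boy"]);

type EditKind = "transpose" | "delete";

interface Candidate {
  text: string;
  kind: EditKind;
  cost: number; // uniform cost of 1 per edit, chosen only for illustration
}

// Enumerate every string reachable from `input` by one deletion
// or one swap of two adjacent, differing characters.
function singleEditCandidates(input: string): Candidate[] {
  const out: Candidate[] = [];
  for (let i = 0; i < input.length; i++) {
    const before = input.slice(0, i);
    // Delete the character at position i.
    out.push({ text: before + input.slice(i + 1), kind: "delete", cost: 1 });
    // Swap the characters at positions i and i + 1.
    if (i + 1 < input.length && input.charAt(i) !== input.charAt(i + 1)) {
      const swapped = before + input.charAt(i + 1) + input.charAt(i) + input.slice(i + 2);
      out.push({ text: swapped, kind: "transpose", cost: 1 });
    }
  }
  return out;
}

// Keep only candidates that are real words and whose cost is within the threshold.
function corrections(input: string, words: Set<string>, maxCost: number): Candidate[] {
  return singleEditCandidates(input).filter(
    (c) => c.cost <= maxCost && words.has(c.text)
  );
}

// With a threshold that admits one edit, both the deletion and the transposition survive.
const found = corrections("bxo", lexicon, 1);

// With a threshold below the cost of a single edit, nothing survives.
const overThresholded = corrections("bxo", lexicon, 0.5);
```

In this sketch, "bxo" yields "bo" by deleting the 'x' and "box" by swapping 'x' and 'o'. Dropping the threshold below the cost of one edit removes both, which is the same kind of symptom as the "overly zealous thresholding" described above.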