keymanapp / keyman

Keyman cross platform input methods system running on Android, iOS, Linux, macOS, Windows and mobile and desktop web
https://keyman.com/
Other
394 stars 110 forks source link

bug(web): flicks occasionally become fully disabled #11650

Closed jahorton closed 4 months ago

jahorton commented 4 months ago

Describe the bug

While I haven't yet noted any consistent patterns that would trigger the issue, during use of the system keyboard on iOS on my personal device, the keyboard has occasionally gone completely unresponsive to flicks.

When this triggers, multitap on shift (for caps) still works fine, as does longpress. Even modipress is fine - just not flicks. Any and all flicks, however, are completely ignored until the keyboard is swapped or reselected from the globe-key menu. (Just re-selecting the active keyboard is enough.) Longpress up-flick shortcutting does not appear to trigger in this state either - only the timer component seems operable then. Layer-changing is not enough to trigger a reset - the keyboard itself must be reset. (Layer changing itself still works, to be clear.)

After checking within Sentry for potentially-related error reports near the time of the observed error, nothing even closely related could be found.

Reproduce the bug

No currently-known repro exists. It generally happens when I start using flicks frequently in succession, I think? But it triggers sporadically and randomly for me, and I haven't been able to narrow down a cause yet. I usuallly reset, then re-attempt whatever I was doing whenever it triggers, but doing so has never retriggered the issue for me.

If and when #11229 goes through, I should be able to use webview-inspection to get more insight.

Also of note: I don't recall ever having recently changed layers whenever it triggered. The most recent instance had certainly been at least 7 keystrokes after the most recent layer change, if not considerably more. (I wasn't exactly counting at the time.) I had not recently used a modipress input, either.

Expected behavior

No response

Related issues

No response

Keyman apps

Keyman version

17.0.321-beta, other previous versions. Should also trigger on 17.0.326.

Operating system

iOS

Device

iPhone SE 2

Target application

any

Browser

No response

Keyboard name

No response

Keyboard version

No response

Language name

No response

Additional context

No response

jahorton commented 4 months ago

I thought I'd found a repro for this last night, but it turned out to be due to a different behavior - modipress isn't holding consistently, with the layer dropping back while I'm still holding a modifier key. Interestingly... I don't see the same behavior when using the Web testing pages on iOS Safari, outside of the app + system-keyboard. Everything feels butter-smooth for non-embedded gestures - it's when they're embedded that we're seeing the issue.

While noting this distinction, I remembered something that has a chance of relevance for this problem. Way back in the past, with #2959, we changed the iOS keyboard engine to use the Web-based subkey menu. At the time, we hoped that what we saw as a bug would be fixed and that we'd be able to reverse the decision. You see, that bug was blocking users' ability to select longpress subkeys... because gesture handlers on the Swift side for the WebView were actually being affected by the return value provided by JS-side event handlers - something we certainly didn't expect.

A lot has changed since then, and we're now fully committed to handling all keyboard gestures fully on the JS side, within the WebView. We also never saw a bug fix go through for the issue we reported. Remembering that we've seen interaction between gesture-handlers in Swift and those within the WebView before... is it possible that the gesture issues we're seeing now are due to the Swift-side handlers we left in place? Could those be conflicting?

jahorton commented 4 months ago

OK, I've got some data via Web-inspection now, thanks to #11672.

flick-start gesture components are failing to resolve, even when related conditions appear to be met. This does match with observations, now that I think about it - the flick animation never kicks in, after all. At least now, we can confirm that the flick-start matcher is, at least, spun up.

This is because we're getting invalid flick parameterization values at runtime, somehow: <, >, and other mathematical comparison operators all fail if one of the two values for comparison are NaN. It's not clear yet what the underlying source of things going NaN is, but NaN values have been observed in the fields being set in the codeblock below.

https://github.com/keymanapp/keyman/blob/c755353d7a3eafe9adc321f083b4b0be88049bdb/web/src/engine/osk/src/visualKeyboard.ts#L1302-L1306

If this is called while there is no currentLayer... yeah, that'd do it... though how it'd get in that position is a question unto itself:

https://github.com/keymanapp/keyman/blob/c755353d7a3eafe9adc321f083b4b0be88049bdb/web/src/engine/osk/src/visualKeyboard.ts#L272-L274

layerGroup gets assigned during initialization, and activeLayer is always a valid instance unless the hosting VisualKeyboard instance is somehow bypassed:

https://github.com/keymanapp/keyman/blob/c755353d7a3eafe9adc321f083b4b0be88049bdb/web/src/engine/osk/src/visualKeyboard.ts#L252-L257

And layerGroup shouldn't ever be null or undefined.

https://github.com/keymanapp/keyman/blob/c755353d7a3eafe9adc321f083b4b0be88049bdb/web/src/engine/osk/src/visualKeyboard.ts#L351-L356

Using developer mode with touch-layout emulation at line 356, .currentLayer is already truthy, providing a valid layer instance.

Alternatively, an invalid / NaN rowHeight in the first block would also be enough to trigger the issue.