Closed jahorton closed 1 month ago
Test specification and instructions
17.0.296 (0.10960.10629)
TEST_RAPID_TYPING_STABILITY (FAILED): Tested with the attached PR build (Keyman 17.0.271-beta-test-10960) on an Android 13 Mobile device and here is my observation: 1. Opened the Keyman In-app. 2. Rapidly typing on OSK leads to stuck-key-highlighting issues. 3. Tested the same on an Android 14 / API 34 emulator and I was able to reproduce it.
I didn't see anything about an Android error this time, which is a plus. Though... that might be due to the more common error that surfaced. It'd surface quickly, so if the Android error was never super-consistent to begin with, it may have just "not happened (yet)".
Turns out that I made a fairly simple mistake in one of the event handlers - I forgot that ending touches don't show up in their event's .touches property - only within the .changedTouches property. Our unit-test event simulation didn't meet that aspect of the spec and so failed to catch this detail.
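To make the .touches / .changedTouches distinction concrete, here is a minimal sketch. The `TouchLike` and `TouchEventLike` types and both function names are hypothetical stand-ins so the example runs outside a browser; they are not taken from the Keyman codebase.

```typescript
// Hypothetical structural stand-ins for the DOM Touch / TouchEvent shapes.
interface TouchLike {
  identifier: number;
}

interface TouchEventLike {
  touches: TouchLike[];        // touches still in contact with the surface
  changedTouches: TouchLike[]; // touches that changed for this specific event
}

// For a touchend event, the ended touches must be read from changedTouches;
// they have already been removed from touches.
function endedTouchIds(event: TouchEventLike): number[] {
  return event.changedTouches.map((touch) => touch.identifier);
}

// A spec-conforming touchend simulation for unit tests: the ended touches
// appear only in changedTouches, never in touches.
function simulateTouchEnd(active: TouchLike[], ended: TouchLike[]): TouchEventLike {
  const endedIds = new Set(ended.map((touch) => touch.identifier));
  return {
    touches: active.filter((touch) => !endedIds.has(touch.identifier)),
    changedTouches: ended,
  };
}
```

A handler that looks for the ending touch in `event.touches` finds nothing when given an event built this way, which is the kind of mistake a simulation that leaves ended touches in `touches` would hide.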
tl;dr: The error I found when I went looking quite naturally made the last key touched act "sticky", among other things. I've fixed it and made it easier for unit tests to catch the same mistake in the future.
@keymanapp-test-bot retest all
..Android 5.0 / API 21 emulator
..Android 7.1.1 / API 25 emulator
..Android 9.0 / API 28 emulator
..Android 14.0 / API 34 emulator
..Android 13.0 / Samsung Galaxy A23 Mobile device
Sentry Issue: KEYMAN-WEB-HY
Sentry Issue: KEYMAN-WEB-HX
Same behavior, older build. Not sure why I didn't see this in Sentry earlier today, but at least it's here now...
About those two Sentry reports: yay, the old error types are dead, and it's something new now - and it's much simpler to resolve and to reason about than the old patterns.
I can also see why the user experience wasn't affected by the errors that arose; the new pattern established in this PR makes it clear (to me) that those errors are triggered from already-completed gestures. Still, it's best if we don't even have errors to report, yeah?
@keymanapp-test-bot retest TEST_RAPID_TYPING_STABILITY
One more time - let's aim for zero Sentry error reports this round.
TEST_RAPID_TYPING_STABILITY (PASSED): Retested this issue with the attached PR build (Keyman 17.0.285-beta-test-10960) on an Android 13 Mobile device and here is my observation: 1. I was able to reproduce the keyboard error message after typing rapidly using the OSK for a long time (approx. 4 minutes). 2. However, I was not able to reproduce it if I rapidly typed 10 random keys (as mentioned in the test steps). Seems to be working fine.
[...] after typing rapidly using the OSK for a long time. (approx. 4 minutes) [...]. Seems to be working fine.
Well... if nothing else, that's certainly quite rare, and is a massive improvement on what we previously had.
Looking at Sentry, I'm not seeing any new error reports - the only stuff I see is from the previous user-test run. Nothing new in Web or under Android within the test environment sets.
I do feel like sometimes stuff didn't show up for a little while, so I'll try to check back later in case there's some sort of small delay.
Yes it's an improvement, but a keyboard that crashes every 4 minutes is not exactly stable.
Yes it's an improvement, but a keyboard that crashes every 4 minutes is not exactly stable.
I agree... though it's hard to know what to address at the moment. Unlike prior user test runs, I'm not seeing any Sentry events to follow up on. Kinda need those to better address things and find the root sources.
I think sometimes error reports might have been withheld after one user test run and "released" somehow on the next? There have been times where I went looking for Sentry events and didn't see them... but after a user-retest, errors that should've been there to begin with were suddenly visible. I'm still not 100% sure on how or why that was happening.
Maybe, maybe rerunning the test for a bit might allow error reports through. This is just a feeling / suspicion though, and isn't a guarantee. It's certainly bugging me that we're getting error toasts that don't send Sentry error reports to match.
..Test_Alternating_Shift_and_Key
https://github.com/keymanapp/keyman/assets/19683143/fa884651-dac8-4a29-8132-a87bb115726a
..Flick_Locking
The following tests failed due to the previously mentioned issue: Test_Basic_Modpress, Test_Basic_Modepress_Hold, Test_Numeric_From_Shift, Test_Delayed_subkey, Test_Custom_Multitap_Modifier, Test_Alternating_Shift_And_Key, Test_Modpress_Multitap_Flick_Preview, Test_Flick_During_Modpress, Test_Flick_Basics, Test_Flick_Correction, Test_Flick_Locking.
I've been noticing a few gesture things along the lines of the failed test recently too on my personal iPhone... but hadn't realized it was happening that extensively. Interestingly, I'm not seeing some of the issues in iOS Simulator.
Did some digging via TestFlight and found that 17.0.261-beta seems to be comparatively fine, with a number of breaks happening in 17.0.262-beta.
... and after a bit of investigation, for iOS, the stuff I noticed is due to context issues that resolve with #10956. That doesn't exactly address the Android bits noted here, though. (17.0.262 did involve some change to iOS's context handling, so it does track somewhat.) Context resets generally involve layer resets, after all, so gestures involving layer manipulation could definitely be affected by context issues.
@keymanapp-test-bot retest TEST_GESTURE_REGRESSIONS
I've gone ahead and merged in recent beta changes to this PR. I've also tested a local build of the resulting artifact on a (slow) Android test device here - so far, the only issues I have seen appear to be due to lagginess. It appears that aspect is due to certain calculations needed during a modipress.
For any tests that fail, please describe the nature of the test failures for each. Please make it explicitly clear if the reason for failure is that things are happening correctly, but too slowly.
"Certain calculations" - we rely on key layout information from the DOM when noting the 'item' for each ongoing touchpath. To get this for layers currently not visible, we make them visible temporarily, which triggers layout reflows - and that appears to be a major contributing factor to lagginess on the local test device.
..Test_Alternating_Shift_and_Key
https://github.com/keymanapp/keyman/assets/19683143/fa884651-dac8-4a29-8132-a87bb115726a
I see the video, but I'm not clear on the reason this was labeled as a failure. Is it due to each tap taking a while to process, rather than executing quickly? The individual H outputs do have a notable delay between each, and that delay is longer than I'd expect you to be intentionally doing when following the test instructions. It'd help to have that "spelled out" explicitly, as right now, all I can do is make my best guess as to your reasoning.
"Certain calculations" - we rely on key layout information from the DOM when noting the 'item' for each ongoing touchpath. To get this for layers currently not visible, we make them visible temporarily, which triggers layout reflows - and that appears to be a major contributing factor to lagginess on the local test device.
Hmm, could we cache this information when the keyboard is first loaded? (Perhaps being smart enough to think about rotation as well?) This may not be a 17.0-beta change but I'm all for avoiding layout reflows during processing!
In regard to that last point...
Drilling down the other high-percentage contributors reveals similar figures - a lot of time was spent on layout-reflow. I may want to spend some time seeing if that can be optimized.
Interestingly, I don't see anything even close to this level of lagginess within the iOS app/keyboard; maybe Safari webviews are caching layout calculations in a manner that Chrome webviews aren't?
Drilling down the other high-percentage contributors reveals similar figures - a lot of time was spent on layout-reflow. I may want to spend some time seeing if that can be optimized.
Sounds good to me! Let's make sure we get the other beta PRs merged before you get too deep into this one though.
Interestingly, I don't see anything even close to this level of lagginess within the iOS app/keyboard; maybe Safari webviews are caching layout calculations in a manner that Chrome webviews aren't?
Potentially -- but also iPhones are much faster hardware so it may just be much less visible.
"Certain calculations" - we rely on key layout information from the DOM when noting the 'item' for each ongoing touchpath. To get this for layers currently not visible, we make them visible temporarily, which triggers layout reflows - and that appears to be a major contributing factor to lagginess on the local test device.
Hmm, could we cache this information when the keyboard is first loaded? (Perhaps being smart enough to think about rotation as well?) This may not be a 17.0-beta change but I'm all for avoiding layout reflows during processing!
I could see caching each layer's layout info as it's loaded. That would be well within reason. (Then dumping the cache on a rotation or keyboard size-shift.)
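As a rough illustration of that caching idea, the sketch below memoizes per-layer layout info and drops everything when the keyboard's geometry changes. `KeyRect`, `LayerLayoutCache`, and the `measure` callback are hypothetical names for this example only; the callback stands in for whatever DOM-based measurement currently triggers the reflow.

```typescript
// Hypothetical shape for one key's measured position within a layer.
interface KeyRect {
  keyId: string;
  x: number;
  y: number;
  width: number;
  height: number;
}

class LayerLayoutCache {
  private readonly cache = new Map<string, KeyRect[]>();

  // `measure` is assumed to be the expensive, reflow-triggering lookup.
  constructor(private readonly measure: (layerId: string) => KeyRect[]) {}

  // Measure a layer at most once until the cache is invalidated.
  get(layerId: string): KeyRect[] {
    let rects = this.cache.get(layerId);
    if (rects === undefined) {
      rects = this.measure(layerId);
      this.cache.set(layerId, rects);
    }
    return rects;
  }

  // To be called on rotation or any keyboard size shift.
  invalidate(): void {
    this.cache.clear();
  }
}
```

With this shape, a repeated modipress onto the same layer would pay the measurement cost only once per orientation or size, assuming key positions within a layer do not change for any other reason.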
Drilling down the other high-percentage contributors reveals similar figures - a lot of time was spent on layout-reflow. I may want to spend some time seeing if that can be optimized.
Sounds good to me! Let's make sure we get the other beta PRs merged before you get too deep into this one though.
Agreed. If this is the only real source of problems at present, that can certainly be done separately, allowing this chain to merge.
Just wondering: can we not calculate the regions without doing layout at all? We have all the numbers in the touch layout data and we have the bounding box size?
Just wondering: can we not calculate the regions without doing layout at all? We have all the numbers in the touch layout data and we have the bounding box size?
At the very least, we can approximate it with that data - and we might even be able to perfectly match it. Sounds like a wise optimization path to investigate.
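A minimal sketch of what computing key regions straight from layout data might look like, with no DOM involved. The `KeySpec` / `RowSpec` schema is a simplified, hypothetical stand-in rather than the actual touch layout format, and the sketch assumes equal-height rows and that each row is scaled independently to fill the bounding box width; the real rendering rules may differ.

```typescript
// Hypothetical, simplified layout schema: relative widths and left padding.
interface KeySpec {
  id: string;
  width: number;
  pad?: number;
}

interface RowSpec {
  keys: KeySpec[];
}

interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Assumes equal-height rows, each row scaled to span the full box width.
function computeKeyRects(rows: RowSpec[], boxWidth: number, boxHeight: number): Map<string, Rect> {
  const rects = new Map<string, Rect>();
  const rowHeight = rows.length > 0 ? boxHeight / rows.length : 0;

  rows.forEach((row, rowIndex) => {
    // Total relative units in this row, padding included.
    const totalUnits = row.keys.reduce((sum, key) => sum + (key.pad ?? 0) + key.width, 0);
    const scale = totalUnits > 0 ? boxWidth / totalUnits : 0;

    let cursor = 0;
    for (const key of row.keys) {
      cursor += (key.pad ?? 0) * scale; // skip the key's left padding
      rects.set(key.id, {
        x: cursor,
        y: rowIndex * rowHeight,
        width: key.width * scale,
        height: rowHeight,
      });
      cursor += key.width * scale;
    }
  });

  return rects;
}
```

Whether this can match the rendered layout exactly or only approximate it depends on how closely these assumptions mirror the actual styling.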
I'm not sure why KMW is 3KB larger -- there's not that much new code?
This is 6 PRs in on the 🪠chain. This is the one that pushes things over on the filesize warning, I guess.
https://github.com/keymanapp/keyman/assets/19683143/8b03ab00-1618-4ca8-bdf2-d00d90578dbb
The remaining issues seem to mostly be performance-related. There is one notable, though ambiguous, functionality point of failure... but this does resolve a lot of far, far worse behaviors and will let us get beta to a better state. As such, @mcdurdin and I have made the decision to go ahead and merge this PR and its predecessors. (The next one in line is still pending review.)
Changes in this pull request will be available for download in Keyman version 17.0.301-beta
Continues from #10843.
Despite my efforts to avoid it, it appears necessary to retool the event engines to ensure a proper linkage between original event identifier and the corresponding GestureSource that will be generated and updated. The previous "identifier" to "source" mapping style appears to be too "loose" due to asynchronicity - it's time to update to a strategy that can directly use closure-capture mechanics to ensure proper binding.
Extra details: By default, browsers will reuse touch identifiers after they are freed - identifiers aren't unique. If this replacement happens while related events are deferred, we get problems like what can be seen in #10843's user test reports. Thus, it's necessary to establish the link between identifier and source while the identifier is still current for the touchpath represented by the source, then "lock in" that link - something that closure-capturing is well-suited for.
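The identifier-reuse hazard and the closure-capture remedy can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `SourceSketch` is a hypothetical stand-in for GestureSource, `TouchTrackerSketch` and its `deferred` queue stand in for the event engine, and a Promise created synchronously in the touch-start handler serves as the handle that later handlers capture.

```typescript
// Hypothetical stand-in for GestureSource: just enough state to show the binding.
interface SourceSketch {
  rawId: number;
  events: string[];
}

class TouchTrackerSketch {
  // Identifier -> Promise for the source that currently owns that identifier.
  private readonly active = new Map<number, Promise<SourceSketch>>();
  // Stand-in for the engine's deferred-processing queue.
  private deferred: Array<() => void> = [];

  touchStart(id: number): Promise<SourceSketch> {
    // Create and register the Promise synchronously, while `id` is still
    // current for this touchpath. Only the construction itself is deferred.
    let resolve!: (source: SourceSketch) => void;
    const promise = new Promise<SourceSketch>((r) => { resolve = r; });
    this.active.set(id, promise);
    this.deferred.push(() => resolve({ rawId: id, events: ['start'] }));
    return promise;
  }

  touchEnd(id: number): void {
    // Look up synchronously, then let the closure capture the Promise.
    // The deferred work never consults the identifier map again.
    const promise = this.active.get(id);
    if (promise === undefined) {
      return;
    }
    this.active.delete(id); // the browser may now reuse this identifier
    this.deferred.push(() => { promise.then((source) => source.events.push('end')); });
  }

  // Runs all queued work; in a real engine this would happen asynchronously.
  flush(): void {
    const work = this.deferred;
    this.deferred = [];
    work.forEach((task) => task());
  }
}
```

If identifier 0 starts, ends, and starts again before `flush()` runs, the deferred 'end' still lands on the first source, because its closure holds the first Promise. Had the deferred work looked up identifier 0 in the map at processing time instead, it would have found the second touch's entry.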
As the actual construction is queued for later, the best way I can see to leverage closure-capture mechanisms to resolve the issue is to create Promises for the GestureSource synchronously during the actual touchStart handler - not within its closures. Resolution of the Promises can occur within closures, but assignments leveraging the Promise need to be maintained synchronously, as it can then be accessed synchronously within the other events for closure-capture in their handlers. (After all, under extreme rapid-typing situations, it's quite possible the Promise will only be fulfilled long after the fact, but this Promise will always be available, unambiguously, in time with the original events.)
User Testing
TEST_RAPID_TYPING_STABILITY: Using the Keyman for Android app, attempt to reproduce #10592 and #10646.
typing rapidly should not skip keys; just make sure it's something you feel comfortable typing rapidly.
TEST_GESTURE_REGRESSIONS: Run the full set of Web's gesture-related regression tests and report back any errors.
TEST_RAPID_TYPING_STABILITY
to that set of tests.