[Bug] Auto Mouse Layer Sticking

aus10code commented 1 year ago

Describe the bug

This is an issue I have while using @drashna's Auto Mouse Layer that I use on my 4x6 Charybdis

My mouse layer gets stuck enabled when I follow the below steps

Activate mouse layer with trackball
Hold down accordion home row modifier (any one of them)
Press any of the following dedicated mouse keys (while mouse layer is still active) from below

bool is_mouse_record_user(uint16_t keycode, keyrecord_t* record) {
    switch(keycode) {
        case KC_WWW_BACK:
        case KC_WWW_FORWARD:
        case KC_BTN1: 
        case KC_BTN2:  
        case DRAGSCROLL_MODE_TOGGLE:
        case SNIPING_MODE_TOGGLE:
            return true;
        default:
            return false;
    }
    return  false;
}

This will then cause my mouse layer to get stuck

Information

My QMK: https://github.com/aus10code/QMK-Keymap

This issue was originally discussed on the QMK Discord server. https://discord.com/channels/440868230475677696/1082712746539352164/1082712746539352164

Do the keys involved use any of the following features?

[X] Achordion (from this repo)
[ ] Auto Shift
[ ] Combos
[ ] Key Overrides
[ ] Mod-tap or Layer-tap keys
[ ] Other custom userspace code:

drashna commented 1 year ago

Specifically, the issue is that accordion.c is (recursively) calling process_record after changing the layers, but hasn't updated the layer cache.

getreuer commented 1 year ago

Thanks @aus10code for reporting this and @drashna for those details.

As a quick possible workaround, try bypassing Achordion while the mouse layer is on:

uint16_t achordion_timeout(uint16_t tap_hold_keycode) {
  if (IS_LAYER_ON(MOUSE)) { 
    return 0;  // Bypass Achordion while mouse layer is on.
  }

  return 800;  // Otherwise use a timeout of 800 ms.
}

For a deeper fix, I don't know how Auto Mouse Layer or layer caching work and need help. Achordion itself does not use or manipulate the layer state. Granted, it does recursively call process_record(), which complicates things.

Achordion appears to work as intended with other kinds of layer keys. Could you help me understand what is happening differently with Auto Mouse Layer? I see from users/drashna/pointing /readme.md that "the layer will automatically turn itself off after 650ms". I haven't found the code though that corresponds to that timer. Am I in the right place?

As a first point of troubleshooting, I suggest to confirm the Auto Mouse Layer timer is firing. I can imagine (from experience of debugging my own timers) that timer logic might get thrown off by the recursive process_record() calls or non-monotonic timestamps.

When the timer fires, I guess it calls layer_off(MOUSE). Should Achordion be doing something around its process_record() calls to ensure the layer cache is up to date? If it would help, I'm open to adding a callback to Achordion to customize what happens around these calls.

aus10code commented 1 year ago

Thanks for the bypass solution, however I don't want to bypass Achordion!

After some digging I think I found where the 650ms is being set

pointing_device_auto_mouse.h - line 34

also this may help: Auto Mouse Layer ReadMe

As a side note, I'm available pretty much anytime to help troubleshoot. I just may need some hand holding as I'm overall pretty new to the QMK environment and C as a language itself. I'm willing to screen share and test my issues out over discord (or any other software) with you if that would be helpful.

sigprof commented 1 year ago

The problem is actually not with the layer cache, but with the recursive call to process_record() sending duplicate events to any functions that are called before process_achordion() (which gets called from process_record_user()/process_record_kb().

https://github.com/qmk/qmk_firmware/blob/54634e92634f73a6d9111833adf58214cb4278c3/quantum/quantum.c#L261-L282

Notice that process_record_kb() gets called after process_auto_mouse(), therefore at the time process_achordion() gets called, process_auto_mouse() had already seen the event, and calling process_record() with the same event will cause a duplicate call to process_auto_mouse(). However, the auto mouse code uses a counter (auto_mouse_context.status.mouse_key_tracker) to track the state of various keys that should enable the auto mouse layer, therefore the extra call to process_auto_mouse() (which is done only on press, but not on release) corrupts that counter, so that it does not reach 0 when all auto mouse keys are released, and the auto mouse layer gets stuck.

getreuer commented 1 year ago

@sigprof thanks a bunch for this excellent analysis! That dissects what is going on and makes the root cause clear.

Achordion recursively calls process_record() in a few ways. I believe all of them are symmetric in sending both duplicate press and release events, except this one is only on press. It's a bit long winded to explain why this call is made, but FTR here are the details:

Suppose a tap-hold key is down but has not yet been settled (from Achordion's point of view), and the current event is a press on another key "B". The tap-hold key then becomes settled as a tap or a hold according to achordion_chord(). To invoke the effect of this tap or hold, process_record() is recursively called with a saved keyrecord_t (and symmetric press and release events for this are sent).
It is possible the tap or hold action cause a layer change. The typical reason for this is an LT key is being held. I intend Achordion to interoperate with intercepting tap hold events in process_record_user(), so maybe the user has a custom handler that changes layers.
However, the current event for the press on key "B" is still associated with the keycode from before the layer change, and that may now be stale. To fix that, Achordion calls process_record() to reprocess the event with the new layer state, then returns false to block the original event with the stale keycode from further processing.
Concerning Auto Mouse Layer, exactly like @sigprof said, the problem is asymmetrically an additional press event is sent without an additional release event.

AFAICT recursively calling process_record() is the only way to change the keycode on an event from userspace code, but I'd love a different solution to doing that.

Possible solution: Achordion could handle the press and release of "B" symmetrically, in both cases calling process_record() on it and blocking the original events. This seems good in principle. However, Achordion's state logic is nontrivial as it is, and additionally tracking "B" to its release increases the number of edge case situations that may introduce their own problems. I'll have to think through the details.

Possible mitigation: The keycode for "B" is likely still up to date most of the time. Achordion could compare the layer state and resort to recursively calling process_record() only when it has changed. This is easy to do, I'll start with that.

sigprof commented 1 year ago

“Late” layer switches are a real problem; the tap dance code does them too, and got a special piece of code inside process_record_quantum():

https://github.com/qmk/qmk_firmware/blob/54634e92634f73a6d9111833adf58214cb4278c3/quantum/quantum.c#L241-L247

But this solution is available only to the core code, and can't be used by something externally plugged into process_record_user().

Possible solution: Achordion could handle the press and release of "B" symmetrically, in both cases calling process_record() on it and blocking the original events.

Even this might not help if the layer state had actually changed, and the original invocation of process_auto_mouse() processed a different keycode. Instead of the current “B1-press, B2-press, B2-release”, you would get “B1-press, B2-press, B2-release, B2-release” in process_auto_mouse(), so you would still have unbalanced B1-press and B2-release.

Properly fixing this may require actually plugging the feature into the core and making it run earlier, so that the issue with some other functions using wrong keycodes does not happen in the first place.

drashna commented 1 year ago

Mostly just thinking, but it might be useful to use the pre_process_record stuff as returning false there would prevent process_record from being called at all.

candrews commented 5 months ago

I'm currently experiencing this issue with the sticking auto mouse layer... is there a workaround available or any other progress?

Thank you!

drashna commented 5 months ago

For reference, I'm using this ... rather invasive hack: https://github.com/drashna/qmk_firmware/commit/fd3337bb65badf009d5b5d871f2efb9f4146cdc8 Along with this: https://github.com/drashna/qmk_userspace/commit/d98481ff2d03472efde9038b694966bf4740e2f2

candrews commented 5 months ago

drashna/qmk_firmware@fd3337b

Have you submitted a PR for that upstream? I'd love to not have to carry any such patches :-)

drashna commented 4 months ago

Develop has the changes merged in, now. https://github.com/qmk/qmk_firmware/pull/23144

getreuer / qmk-keymap

[Bug] Auto Mouse Layer Sticking #32

Describe the bug

Information