rvaiya / keyd

A key remapping daemon for linux.
MIT License

Add a timeout that automatically resolves a tap-hold as a tap when typing #608

Open amarakon opened 1 year ago

amarakon commented 1 year ago

Using overloadt or overloadt2 causes a noticeable delay when typing. I have an idea for being able to use them without having any delay when typing. Create a global timeout (default 0) where keyd immediately registers it as a tap if a character has been tapped for less than that amount of milliseconds. For example, if the value is set to 50, and you push an overloadt or overloadt2 key, and less than 50 milliseconds has passed since the last character was tapped, immediately resolve it as a tap. This would practically eliminate the typing delay with the downside of having to wait that amount of time if you want to use the hold behaviour after typing. It also has the added benefit of significantly decreasing false positives.
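
To make the proposal concrete, here is a minimal Python sketch of the rule. It is an illustration only, not keyd code: the function name resolve_on_press, its parameters, and the "tap"/"undecided" return values are all made up for this example.

```python
from typing import Optional

def resolve_on_press(now_ms: int, last_char_ms: Optional[int], idle_timeout_ms: int) -> str:
    """Decide what to do when a tap-hold key goes down (hypothetical helper).

    Returns "tap" if a character was tapped less than idle_timeout_ms ago,
    otherwise "undecided", meaning the usual overloadt/overloadt2 logic applies.
    """
    if idle_timeout_ms <= 0 or last_char_ms is None:
        return "undecided"  # feature off (default 0) or nothing typed yet
    if now_ms - last_char_ms < idle_timeout_ms:
        return "tap"  # the user is probably still typing
    return "undecided"

print(resolve_on_press(1030, 1000, 50))  # 30 ms since the last character
print(resolve_on_press(1200, 1000, 50))  # 200 ms since the last character
```

With the default of 0 the check never fires, so nothing changes unless the user opts in.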

nsbgn commented 1 year ago

If I understand correctly, this is possible with timeout(). For example:

f = timeout(f, 50, overloadt2(layer, f, 2000))

However, completely eliminating the visual delay without losing your balance on false positives remains difficult. (If you have the stomach for it, see also the previous discussions in #34 #81 #125 #138 #278 #309 #310 #320.)


EDIT: Wait, you started with

… registers it as a tap if a character has been tapped for less than that amount of milliseconds …

This is already possible, but you then went on with:

… milliseconds has passed since the last character was tapped …

Is what you meant here the time since keydown of the previous key, rather than the current one? If so, that may be interesting. Does the time since the keydown of an earlier key correlate with intent? It probably does somewhat, but does that mean it is a solid basis for disambiguation? I'm skeptical --- think we need some data :P

amarakon commented 1 year ago

EDIT: Wait, you started with

… registers it as a tap if a character has been tapped for less than that amount of milliseconds …

This is already possible, but you then went on with:

… milliseconds has passed since the last character was tapped …

Sorry for the contradiction. What I meant is the second one:

… milliseconds has passed since the last character was tapped …


If so, that may be interesting. Does the time since the keydown of an earlier key correlate with intent? It probably does somewhat, but does that mean it is a solid basis for disambiguation? I'm skeptical --- think we need some data :P

I think it should be the time since the keyup of an earlier key that was not used with a tap-hold modifier. That way, you can still string together keyboard shortcuts without having to wait. If a character has been pressed and released some milliseconds ago, that indicates that the user is typing. I rarely use modifier keys shortly after typing characters, there is usually a pause before I use keybindings. The only exception is shift, which is used for typing. That's the reason I don't use shift as a tap-hold modifier and instead use a dedicated key for it.

urob's ZMK config has an option called require-prior-idle-ms that does this. Additionally, you can make the feature off by default so that only people who want to use it will use it.

nsbgn commented 1 year ago

the time since the keyup

... or rather, the time since another key has been in a 'pressed' state, right? Which is 0 if it's currently still being pressed.

The more I think about it, the more it makes sense. For now I'll avoid polluting the thread with more thoughts until @rvaiya gets a chance to drop his :P

rvaiya commented 1 year ago

It has been some time since I have seriously thought about this problem, so forgive me if I ignore some aspect of it or contradict something I have previously said :P. If I recall correctly, I believe I experimented with the idea of pre-key timeouts and ultimately ended up rejecting them on the grounds that they produce too many false negatives.

Consider the following sequence which one might plausibly type in the context of a vim session:

Hello World <C-[>

Given:

a = overloadt3(control, a, 200)

where overloadt3 implements a pre-key timeout of the sort you describe

and the sequence:

<H> <e> <l> <l> <o> <Space> <W> <o> <r> <l> <d> <a> <[>

It is not possible to distinguish between:

Hello World a[

and

Hello World <C-[>

without an additional pause between d and a. This forces the user to pause in order to disambiguate the cases, which is, in my view, quite unnatural.

urob's ZMK config has an option called require-prior-idle-ms that does this.

The fact that this is implemented in ZMK lends it some additional credibility, but I am curious to know how well it fares in real-world, long-term usage.

nsbgn commented 1 year ago

Bear in mind that the purpose of the pre-timeout would only be to eliminate visual latency for overloadt and overloadt2; it would be in addition to these other mechanisms that are used to make homerow mods work. As such, the timeout can be very very short (on the order of 50ms).

Because: is the user typing normally? Then they're probably rolling: they are still, or were until very recently, holding a previous key (d in your example), and so we can immediately resolve to a tap (a). If they intended to type normally but aren't rolling (yet), all that happens is that we get some visual latency.

Has the user switched context and now wants to type a control sequence? Then I don't find it unreasonable to expect that some typists (and in particular those eccentrics who aren't bothered by other homerow mod limitations) won't need to consciously pause to avoid triggering the immediate tap --- simply 'not rolling' is enough. (As @amarakon mentioned, this expectation probably doesn't hold for shift, since that doesn't represent a context switch.)

I still don't think that you or I will be using homerow mods (so I can't produce data), but it does strike me as a strict improvement for those who do use overloadt/overloadt2 (and perhaps even overload in some cases).

This aligns with the experience in the config that @amarakon linked:

After months of tweaking, I eventually ended up with a HRM setup that is essentially timer-less, resulting in virtually no misfires. Yet it provides a fluent typing experience with mostly no delays.

rvaiya commented 1 year ago

it would be in addition to these other mechanisms that are used to make homerow mods work.

Hmm, perhaps I am missing something. If the post key timeout is maintained, isn't the visual delay still present? My understanding is that this pre-key delay is intended to obviate the need for the post key delay.

As such, the timeout can be very very short (on the order of 50ms).

Is the goal then simply to minimize the delay rather than eliminate it entirely? What about the beginning of words which start with overloaded letters? E.g., how is the word 'at' distinguished from <C-t> if it is the first thing typed after a long pause? You would still need a reasonably long post key delay to differentiate the cases.

is the user typing normally? Then they're probably rolling: they are still, or were until very recently, holding a previous key (d in your example), and so we can immediately resolve to a tap (a).

I assume by 'rolling' you mean something like <a down> <b down> <a up> <b up>, as opposed to <a down> <a up> <b down> <b up>. The problem is that in practice you (or at least I) will use a mix of these styles, so it can't reliably serve as the basis for distinguishing between the cases (this caused a lot of accidental layer activations in my tests).

Has the user switched context and now wants to type a control sequence?

I suppose this is the crux of my argument. The internal context switch doesn't (in my experiments) necessarily translate into a consistent pause between the strokes. I have observed myself type 'C-[' in quick succession after typing a string of characters without a meaningful gap between the last letter and the control key. You can run some experiments yourself using the output of keyd monitor -t.

Then I don't find it unreasonable to expect that some typists (and in particular those eccentrics who aren't bothered by other homerow mod limitations) won't need to consciously pause to avoid triggering the immediate tap --- simply 'not rolling' is enough.

Perhaps this is true. It is certainly possible that a subset of the population naturally does this, though my suspicion is that people are just training themselves to add an additional pause to placate their trigger happy mods. In either event, I am not strictly opposed to adding the functionality if enough people find it useful.

I still don't think that you or I will be using homerow mods (so I can't produce data), but it does strike me as a strict improvement for those who do use overloadt/overloadt2 (and perhaps even overload in some cases).

Indeed :P

This aligns with the experience in the config that @amarakon linked:

I admittedly haven't read through the rationale. I will take a look.

nsbgn commented 1 year ago

If the post key timeout is maintained, isn't the visual delay still present? [...] Is the goal then simply to minimize the delay rather than eliminate it entirely?

Yes. My understanding is that a visual delay may still occur at the beginning of a typing sequence, but after that, everything would show up on keydown (immediately!), as long as you keep going. Is that right?

I assume by 'rolling' you mean something like <a down> <b down> <a up> <b up>, as opposed to <a down> <a up> <b down> <b up>.

Sorry, I used it informally (without explanation...). I meant anything that 'feels' like you're typing in a steady flow, which, by grace of the proposed timeout, includes both styles. So <a down> <b down> <a up> <b up> is unambiguously rolling, but <a down> <a up> <b down> <b up> is also accepted provided that the time between <a up> and <b down> is within our small margin of tolerance.

The internal context switch doesn't (in my experiments) necessarily translate into a consistent pause between the strokes.

I suspect this is the case for me as well. (I did get curious, though, so I will shut up in this thread until I can show some experimental results :P)

amarakon commented 1 year ago

I briefly did some testing with keyd monitor -t and here is what I found.

This probably means that a value of 100 milliseconds would eliminate most of the delay while having few false negatives. However, sometimes the time between two key presses will be more than 100 milliseconds so it will not completely get rid of the delay. In addition, I often have short pauses when typing. I have a tendency to type in bursts. This means that in the beginning after each of these pauses, there will always be a delay.

nsbgn commented 1 year ago

Is that the non-overlapping time between keystrokes? (I.e., <a down> <ctrl down> <a up> <ctrl up> should record an idle time before ctrl of 0, while <a down> (100ms) <a up> (50ms) <ctrl down> <ctrl up> should record 50.)
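
One way to pin down that definition is the Python sketch below. The event tuples and the name idle_before_presses are assumptions made for this example; they are not keyd's data structures.

```python
from typing import List, Optional, Tuple

Event = Tuple[int, str, str]  # (ms since previous event, key, "down" or "up")

def idle_before_presses(events: List[Event]) -> List[Tuple[str, int]]:
    """For each key press, return (key, idle time in ms before it).

    Idle time is 0 if another key is still held; otherwise it is the time
    since the most recent key release. A press with no earlier key to
    measure against (the very first one) is skipped.
    """
    held = set()
    since_release: Optional[int] = None  # None until a release has been seen
    out = []
    for delta, key, state in events:
        if since_release is not None:
            since_release += delta
        if state == "down":
            if held:
                out.append((key, 0))  # overlapping with a held key
            elif since_release is not None:
                out.append((key, since_release))
            held.add(key)
        else:
            held.discard(key)
            since_release = 0
    return out

# The two cases from the comment above (the overlap timings are arbitrary).
overlapping = [(0, "a", "down"), (30, "ctrl", "down"), (40, "a", "up"), (50, "ctrl", "up")]
separate = [(0, "a", "down"), (100, "a", "up"), (50, "ctrl", "down"), (20, "ctrl", "up")]
print(idle_before_presses(overlapping))
print(idle_before_presses(separate))
```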

amarakon commented 1 year ago

Here is an example output:

device added: 2333:6666 ydotoold virtual device (/dev/input/event18)
device added: 0002:000a TPPS/2 IBM TrackPoint (/dev/input/event17)
device added: 0002:0007 SynPS/2 Synaptics TouchPad (/dev/input/event16)
device added: 0fac:1ade keyd virtual pointer (/dev/input/event8)
device added: 0fac:0ade keyd virtual keyboard (/dev/input/event7)
device added: 0001:0001 AT Translated Set 2 keyboard (/dev/input/event3)
+620175364 ms   keyd virtual keyboard   0fac:0ade   m up
+36 ms  keyd virtual keyboard   0fac:0ade   leftcontrol up
+4069 ms    keyd virtual keyboard   0fac:0ade   m down
+102 ms keyd virtual keyboard   0fac:0ade   k down
+0 ms   keyd virtual keyboard   0fac:0ade   m up
+55 ms  keyd virtual keyboard   0fac:0ade   k up
+0 ms   keyd virtual keyboard   0fac:0ade   w down
+94 ms  keyd virtual keyboard   0fac:0ade   h down
+0 ms   keyd virtual keyboard   0fac:0ade   w up
+71 ms  keyd virtual keyboard   0fac:0ade   h up
+113 ms keyd virtual keyboard   0fac:0ade   k down
+36 ms  keyd virtual keyboard   0fac:0ade   j down
+0 ms   keyd virtual keyboard   0fac:0ade   k up
+77 ms  keyd virtual keyboard   0fac:0ade   j up
+51 ms  keyd virtual keyboard   0fac:0ade   f down
+60 ms  keyd virtual keyboard   0fac:0ade   h down
+0 ms   keyd virtual keyboard   0fac:0ade   f up
+70 ms  keyd virtual keyboard   0fac:0ade   k down
+0 ms   keyd virtual keyboard   0fac:0ade   h up
+64 ms  keyd virtual keyboard   0fac:0ade   k up
+25259 ms   keyd virtual keyboard   0fac:0ade   leftcontrol down
+72 ms  keyd virtual keyboard   0fac:0ade   c down
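
Output like this can be summarised with a few lines of Python. The sketch below assumes each event line looks exactly like the ones above (delta, device name, device id, key, state) and ignores everything else; press_intervals is a hypothetical helper, not part of keyd.

```python
import re
from typing import List, Optional

# "+<delta> ms  <device name>  <vendor:product>  <key> <up|down>"
LINE = re.compile(r"^\+(\d+) ms\s+(.+?)\s+([0-9a-f]{4}:[0-9a-f]{4})\s+(\S+) (up|down)$")

def press_intervals(lines: List[str]) -> List[int]:
    """Return the time in ms between consecutive key presses."""
    intervals = []
    since_press: Optional[int] = None  # None until the first press is seen
    for line in lines:
        m = LINE.match(line.strip())
        if m is None:
            continue  # e.g. "device added: ..." lines
        delta, state = int(m.group(1)), m.group(5)
        if since_press is not None:
            since_press += delta
        if state == "down":
            if since_press is not None:
                intervals.append(since_press)
            since_press = 0
    return intervals

sample = [
    "device added: 0001:0001 AT Translated Set 2 keyboard (/dev/input/event3)",
    "+4069 ms    keyd virtual keyboard   0fac:0ade   m down",
    "+102 ms keyd virtual keyboard   0fac:0ade   k down",
    "+0 ms   keyd virtual keyboard   0fac:0ade   m up",
    "+55 ms  keyd virtual keyboard   0fac:0ade   k up",
    "+0 ms   keyd virtual keyboard   0fac:0ade   w down",
    "+94 ms  keyd virtual keyboard   0fac:0ade   h down",
]
gaps = press_intervals(sample)
under = sum(1 for g in gaps if g < 100)
print(gaps, f"{under}/{len(gaps)} under 100 ms")
```

This measures press-to-press time only; it says nothing about whether the previous key was still held.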

rvaiya commented 1 year ago

However, sometimes the time between two key presses will be more than 100 milliseconds so it will not completely get rid of the delay. In addition, I often have short pauses when typing. I have a tendency to type in bursts. This means that in the beginning after each of these pauses, there will always be a delay.

I think these are the two crucial points:

  1. Typing is naturally bursty and context dependent so actual speed and interkey intervals vary (I can type up to 120 wpm, but rarely do in practice when writing an email). This means your timeout is bound by your slowest real world typing speed (and will likely be quite high).

  2. Words beginning with overloaded keys typed after a long pause will necessitate the longest possible post-key timeout in order to disambiguate intent, in which case the pre-key timeout serves little purpose.

A hybrid approach involving an inactivity timeout in conjunction with a post key timeout is of course possible (perhaps ZMK does this?), but I'm not convinced it is worth all the tradeoffs.

Edit:

I skimmed through urob's ZMK notes and the ZMK documentation, and it appears that a hybrid approach is indeed taken.

My proposed implementation looks something like this:

overloadt3(<action 1>, <action 2>, <timeout>)

Where <action 1> is executed if the last key was struck more than <timeout> ms ago.

This feels a bit leaky, but would be congruent with the other overload* actions and potentially allows for other novel use cases.

A hybrid disambiguation could then be achieved thusly:

overloadt3(overloadt2(control, a, 200), a, 100)

which is admittedly a bit ugly.
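
As a rough illustration of what that nested expression would do, here is a simplified Python model. It assumes overloadt3 behaves as proposed above and reduces overloadt2 to a plain hold timeout (its handling of intervening key taps is left out); the function name and return strings are invented for the example.

```python
def overloadt3_hybrid(idle_ms: int, held_ms: int,
                      idle_timeout_ms: int = 100,
                      hold_timeout_ms: int = 200) -> str:
    """Simplified model of overloadt3(overloadt2(control, a, 200), a, 100).

    idle_ms: time since the last key was struck when 'a' goes down.
    held_ms: how long 'a' is then held before it is released.
    """
    # Pre-key check: the last key was struck recently, so assume typing.
    if idle_ms <= idle_timeout_ms:
        return "a (immediate)"
    # Post-key check, standing in for overloadt2's hold timeout.
    if held_ms >= hold_timeout_ms:
        return "control"
    return "a (after release)"

print(overloadt3_hybrid(idle_ms=50, held_ms=300))   # mid-word: no delay
print(overloadt3_hybrid(idle_ms=500, held_ms=300))  # after a pause, held
print(overloadt3_hybrid(idle_ms=500, held_ms=80))   # after a pause, tapped
```

The third case is the one that still shows a visual delay: after a pause, the letter cannot be emitted until the key is released or the hold timeout expires.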

Having said that, I remain sceptical about usability, and would like to hear more opinions from people who have successfully used similar approaches in practice.

amarakon commented 1 year ago

A hybrid disambiguation could then be achieved thusly:

overloadt3(overloadt2(control, a, 200), a, 100)

What would the syntax be if I wanted to do this with a combo? Let's say I wanted a combo of two letters immediately resolve as a tap if a character has been typed 100 milliseconds prior.

amarakon commented 1 year ago

The fact that this is implemented in ZMK lends it some additional credibility, but I am curious to know how well it fares in real-world, long-term usage.

It seems like this feature is also implemented in QMK. See this article.

rvaiya commented 11 months ago

What would the syntax be if I wanted to do this with a combo? Let's say I wanted a combo of two letters immediately resolve as a tap if a character has been typed 100 milliseconds prior.

Can you clarify what you mean by this? If by 'combo' you mean a chord, then this should be possible to achieve by just mapping the chord to the overload action, though I'm not sure how useful it would be.

rvaiya commented 11 months ago

I've tentatively added overloadi which is overloadt3 described above but with the first two arguments transposed. I've also added an alias called lettermod which allows the user to specify an idle and hold timeout more easily in one place (see the man page for details). Feedback is welcome.

nsbgn commented 11 months ago

So far I'm actually considering keeping it in my config, so that's a win. Homerow is still a bit much for me, but I seem to be able to stomach binding to x and ., which aren't used often and almost never at the beginning of a sequence of keys --- perfect for this use case.

Interestingly, I have no problems running the following config manually:

[ids]
045e:07a5

[global]

[main]
x = lettermod(control, x, 100, 200)
. = lettermod(control, ., 100, 200)

However, when I run it with systemctl start keyd, or systemctl restart keyd, it crashes and journalctl -u keyd reports the following:

Dec 21 18:20:09 nuc keyd[64264]: keyd: src/config.c:450: parse_fn: Assertion `*nargs < MAX_DESCRIPTOR_ARGS' failed.

This doesn't happen when commenting out the lettermod lines.

Otherwise, there are some usability improvements I can think of, but I'll discuss them in other issues (or on IRC) when I'm more confident.

Thanks again! I hope it also works for @amarakon :)

rvaiya commented 11 months ago

Interestingly, I have no problems running the following config manually: [...]

Did you install the latest version? The old version will produce that error if you try and use lettermod. It is worth noting that the PREFIX has recently changed and keyd is now installed in /usr/local/ by default. You might need to do something like PREFIX=/usr make uninstall first.

nsbgn commented 11 months ago

You might need to do something like PREFIX=/usr make uninstall first.

Oops. Was clearly not awake. Thanks, that was it.

amarakon2 commented 11 months ago

BTW, I'm @amarakon but on a new account because I'm currently in a different country and cannot log in without two-factor authentication. Also, sorry for not responding for a long time; I just had surgery and am in the process of recovering.

I've tentatively added overloadi which is overloadt3 described above but with the first two arguments transposed. I've also added an alias called lettermod which allows the user to specify an idle and hold timeout more easily in one place (see the man page for details). Feedback is welcome.

Thanks, it works very well!

What would the syntax be if I wanted to do this with a combo? Let's say I wanted a combo of two letters immediately resolve as a tap if a character has been typed 100 milliseconds prior.

Can you clarify what you mean by this? If by 'combo' you mean a chord, then this should be possible to achieve by just mapping the chord to the overload action, though I'm not sure how useful it would be.

I mean a chord that would register as two individual keypresses if the user tapped a key in the previous n milliseconds. I tried the following code but it didn't work.

j+k = overloadi(j+k, escape, 200)

nsbgn commented 9 months ago

Just wanted to mention that I've had the lettermod action on my layout since December. Haven't had any misfirings because of it and it makes the visual delay much less obnoxious. I still don't think I will personally keep it forever, because the delay is of course sometimes still present, but I am now convinced that it is a useful feature.

Given that, it feels natural to also allow it on chords as @amarakon/@amarakon2 suggested above. Should make those feel snappier too. On the other hand, it's not a straightforward extension --- it would be rather subtle and require new syntax. If there's genuine demand for it, it's probably worth a separate issue.