Questions regarding keyboard-config: `modifiers`

dariogoetz / keyboard_layout_optimizer

A keyboard layout optimizer supporting multiple layers. Implemented in Rust.

https://dariogoetz.github.io/keyboard_layout_optimizer/

GNU General Public License v3.0


Questions regarding keyboard-config: `modifiers` #59

Closed Glitchy-Tozier closed 6 months ago

Glitchy-Tozier commented 1 year ago

1. Is there a way to simulate a Caps-lock for a layer? Context: I'm thinking about optimizing a Georgian layout (you have seen the emails). It uses non-Latin letters but has two layers that users can "caps-lock" to, where they can find Latin letters ("abcdefg..."). Currently I'm not sure how to achieve optimization for caps-lock functionality. Could you help me with that?
2. The `modifiers` option itself:
   i. It would be great to have some kind of explanation above the setting, explaining what it does, how it is determined which layers these modifiers switch to, and what values can be used under `type`.
   ii. `value` wants me to define which `MatrixPosition`s need to be pressed down to activate that modifier, correct? It would be helpful to name that option something indicating what it does. I'm not sure what naming to suggest, however. The best idea I have is, again, `matrix_positions`, but I'm pretty sure we can think of something even more intuitive.
```
modifiers:
  - Left:
      type: hold
      value: [[0,3]]
    Right:
      type: hold
      value: [[18,3]]
  - Left:
      type: hold
      value: [[0,2]]
    Right:
      type: hold
      value: [[18,2]]
  - Left:
      type: hold
      value: [[1,3]]
    Right:
      type: hold
      value: [[16,4]]
  - Left:
      type: hold
      value: [[0,3], [1,3]]
    Right:
      type: hold
      value: [[18,3], [16,4]]
  - Left:
      type: hold
      value: [[0,2], [1,3]]
    Right:
      type: hold
      value: [[18,2], [16,4]]
```

dariogoetz commented 1 year ago

ad 1. I don't believe that there is a good way to simulate caps-lock, because we only consider statistical data for uni-, bi-, and tri-grams. This means that we consider at most three consecutive symbols to type. In particular, hitting caps-lock and then typing four symbols within the caps-lock layer can not be mapped by our approach.

ad 2.

i. That is a good idea, yes. The supported values for `type` are `hold` and `one-shot`. The `modifiers` field is a list that corresponds to the lists in the `base_layout` without the first element: the first element in `modifiers` corresponds to the second element of an item in the `base_layout` (e.g. for `["a", "A", "{", "⇣", "α", "∀"]` this is "A"), and the second element in `modifiers` corresponds to "{".

ii. `value` can take either a `MatrixPosition` or a character, e.g. "⇗". Both are possible (note that the "character" case can lead to the modifier moving during an optimization if the corresponding key is not fixed).
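The index relationship described in (i) can be sketched in a few lines. This is only an illustration: the helper names `modifier_index_for_layer` and `symbol_and_modifier` are made up for this example and are not part of the optimizer's code.

```rust
/// Hypothetical helper: position in the `modifiers` list that activates
/// the given layer. Layer 0 is the base layer and needs no modifier.
fn modifier_index_for_layer(layer: usize) -> Option<usize> {
    layer.checked_sub(1)
}

/// Hypothetical helper: look up the symbol on `layer` of one `base_layout`
/// entry, together with the index of the `modifiers` entry that reaches it.
fn symbol_and_modifier<'a>(
    key: &[&'a str],
    layer: usize,
) -> Option<(&'a str, Option<usize>)> {
    key.get(layer)
        .map(|symbol| (*symbol, modifier_index_for_layer(layer)))
}
```

For the entry `["a", "A", "{", "⇣", "α", "∀"]`, layer 1 ("A") maps to the first `modifiers` entry and layer 2 ("{") to the second, while layer 0 ("a") maps to no modifier at all.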

Glitchy-Tozier commented 1 year ago

ad 1.

I agree that there's no perfect way to simulate this, but i have the following suggestion to emulate it as closely as possible: For caps-lock modifiers, it can be assumed that they're not getting pressed as frequently as regular modifiers. They're mostly used if users regularly type a lot of text on the same layer.

Caps-lock modifiers (type: lock?) would come into play when expanding the ngrams. Example: Regular Modifier (I know it's more complicated than shown here.):

aBc → a, shift+b, c
aBC → a, shift+b, shift+c
ABc → shift+a, shift+b, c
ABC → shift+a, shift+b, shift+c

Caps-lock (=lock): We only add modifier activations where we know for certain that the activation took place:

aBc → a, lock, b, lock, c
aBC → a, lock, b, c
ABc → a, b, lock, c
ABC → a, b, c

I think this is a reasonable n-gram expansion when assuming that usually at least a few words would get typed on the same layer.
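A toy sketch of the proposed `lock` expansion, assuming (purely for illustration) that ASCII uppercase letters live on the locked layer and lowercase letters on the base layer. The `Key` enum and `expand_lock` are hypothetical names, not optimizer code, and ngram weights are ignored.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Key {
    /// A key on the base layer.
    Base(char),
    /// One press of the (hypothetical) lock modifier.
    Lock,
}

/// Expand an ngram under the proposed `lock` semantics: a lock press is
/// only inserted where the layer provably changes inside the ngram.
/// Nothing is added before the first or after the last symbol, because
/// the ngram alone does not tell us which layer was active around it.
fn expand_lock(ngram: &str) -> Vec<Key> {
    let mut keys = Vec::new();
    let mut previous_locked: Option<bool> = None;
    for c in ngram.chars() {
        let locked = c.is_ascii_uppercase();
        if let Some(prev) = previous_locked {
            if prev != locked {
                keys.push(Key::Lock);
            }
        }
        keys.push(Key::Base(c.to_ascii_lowercase()));
        previous_locked = Some(locked);
    }
    keys
}
```

With this sketch, `expand_lock("aBC")` yields `a, lock, b, c`, and `expand_lock("ABC")` yields just `a, b, c`, matching the table above.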

ad 2.

Where's the difference between hold and one-shot?
Ah, being able to specify both MatrixPositions and chars makes the naming more difficult.

dariogoetz commented 1 year ago

That approach seems reasonable.

The difference between hold and one-shot is similar to what you described for caps-lock only that the modifier only holds for the next key, e.g.

aBc → a, mod, b, c
aBC → a, mod, b, mod, c
ABc → mod, a, mod, b, c
ABC → mod, a, mod, b, mod, c

Glitchy-Tozier commented 1 year ago

> The difference between hold and one-shot is similar to what you described for caps-lock only that the modifier only holds for the next key, e.g.
>
> aBc → a, mod, b, c
> aBC → a, mod, b, mod, c
> ABc → mod, a, mod, b, c
> ABC → mod, a, mod, b, mod, c

That's the one-shot? How does hold expand these trigrams?

> That approach seems reasonable.

So would you be open to implementing this functionality into the optimizer?

dariogoetz commented 1 year ago

I was not very precise there. Trigrams will be mapped into trigrams, so it would rather be

[a, B, c] → [a, mod, b], [mod, b, c]
[a, B, C] → [a, mod, b], [mod, b, mod], [b, mod, c] 
[A, B, c] → [mod, a, mod], [a, mod, b], [mod, b, c]
[A, B, C] → [mod, a, mod], [a, mod, b], [mod, b, mod], [b, mod, c]
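One way to reproduce the one-shot mapping above is to first expand the symbols into a flat key sequence (one `mod` press before every modified symbol) and then slide a window of three over it. The sketch below does that for a toy setup where ASCII uppercase means "needs the one-shot modifier"; `Key`, `one_shot_keys` and `one_shot_trigrams` are illustrative names, and weights are left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Key {
    Base(char),
    /// One press of the (hypothetical) one-shot modifier.
    Mod,
}

/// Flat key sequence: a one-shot modifier is pressed before each
/// modified symbol and applies to that symbol only.
fn one_shot_keys(symbols: &[char]) -> Vec<Key> {
    let mut keys = Vec::new();
    for &c in symbols {
        if c.is_ascii_uppercase() {
            keys.push(Key::Mod);
        }
        keys.push(Key::Base(c.to_ascii_lowercase()));
    }
    keys
}

/// All trigrams of consecutive key presses in the expanded sequence.
fn one_shot_trigrams(symbols: &[char]) -> Vec<[Key; 3]> {
    one_shot_keys(symbols)
        .windows(3)
        .map(|w| [w[0], w[1], w[2]])
        .collect()
}
```

For `['a', 'B', 'C']` the key sequence is `a, mod, b, mod, c`, which gives the three trigrams listed in the second row above.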

The situation for hold modifiers is much more complicated (especially for trigrams) and also depends on whether the modifiers for A, B, and C are the same or different halves of the keyboard (e.g. if the shift is available on the left and the right halves of the keyboard and a, b, and c are on different halves). Generally, the difference between hold and one-shot modifiers is, that hold modifiers are also considered "active" (held) after the key that they modify and therefore correlate with the next symbol, e.g. for bigrams

[A, b] -> [mod, a], [a, b] and additionally [mod, b]
[a, B] -> [a, mod], [mod, b] and additionally [a, b]

In contrast to that, the one-shot modifier just adds a key in between.
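The two hold-modifier bigram cases above can be written down as a small lookup. This is a deliberately narrow sketch: it assumes a single modifier, ignores weights and keyboard halves, and returns `None` for the case where both symbols are modified, since that case is not spelled out in this thread. `Key` and `hold_bigrams` are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Key {
    Base(char),
    /// The (hypothetical) hold modifier.
    Mod,
}

/// Key bigrams produced by one symbol bigram under a single hold modifier.
/// ASCII uppercase stands in for "typed with the modifier held".
fn hold_bigrams(first: char, second: char) -> Option<Vec<(Key, Key)>> {
    let a = Key::Base(first.to_ascii_lowercase());
    let b = Key::Base(second.to_ascii_lowercase());
    match (first.is_ascii_uppercase(), second.is_ascii_uppercase()) {
        // No modifier involved: the bigram is kept as-is.
        (false, false) => Some(vec![(a, b)]),
        // [A, b]: the modifier also correlates with the following key.
        (true, false) => Some(vec![(Key::Mod, a), (a, b), (Key::Mod, b)]),
        // [a, B]: the modifier sits between the keys, plus the direct pair.
        (false, true) => Some(vec![(a, Key::Mod), (Key::Mod, b), (a, b)]),
        // Both modified: not covered by the examples in this thread.
        (true, true) => None,
    }
}
```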

> So would you be open to implementing this functionality into the optimizer?

I am generally open, yes. But I would need to find the time to do so.

Glitchy-Tozier commented 1 year ago

hold & one-shot

Got it, I think!

> I am generally open, yes. But I would need to find the time to do so.

I may be able to do it myself... Could you give me a rundown of

1. how difficult implementation will be,
2. where to do it,
3. and roughly how it works.

Glitchy-Tozier commented 1 year ago

Another issue I'm facing is that this "locked" layer should be able to get modified.

Example: We want to add the Greek alphabet to a German layout, both lowercase and UPPERCASE letters. To do so we add a lock-modifier to switch to the Greek lowercase layer. Then, we ALSO want another mod (Shift) to switch to the UPPERCASE Greek letters, but only do so when we're on the greek locked layer.

Is this possible to achieve by giving the shift-key a certain symbol that's only available on Greek lowercase layer, then making that symbol a hold modifier?

dariogoetz commented 1 year ago

I am not sure if that is easily possible. Regarding your questions, I think that

1. the devil will lie in the details (the edge cases),
2. you would probably need to add another enum variant `Lock` to the `LayerModifiers` in `layer.rs` and then update the functions `process_one_shot_layers` in the `BigramMapper` and `TrigramMapper`,
3. where the one-shot modifiers always add symbols before each actual symbol, the lock modifiers will probably work "between" two symbols and then potentially switch off a layer and potentially switch on another one ("between", because the modifiers need to be pressed upon layer changes).

The general functionality should not be too hard to implement. I see an issue in how whitespace shall be addressed. Whitespace can live in the base layer (no modifiers) and still should not break the layer lock (e.g. `A B` will not require hitting the modifiers for the locked layer, whereas `AcB` will). Even more, it is unclear what should happen if the whitespace is at the edge of an ngram.

Consider the text `AB ab` vs. `AB AB`: Both begin with the trigram `AB ` (A, B, whitespace). In the one case, the layer lock should be removed after the whitespace, in the other not. With the trigram alone (without the context of the next symbol), it is impossible to tell which it would be.

Finally, I have no idea if your other requirement of activating additional layers on top of the locked layer can be easily implemented.

Glitchy-Tozier commented 1 year ago

> The general functionality should not be too hard to implement. I see an issue in how whitespace shall be addressed. Whitespace can live in the base layer (no modifiers) and still should not break the layer lock (e.g. `A B` will not require hitting the modifiers for the locked layer, whereas `AcB` will). Even more, it is unclear what should happen if the whitespace is at the edge of an ngram.

Isn't this also an issue with the hold modifier?

How does it deal with "a A"? a shift-down space a shift-up? a space shift-down a shift-up? Both, with half the frequency?

How about "A A"? Is shift held or dropped?

> Finally, I have no idea if your other requirement of activating additional layers on top of the locked layer can be easily implemented.

Got it. It's not too important at the moment. For now, I'll focus on the pure lock-modifier feature.

dariogoetz commented 1 year ago

This is generally also an issue with the hold modifier when going from one layer via a whitespace to the same layer while holding the modifier, yes (as in your example A A). However, I would argue that this pattern is quite rare in the usual usage pattern of hold modifiers. For a "locked" layer, however, this may appear more often. In particular, as in your use case, when completely different alphabets are involved. I would assume that a typical use case is to write a complete sentence in the secondary alphabet (which is on the lock layer). Then, I would not like to be "thrown" out of the layer by each whitespace symbol.

The case a A expands to something like a [space] a, a [space] [shift], [space] [shift] a (I might have missed variants...)

Glitchy-Tozier commented 1 year ago

> For a "locked" layer, however, this may appear more often. In particular, as in your use case, when completely different alphabets are involved. I would assume that a typical use case is to write a complete sentence in the secondary alphabet (which is on the lock layer). Then, I would not like to be "thrown" out of the layer by each whitespace symbol.

Yep, you perfectly captured my thoughts.

Regarding the handling of Layers, I think a α should expand into something like:

a [space] a

a [space] [lock]
[space] [lock] a

a [lock] [space]
[lock] [space] a

Basically, both cases are covered: Space→Lock and Lock→Space.

When staying on the same layer, if there's no explicit switch, there won't be any modifiers added. (Due to the assumption that users will type at least a few words or sentences on that layer.) Thus, α α expands to:

a [space] a

No lock is added before or after that trigram because – for all we know – it is likely that more greek letters will follow.

The main challenge I can see is interaction between layers. What do we do if we want to type symbols that are neither on the base layer nor on the locked layer? For example `α}α`. Does this expand into `a [(un)lock] [Mod3] [some letter] [lock] a`? It would be much more comfortable to have a Mod3 on the locked layer as well, resulting in `a [Mod3] [some letter] a` ... but that's where things get complicated.

dariogoetz commented 1 year ago

I see the situation with the whitespace in the second position as not so problematic as you (I would not consider both space-lock and lock-space, though).

What is unclear is, when the whitespace is at the first or third position in the Trigram. Then you don't know what symbol lies "beyond".

And regarding other layers in between the locked one, things get really messy, yes. I'm not sure there is a good solution for that.

Glitchy-Tozier commented 1 year ago

> I see the situation with the whitespace in the second position as not so problematic as you (I would not consider both space-lock and lock-space, though).

I think I used the wrong example. What bothers me (slightly) more is α a. Same question: Where do we un-lock the layer; before or after the space? Either way it isn't a big deal I think, we just have to decide on one option. Personally, I prefer making the optimizer go for base-layer-spaces in these edge-cases.

> What is unclear is, when the whitespace is at the first or third position in the Trigram. Then you don't know what symbol lies "beyond".

I think in this case, as when typing three locked characters, we just assume that the layer will persist. I understand the point that this makes optimization slightly worse for rapidly layer-switching texts. However, in the regular lock use case (where layers don't get changed for at least a few sentences), I think we should focus on intra-layer optimization instead of inter-layer optimization.

> And regarding other layers in between the locked one, things get really messy, yes. I'm not sure there is a good solution for that.

I'll first try to understand how that part of the code even works and focus on implementing basic functionality, then think about how this may or may not be possible.

Glitchy-Tozier commented 1 year ago

Hey, a few random things:

1.

In this function, why is the pushing of the base-layer key inside the if-statement?

```rust
fn process_one_shot_layers(
    &self,
    unigrams: UnigramIndicesVec,
    layout: &Layout,
) -> UnigramIndicesVec {
    let mut processed_unigrams = Vec::with_capacity(unigrams.len());

    unigrams.into_iter().for_each(|(k, w)| {
        let (base, mods) = layout.resolve_modifiers(&k);
        if let LayerModifiers::OneShot(mods) = mods {
            processed_unigrams.extend(mods.iter().map(|m| (*m, w)));
            processed_unigrams.push((base, w));
        } else {
            processed_unigrams.push((k, w));
        }
    });

    processed_unigrams
}
```

Wouldn't this do the exact same thing, but shorter?

```rust
fn process_one_shot_layers(
    &self,
    unigrams: UnigramIndicesVec,
    layout: &Layout,
) -> UnigramIndicesVec {
    let mut processed_unigrams = Vec::with_capacity(unigrams.len());

    unigrams.into_iter().for_each(|(k, w)| {
        let (base, mods) = layout.resolve_modifiers(&k);
        if let LayerModifiers::OneShot(mods) = mods {
            processed_unigrams.extend(mods.iter().map(|m| (*m, w)));
        }
        processed_unigrams.push((base, w));
    });

    processed_unigrams
}
```

The same question goes for the functions in bigram_mapper.rs and trigram_mapper.rs.

2.

Doesn't `process_one_shot_layers` slow down optimization? I noticed that

1. `let mut processed_unigrams = Vec::with_capacity(unigrams.len());` is not the right size. Most of the time, something like "unigrams.len()*1.5" would be the only way to prevent spontaneous vector-growth.
2. I might be missing something, but why are all the new ngrams pushed into the new `processed_{n}grams`-vector? This results in many duplicate ngrams. Why not immediately use a `HashMap`? I haven't tested this, but it sounds like this would be faster.

3.

> 2. you would probably need to add another enum variant `Lock` to the `LayerModifiers` in `layer.rs` and then update the functions `process_one_shot_layers` in the `BigramMapper` and `TrigramMapper`.

I'm a bit confused. At first glance it seems like `process_one_shot_layers()` does, well, what its name says. However, could it be that `split_bigram_modifiers()` would be more appropriately named `process_hold_layers()`? If that's the case, I'm thinking of adding a `process_lock_layers()`-function as the first of those functions to be called.

dariogoetz commented 1 year ago

> Hey, a few random things:
>
> 1.
>
> In this function, why is the pushing of the base-layer key inside the if-statement?
>
> ```rust
> fn process_one_shot_layers(
>     &self,
>     unigrams: UnigramIndicesVec,
>     layout: &Layout,
> ) -> UnigramIndicesVec {
>     let mut processed_unigrams = Vec::with_capacity(unigrams.len());
>
>     unigrams.into_iter().for_each(|(k, w)| {
>         let (base, mods) = layout.resolve_modifiers(&k);
>         if let LayerModifiers::OneShot(mods) = mods {
>             processed_unigrams.extend(mods.iter().map(|m| (*m, w)));
>             processed_unigrams.push((base, w));
>         } else {
>             processed_unigrams.push((k, w));
>         }
>     });
>
>     processed_unigrams
> }
> ```
>
> Wouldn't this do the exact same thing, but shorter?
>
> ```rust
> fn process_one_shot_layers(
>     &self,
>     unigrams: UnigramIndicesVec,
>     layout: &Layout,
> ) -> UnigramIndicesVec {
>     let mut processed_unigrams = Vec::with_capacity(unigrams.len());
>
>     unigrams.into_iter().for_each(|(k, w)| {
>         let (base, mods) = layout.resolve_modifiers(&k);
>         if let LayerModifiers::OneShot(mods) = mods {
>             processed_unigrams.extend(mods.iter().map(|m| (*m, w)));
>         }
>         processed_unigrams.push((base, w));
>     });
>
>     processed_unigrams
> }
> ```
>
> The same question goes for the functions in bigram_mapper.rs and trigram_mapper.rs.

The LayerKeys `base` and `k` are different: one is on the base layer, the other is (potentially) on some higher layer. The hold layer is accounted for at a later stage, so we do not want to replace the LayerKey with its underlying base-layer key yet.

2.

> Doesn't process_one_shot_layers slow down optimization?

It does. This is the reason for the `has_one_shot_layer` check. The slowdown should only be incurred if there are in fact `OneShot` modifiers.

> I noticed that
>
> 1. `let mut processed_unigrams = Vec::with_capacity(unigrams.len());` is not the right size. Most of the time, something like "unigrams.len()*1.5" would be the only way to prevent spontaneous vector-growth.

Please try it out and see, if it actually performs better. At some point in time, I tried something like this and found it not faster (even slower, if I recall correctly).

> 2. I might be missing something, but why are all the new ngrams pushed into the new `processed_{n}grams`-vector? This results in many duplicate ngrams. Why not immediately use a `HashMap`? I haven't tested this, but it sounds like this would be faster.

Here, again: I did lots of experiments with these parts (because they lie on the hot path) and found this to be the fastest version I could find (without sacrificing too much code-simplicity). Try using a HashMap and see for yourself. (You can use the integrated benchmark or run maybe a hundred random evaluations, there is a binary for that).

I think that the main reason for the HashMap being slower is due to the hashing being relatively costly with respect to what is happening otherwise. If you have better insight (or even a better implementation), I would be glad to learn :)
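The two strategies being compared can be sketched side by side. This is not the optimizer's code: `Ngram`, `collect_into_vec` and `collect_into_map` are stand-in names, and the sketch makes no claim about which variant is faster. As suggested above, that has to be measured with the benchmark.

```rust
use std::collections::HashMap;

/// Stand-in for a bigram of key indices (assumption for this sketch).
type Ngram = (u16, u16);

/// Variant 1: push every ngram, duplicates included. No hashing is done;
/// a later stage has to cope with repeated entries.
fn collect_into_vec(input: &[(Ngram, f64)]) -> Vec<(Ngram, f64)> {
    let mut out = Vec::with_capacity(input.len());
    for &(ngram, weight) in input {
        out.push((ngram, weight));
    }
    out
}

/// Variant 2: merge duplicates immediately by summing their weights.
/// Fewer entries, but every insertion pays for hashing the key.
fn collect_into_map(input: &[(Ngram, f64)]) -> HashMap<Ngram, f64> {
    let mut out = HashMap::with_capacity(input.len());
    for &(ngram, weight) in input {
        *out.entry(ngram).or_insert(0.0) += weight;
    }
    out
}
```

Both variants preserve the total weight; they differ only in whether duplicates are merged early (at the cost of hashing) or carried along.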

3.

> you would probably need to add another enum variant `Lock` to the `LayerModifiers` in `layer.rs` and then update the functions `process_one_shot_layers` in the `BigramMapper` and `TrigramMapper`.
>
> I'm a bit confused. At first glance it seems like `process_one_shot_layers()` does, well, what its name says. However, could it be that `split_bigram_modifiers()` would be more appropriately named `process_hold_layers()`? If that's the case, I'm thinking of adding a `process_lock_layers()`-function as the first of those functions to be called.

Yes, you are probably right. This is an artefact from ArneBab's namings, I believe.

I have added a new branch hold_modifiers with an initial implementation of the "hold" functionality. Maybe this can support you in getting started. (It is not well-documented and not performance tuned and even not tested very much. I hope it can be of help anyway.)

Glitchy-Tozier commented 1 year ago

> The LayerKeys `base` and `k` are different: one is on the base layer, the other is (potentially) on some higher layer. The hold layer is accounted for at a later stage, so we do not want to replace the LayerKey with its underlying base-layer key yet.

Ah, that makes a lot of sense!

> Please try it out and see, if it actually performs better. At some point in time, I tried something like this and found it not faster (even slower, if I recall correctly).

I might try it out at a later point in time, but for the moment, I'll focus on this lock-modifier feature.

> > I'm a bit confused. At first glance it seems like `process_one_shot_layers()` does, well, what its name says. However, could it be that `split_bigram_modifiers()` would be more appropriately named `process_hold_layers()`? If that's the case, I'm thinking of adding a `process_lock_layers()`-function as the first of those functions to be called.
>
> Yes, you are probably right. This is an artefact from ArneBab's namings, I believe.

May I rename it to something more precisely representing its function?

> I have added a new branch hold_modifiers with an initial implementation of the "hold" functionality. Maybe this can support you in getting started. (It is not well-documented and not performance tuned and even not tested very much. I hope it can be of help anyway.)

Thank you, I'll start working on this thing sometime the next few weeks. I've got a few ideas already.

Glitchy-Tozier commented 1 year ago

One thing I'm pretty sure about is that `one_shot` is fundamentally different from `lock` and thus shouldn't be placed in the same old one-shot-functions.

Glitchy-Tozier commented 1 year ago

@dariogoetz Why does process_one_shot_layers use this...

```rust
keys.iter().zip(keys.iter().skip(1)).for_each(|(lk1, lk2)| {
    processed_bigrams.push(((*lk1, *lk2), w));
});
```

... at the end of the function, but the function split_bigram_modifiers uses these (among other things)?

```rust
TakeTwoLayerKey::new(base1, &mods1, w, self.split_modifiers.same_key_mod_factor)
    .for_each(|(e, w)| {
        bigram_w_map.insert_or_add_weight(e, w);
        // log::trace!("{:>3}{:<3} -> {:>3}{:<3}", layout.get_layerkey(&k1).symbol, layout.get_layerkey(&k2).symbol, layout.get_layerkey(&e.0).symbol, layout.get_layerkey(&e.1).symbol);
    });
```

To be honest, I still don't fully understand what TakeTwoLayerKey does, except create an iterator of some kind.

dariogoetz commented 1 year ago

First off, you are right in that TakeTwoLayerKey creates an iterator. It is an iterator that takes all keys required to generate a specific symbol on a hold-modifier layer (so the base LayerKey and all hold-modifier LayerKeys that are required to generate it) and iterates over all combinations of pairs that are relevant when using a hold-modifier. This is different from all pairs out of that set since some combinations are not valid for hold-modifiers, e.g. the modifiers are always pressed before the base key.

As an example consider the situation, where ∃ is generated by hitting the e key together with the mod3 and mod4 modifiers (like in the neo2 layout). Then the TakeTwoLayerKey iterator should generate something like the following (omitting some weighting that may go on at the same time):

∃ -> (mod3, e), (mod4, e), (mod3, mod4), (mod4, mod3)

In particular, it does not generate (e, mod3) and (e, mod4), otherwise one could probably use some off-the-shelf iterator.

Now, the one-shot-modifiers are different from hold-modifiers in that multiple modifiers have a clear order. They are expected to be hit one after the other, whereas multiple hold-modifiers do not have an order; they are expected to be pressed simultaneously (at least, the order is irrelevant). This is why the one-shot modifier case is much simpler.

Finally the reason for having this as such a "complicated" iterator instead of something similar is for performance reasons.
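To make the pair generation concrete, here is a plain (and certainly slower) sketch that produces the same pairs as the ∃ example above. It is not the actual `TakeTwoLayerKey` implementation: `hold_pairs` is a hypothetical function, keys are plain strings instead of LayerKey indices, and the weighting is omitted entirely.

```rust
/// Pairs relevant for a symbol on a hold-modifier layer:
/// every (modifier, base) pair, plus every ordered pair of two distinct
/// modifiers. Pairs starting with the base key are never generated,
/// because modifiers are pressed before the base key.
fn hold_pairs<'a>(base: &'a str, mods: &[&'a str]) -> Vec<(&'a str, &'a str)> {
    let mut pairs = Vec::new();
    for &m in mods {
        pairs.push((m, base));
    }
    for (i, &m1) in mods.iter().enumerate() {
        for (j, &m2) in mods.iter().enumerate() {
            // Hold modifiers have no fixed order, so both directions count.
            if i != j {
                pairs.push((m1, m2));
            }
        }
    }
    pairs
}
```

Calling `hold_pairs("e", &["mod3", "mod4"])` gives `(mod3, e), (mod4, e), (mod3, mod4), (mod4, mod3)`, and neither `(e, mod3)` nor `(e, mod4)`.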

Glitchy-Tozier commented 1 year ago

I see, thank you! So if I assume that lock modifiers may consist of multiple keys (for example: Neo shift+shift to caps-lock), I should use TakeTwoLayerKey as well?

Just a heads-up: My first shot at creating `process_lock_layers(…)`s. There's still a lot to do (implementing it in `TrigramMapper`, using `TakeTwoLayerKey`, etc.) and A LOT to test, but the fundamental logic should be there.

UnigramMapper:

```rust
/// Process layers accessible via `lock` modifiers.
/// Since we assume users typically stay on these layers for many words/sentences, and since unigrams
/// contain no information about certain `lock`-layer-switches, this function transforms
/// the `lock`-layer-keys to base-layer-keys.
fn process_lock_layers(
    &self,
    unigrams: UnigramIndicesVec,
    layout: &Layout,
) -> UnigramIndicesVec {
    println!("len before `process_lock_layers`: {}", unigrams.len());
    let mut processed_unigrams = Vec::with_capacity(unigrams.len());

    unigrams.into_iter().for_each(|(k, w)| {
        let lk = layout.get_layerkey(&k);

        if lk.modifiers.layer_modifier_type().is_lock() {
            let base = layout.get_base_layerkey_index(&k);
            processed_unigrams.push((base, w));
        } else {
            processed_unigrams.push((k, w));
        }
    });
    println!(
        "len after `process_lock_layers`:  {}",
        processed_unigrams.len()
    );

    processed_unigrams
}
```

BigramMapper:

```rust
fn process_lock_layers(&self, bigrams: BigramIndicesVec, layout: &Layout) -> BigramIndicesVec {
    let mut processed_bigrams = Vec::with_capacity(bigrams.len());

    bigrams.into_iter().for_each(|((k1, k2), w)| {
        let lk1 = layout.get_layerkey(&k1);
        let lk2 = layout.get_layerkey(&k2);

        if !lk1.modifiers.layer_modifier_type().is_lock()
            && !lk2.modifiers.layer_modifier_type().is_lock()
        {
            // Neither key is on a lock layer: keep the bigram unchanged.
            processed_bigrams.push(((k1, k2), w));
        } else {
            let base1 = layout.get_base_layerkey_index(&k1);
            let base2 = layout.get_base_layerkey_index(&k2);
            let mods1 = lk1.modifiers.clone();
            let mods2 = lk2.modifiers.clone();
            let found_whitespace = lk1.symbol.is_whitespace() || lk2.symbol.is_whitespace();

            let mut keys = Vec::new();

            match (
                mods1.layer_modifier_type().is_lock(),
                mods2.layer_modifier_type().is_lock(),
            ) {
                (true, true) => keys.extend([base1, base2]),
                (true, false) => {
                    keys.push(base1);
                    if !found_whitespace {
                        keys.extend(mods1.layerkeys()); // un-lock layer
                    }
                    // The second key is the non-lock one, so it is kept as `k2`
                    // (the original draft pushed `k1` here).
                    keys.push(k2);
                }
                (false, true) => {
                    keys.push(k1);
                    if !found_whitespace {
                        keys.extend(mods2.layerkeys()) // lock layer
                    }
                    keys.push(base2);
                }
                (false, false) => {
                    println!("BigramMapper Error: None of the Mods is `lock`!")
                }
            }

            // Turn the key sequence into bigrams of consecutive keys.
            keys.iter().zip(keys.iter().skip(1)).for_each(|(lk1, lk2)| {
                processed_bigrams.push(((*lk1, *lk2), w));
            });
        }
    });

    processed_bigrams
}
```

dariogoetz commented 1 year ago

Looks like you are on a good track, generally. Some comments:

1. I believe in the `UnigramMapper` case, the addition of the modifiers is missing.
2. Note that locking/unlocking can only be skipped if both layers are identical. There may be different lock layers, each activated by its own `Lock` modifier; in that case, you would need to un-/lock the layers.

Glitchy-Tozier commented 1 year ago

> I believe in the `UnigramMapper` case, the addition of the modifiers is missing.

That was intentional. The doc-comment of that function explains our reasoning:

```rust
/// Process layers accessible via `lock` modifiers.
/// Since we assume users typically stay on these layers for many words/sentences, and since unigrams
/// contain no information about certain `lock`-layer-switches, this function transforms
/// the `lock`-layer-keys to base-layer-keys.
```

> Note that locking/unlocking is only then not required, if both layers are identical. There may be different lock-layers; both being activated by Lock modifiers, but for different layers -> here, you would need to un-/lock the layers.

Ohhhh, that's a very important note! I guess the decision tree will get slightly more complex.
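A rough sketch of how that more complex decision could look once layers are compared by identity rather than by "is lock / is not lock". Everything here is hypothetical: lock layers are represented as `Option<u8>` (with `None` meaning "not on a lock layer"), and `Transition` and `lock_transition` are illustrative names only.

```rust
#[derive(Debug, PartialEq)]
enum Transition {
    /// Same layer on both sides: no modifier keys needed.
    Stay,
    /// Going from a non-lock layer onto lock layer `n`.
    Lock(u8),
    /// Leaving lock layer `n` for a non-lock layer.
    Unlock(u8),
    /// Going from one lock layer directly to a different one.
    Switch { from: u8, to: u8 },
}

/// Decide which lock action lies between two consecutive symbols,
/// given the lock layer (if any) that each symbol lives on.
fn lock_transition(first: Option<u8>, second: Option<u8>) -> Transition {
    match (first, second) {
        (None, None) => Transition::Stay,
        (Some(a), Some(b)) if a == b => Transition::Stay,
        (None, Some(b)) => Transition::Lock(b),
        (Some(a), None) => Transition::Unlock(a),
        (Some(a), Some(b)) => Transition::Switch { from: a, to: b },
    }
}
```

Compared to the boolean `match` in the draft above, the extra `Switch` arm is the case pointed out here: both keys sit on lock layers, but on different ones, so modifier keys are still required.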