Closed Glitchy-Tozier closed 2 years ago
One last idea I have is the following: For the characters that are part of `fixed_keys:` or part of `fixed_layers: [2, 3]`, we only map the unfixed chars in `.map_unigrams` etc. and finally add them to the fixed ones. The fixed chars & fixed layers include the following characters:
…_[]^!<>=&ſ\/{}*?()-:@#$|~`+%"';1234567890¡¿:−ªº№·£¤\t\n/*−°§ℓ»«$€„“”₁₂₃♀♂⚥ϰ⟨⟩₀‑¬∨∧⊥∡∥→∞∝ ∅
When using regex and searching for trigrams which consist only of those characters

```
[…_\[\]\^!<>=&ſ\\/\{\}\*\?\(\)-:@#\$\|~`\+%"';1234567890¡¿:−ªº№·£¤\n\t/\*−°§ℓ»«\$€„“”₁₂₃♀♂⚥ϰ⟨⟩₀‑¬∨∧⊥∡∥→∞∝ ∅]{3}\n
```

we find that about 16_000 of the 290_000 trigrams contain only fixed chars. Thus, this idea might boost performance by an additional ~5%. That being said, I'm not sure how to implement this concept, so I'll leave it be.
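The counting described above can be sketched as follows (a hedged sketch: the fixed-character set is abbreviated here, and the real trigram list comes from the corpus files):

```rust
use std::collections::HashSet;

fn main() {
    // Abbreviated stand-in for the full fixed-character set quoted above.
    let fixed_chars: HashSet<char> = "1234567890!?()[]{}".chars().collect();

    // Stand-in trigram list; the real list has ~290_000 entries.
    let trigrams = vec!["123", "ab1", "()!", "abc"];

    // Count trigrams consisting only of fixed characters,
    // mirroring the `[…]{3}` regex search described above.
    let fixed_only = trigrams
        .iter()
        .filter(|t| t.chars().all(|c| fixed_chars.contains(&c)))
        .count();

    assert_eq!(fixed_only, 2); // "123" and "()!"
}
```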
I have mixed feelings about this pull request.

On one hand, I see that using a `HashMap` instead of the `Vec` for the `*GramIndices` saves some conversions. But I am a little uncomfortable with then having both structures there. Is it possible to get rid of the `*GramIndicesVec` version and only convert them into a `Vec` in the last step, when the values are being placed into the `MappedNgrams`? Then we would only keep the `HashMap` version and thus save the conversions.

Also, I see that there are now multiple instances where an `AHashMap` is used in favor of the `FxHashMap`. I would prefer being consistent here and using `AHashMap` everywhere (if performance does not degrade too much) while completely getting rid of the `FxHashMap`. This then also removes the `rustc_hash` dependency. Did you check what the performance impact would be using only `AHashMap`s?

Finally, I see your reasoning for implementing the `PartialEq` manually. However, I do not like having the `index` inside the `LayerKey`. I understand that it is only meant to be a "unique" value. But using the `LayerKeyIndex` is very misleading as it encodes information "from the outside". How much time does using the "automatic" implementation of `PartialEq` actually cost?
Hi :)

Your commit: The `/ 3` in

```rust
let mut bigram_w_map = AHashMap::with_capacity(bigrams.len() / 3);
```

was intentional. Due to the fact that we do not fill the hashmap with any duplicates (and due to the fact that we only fill it with Layer1-letters), it's usually smaller than 1/3rd or even 1/4th or 1/5th of `bigrams.len()`. Thus I added the `/ 3` because it seems to be the most efficient thing to do.

I'd prefer keeping the `/ 3`, but if you know of scenarios I forgot, I'm happy to remove it.
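The pre-sizing being discussed can be illustrated with a plain `std::collections::HashMap` (a sketch only; the real code uses `AHashMap` and fills the map with Layer1 bigrams):

```rust
use std::collections::HashMap;

fn main() {
    // Stand-in bigram list; duplicates mean the deduplicated
    // map ends up much smaller than the input.
    let bigrams: Vec<(&str, f64)> = vec![
        ("ab", 1.0), ("ab", 2.0), ("cd", 1.0),
        ("ab", 0.5), ("cd", 0.5), ("ef", 1.0),
    ];

    // Pre-size to a third of the input, as in the discussed line:
    // AHashMap::with_capacity(bigrams.len() / 3)
    let mut bigram_w_map: HashMap<&str, f64> = HashMap::with_capacity(bigrams.len() / 3);
    for (bigram, weight) in &bigrams {
        *bigram_w_map.entry(*bigram).or_insert(0.0) += *weight;
    }

    // Far fewer entries than bigrams.len(), so the smaller capacity suffices.
    assert_eq!(bigram_w_map.len(), 3);
}
```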
> On one hand, I see that using a `HashMap` instead of the `Vec` for the `*GramIndices` saves some conversions. But I am a little uncomfortable with then having both structures there. Is it possible to get rid of the `*GramIndicesVec` version and only convert them into a `Vec` in the last step, when the values are being placed into the `MappedNgrams`? Then we would only keep the `HashMap` version and thus save the conversions.
I share your feeling that this is ugly. In my tests this actually had a performance impact, and was the main thing that saved time. (The more correct version of that sentence is: "Changing this into a HashMap slowed everything back down, nullifying the other improvements".) This might be due to one of two reasons:

1. Inserting items into a HashMap might be slower than pushing them into a Vec (which would make sense, as additional hashing-computations are performed). (This happens repeatedly in `fn map_unigrams(`, `fn map_bigrams(`, and `fn map_trigrams(`.)
2. Iterating through a HashMap might be slower (no idea how reasonable this idea is).

That being said:

1. my tests might have been imperfect
2. we could just remove the type but still use a Vec to return the values from that one function.
> Also I see that there are now multiple instances where an `AHashMap` is used in favor of the `FxHashMap`. I would prefer being consistent here and using `AHashMap` everywhere (if performance does not degrade too much) while completely getting rid of the `FxHashMap`. This then also removes the `rustc_hash` dependency.
I started using `AHashMap` due to their comparison to other hashers. Accordingly, I did my best to consistently use one or the other, following this rule:

- FxHash for primitive keys
- AHash for more complex keys such as Tuples or Strings.
> Did you check what the performance impact would be using only `AHashMap`s?
I did not, and it's probably negligible. We can use `AHashMap` everywhere, if you prefer. :)

If we decide on removing FxHash…:

- Let's wait until the two PRs are gone, to avoid constant merge-conflicts
- It's probably most efficient to replace the really small `FxHashSets` (such as `exclude_rows` or `initial_pause_indicators`) with Vecs

> Finally, I see your reasoning for implementing the `PartialEq` manually. However, I do not like having the `index` inside the `LayerKey`. I understand that it is only meant to be a "unique" value. But using the `LayerKeyIndex` is very misleading as it encodes information "from the outside". How much time does using the "automatic" implementation of `PartialEq` actually cost?

> How much time does using the "automatic" implementation of `PartialEq` actually cost?
Not much. I assume this is because most of the time the first fields are not equal to each other, and thus the "comparison-train" is stopped preemptively.
> But using the `LayerKeyIndex` is very misleading as it encodes information "from the outside".
I don't quite understand this part. Is it problematic because it references a "higher order concept", such as the layout? `LayerKeyIndex` = "This is my position in the Layout"?

In that case, my proposal is the following: Rename `pub index: LayerKeyIndex` into `pub id: u16`. This way, we do not imply any relation to the layout (and thus are not misleading), but are still able to do efficient comparisons. What do you think?
Actually … there's no need at all to fill either a `Vec` or a `HashMap` in `fn map_trigrams`. It's redundant. We use the value only once (to iterate over it), thus we can simply return an Iterator. This skips

1. pushing > 200_000 trigrams into a vec
2. turning that vec back into an iterable.

I've got a fun & concise idea on how to implement this, if you give me the go :)
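The iterator idea might look roughly like this (a hedged sketch with simplified stand-in types, not the actual `map_trigrams` signature):

```rust
// Collecting version: allocates an intermediate Vec that is consumed once.
fn map_trigrams_vec(trigrams: &[(u8, u8, u8)]) -> Vec<(u8, u8, u8)> {
    trigrams.iter().map(|&(a, b, c)| (c, b, a)).collect()
}

// Iterator version: no intermediate allocation; the caller iterates directly.
fn map_trigrams_iter<'a>(
    trigrams: &'a [(u8, u8, u8)],
) -> impl Iterator<Item = (u8, u8, u8)> + 'a {
    trigrams.iter().map(|&(a, b, c)| (c, b, a))
}

fn main() {
    let trigrams = vec![(1, 2, 3), (4, 5, 6)];
    let from_vec: Vec<_> = map_trigrams_vec(&trigrams);
    let from_iter: Vec<_> = map_trigrams_iter(&trigrams).collect();
    // Both produce the same mapping; only the allocation pattern differs.
    assert_eq!(from_vec, from_iter);
}
```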
> Hi :)
>
> Your commit: The `/ 3` in `let mut bigram_w_map = AHashMap::with_capacity(bigrams.len() / 3);` was intentional. Due to the fact that we do not fill the hashmap with any duplicates (and due to the fact that we only fill it with Layer1-letters), it's usually smaller than 1/3rd or even 1/4th or 1/5th of `bigrams.len()`. Thus I added the `/ 3` because it seems to be the most efficient thing to do. I'd prefer keeping the `/ 3`, but if you know of scenarios I forgot, I'm happy to remove it.
I am okay with this, if you checked that these relations are indeed as you suggest. My understanding was that a single ngram with higher-layer symbols will be "expanded" into several new ngrams involving modifiers. But I do not remember the quantities anymore.
> On one hand, I see that using a `HashMap` instead of the `Vec` for the `*GramIndices` saves some conversions. But I am a little uncomfortable with then having both structures there. Is it possible to get rid of the `*GramIndicesVec` version and only convert them into a `Vec` in the last step, when the values are being placed into the `MappedNgrams`? Then we would only keep the `HashMap` version and thus save the conversions.
>
> I share your feeling that this is ugly. In my tests this actually had a performance impact, and was the main thing that saved time. (The more correct version of that sentence is: "Changing this into a HashMap slowed everything back down, nullifying the other improvements".) This might be due to one of two reasons:
>
> 1. Inserting items into a HashMap might be slower than pushing them into a Vec (which would make sense as additional hashing-computations are performed) (which happens repeatedly in `fn map_unigrams(`, `fn map_bigrams(`, and `fn map_trigrams(`)
> 2. Iterating through a HashMap might be slower (no idea how reasonable this idea is)
>
> That being said:
>
> 1. my tests might have been imperfect
> 2. we could just remove the type but still use a Vec to return the values from that one function.

If the performance gains/losses are not orders of magnitude, I definitely prefer understandable and clean code over "little" performance differences; be it with a Vec or a HashMap.
> Also I see that there are now multiple instances where an `AHashMap` is used in favor of the `FxHashMap`. I would prefer being consistent here and using `AHashMap` everywhere (if performance does not degrade too much) while completely getting rid of the `FxHashMap`. This then also removes the `rustc_hash` dependency.
>
> I started using `AHashMap` due to their comparison to other hashers. Accordingly, I did my best to consistently use one or the other, following this rule:
>
> - FxHash for primitive keys
> - AHash for more complex keys such as Tuples or Strings.

With the same reasoning as above (prefer clean code over only little performance gains), I would suggest just using `AHashMap`/`AHashSet`.
> Did you check what the performance impact would be using only `AHashMap`s?
>
> I did not, and it's probably negligible. We can use `AHashMap` everywhere, if you prefer. :) If we decide on removing FxHash…:
>
> - Let's wait until the two PRs are gone, to avoid constant merge-conflicts
Ok.
> - It's probably most efficient to replace the really small `FxHashSets` (such as `exclude_rows` or `initial_pause_indicators`) with Vecs
I am not sure. With HashMaps we get an `O(1)` lookup. Maybe for small Vecs this is still slower... haven't checked.
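A minimal illustration of the trade-off (hypothetical simplified stand-ins for the real config fields): both containers answer the same membership question; for a handful of elements a linear scan over a `Vec` is often at least as fast in practice (better cache locality, no hashing), despite the `HashSet`'s O(1) average lookup.

```rust
use std::collections::HashSet;

fn main() {
    // HashSet lookup: O(1) on average, but every query pays hashing cost.
    let exclude_rows: HashSet<u8> = [0, 4].into_iter().collect();
    assert!(exclude_rows.contains(&4));

    // Vec lookup: O(n) linear scan, but n is tiny here.
    let exclude_rows_vec: Vec<u8> = vec![0, 4];
    assert!(exclude_rows_vec.contains(&4));
    assert!(!exclude_rows_vec.contains(&2));
}
```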
> Finally, I see your reasoning for implementing the `PartialEq` manually. However, I do not like having the `index` inside the `LayerKey`. I understand that it is only meant to be a "unique" value. But using the `LayerKeyIndex` is very misleading as it encodes information "from the outside". How much time does using the "automatic" implementation of `PartialEq` actually cost?
>
> > How much time does using the "automatic" implementation of `PartialEq` actually cost?
>
> Not much. I assume this is because most of the time the first fields are not equal to each other, and thus the "comparison-train" is stopped preemptively.
>
> > But using the `LayerKeyIndex` is very misleading as it encodes information "from the outside".
>
> I don't quite understand this part. Is it problematic because it references a "higher order concept", such as the layout? `LayerKeyIndex` = "This is my position in the Layout"? In that case, my proposal is the following: Rename `pub index: LayerKeyIndex` into `pub id: u16`. This way, we do not imply any relation to the layout (and thus are not misleading), but are still able to do efficient comparisons. What do you think?
The main danger that I see is that with an `id` or `index` we allow misuse of the interface. If a user of the `LayerKey` sets a wrong `id` (the same `id` for different `LayerKey`s), he gets wrong results (comparison being true even though the `LayerKey`s are not equal). Having such a field always requires that it is unique, but we leave it to the user to actually enforce this. To me this screams "bad/dangerous interface".
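A minimal sketch of the hazard (the field names here are assumptions; the real `LayerKey` has different fields): if equality is defined only over a caller-supplied `id`, two structurally different keys compare equal as soon as the caller reuses an `id`.

```rust
// Hypothetical, simplified stand-in for LayerKey.
struct LayerKey {
    id: u16,      // caller-supplied "unique" value
    symbol: char, // actual payload
}

// Manual PartialEq that only compares the id, as discussed in the thread.
impl PartialEq for LayerKey {
    fn eq(&self, other: &Self) -> bool {
        self.id == other.id
    }
}

fn main() {
    // Misuse: the same id on two different keys…
    let a = LayerKey { id: 7, symbol: 'a' };
    let b = LayerKey { id: 7, symbol: 'b' };
    // …makes them compare equal even though their payloads differ.
    assert!(a == b);
    assert_ne!(a.symbol, b.symbol);
}
```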
> Actually … there's no need at all to fill either a `Vec` or a `HashMap` in `fn map_trigrams`. It's redundant. We use the value only once (to iterate over it), thus we can simply return an Iterator. This skips
>
> 1. pushing > 200_000 trigrams into a vec
> 2. turning that vec back into an iterable.
>
> I've got a fun & concise idea on how to implement this, if you give me the go :)
Skipping 1. would be nice. Point 2 is not expensive.
Again, if the idea that you have keeps the code simple, please give it a go. If it gets too complex and the gains are small (<15% or so), I would be uncomfortable with it.
1. I gave up on the iterator-thing (they're messy). However, I was able to prevent the >200000 pushes by using `filter_map().collect()`.
2. I moved `BigramIndicesVec` to the respective files, as it's not used anywhere else. In theory we could do the same thing with `BigramIndices`, I think. Anyway, what is your opinion on the setup with the moved `BigramIndicesVec`-type definition? If you still dislike it, tell me and I'll replace it with a `HashMap`.
3. I tried to remove the index, but then found out that my pretty little improvement of `pub fn add_secondary_bigrams_from_trigrams()` gets destroyed by removing the index. This was the original reason for adding the index. Do you still want to remove it?
> 1. I gave up on the iterator-thing (they're messy). However, I was able to prevent the >200000 pushes by using `filter_map().collect()`.

This looks cleaner. I like it. (The pushes are still done within `collect`, though.)
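As a hedged sketch of the pattern in question (names and types simplified; the real code maps trigrams to LayerKeys): `filter_map().collect()` folds the filtering and mapping into one pass and removes the explicit push-loop.

```rust
fn main() {
    // Stand-in data: (ngram, weight) pairs; a weight of 0.0 marks entries to drop.
    let raw: Vec<(&str, f64)> = vec![("abc", 1.0), ("abd", 0.0), ("xyz", 2.5)];

    // Push-loop version:
    let mut mapped_loop = Vec::new();
    for (ngram, weight) in &raw {
        if *weight > 0.0 {
            mapped_loop.push((ngram.to_uppercase(), *weight));
        }
    }

    // Equivalent filter_map().collect() version: one pass, no manual pushes.
    let mapped: Vec<(String, f64)> = raw
        .iter()
        .filter_map(|(ngram, weight)| {
            (*weight > 0.0).then(|| (ngram.to_uppercase(), *weight))
        })
        .collect();

    assert_eq!(mapped, mapped_loop);
    assert_eq!(mapped.len(), 2);
}
```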
> 2. I moved `BigramIndicesVec` to the respective files, as it's not used anywhere else. In theory we could do the same thing with `BigramIndices`, I think.
Good idea. Having them there was a historical artefact from a time when there was a "not-on-demand" NGramMapper variant.
> Anyway, what is your opinion on the setup with the moved `BigramIndicesVec`-type definition? If you still dislike it, tell me and I'll replace it with a `HashMap`.
Having these types only in this module makes it alright, I think. No replacement required.
> 3. I tried to remove the index but then found out that my [pretty little improvement of `pub fn add_secondary_bigrams_from_trigrams()`](https://github.com/dariogoetz/keyboard_layout_optimizer/pull/31/files#diff-9310aba2f5f0f1f27a7ab6df86be60e2678da8b8a12c3b94fffbcce3629cc512L96-L140) gets destroyed by removing the index. This was the original reason for adding the index. Do you still want to remove it?
> I would still like to see it removed.

Ok

> Good idea. Having them there was a historical artefact from a time when there was a "not-on-demand" NGramMapper variant.

Do you also want me to move the "normal" `BigramIndices`?
> > I would still like to see it removed.
>
> Ok
>
> > Good idea. Having them there was a historical artefact from a time when there was a "not-on-demand" NGramMapper variant.
>
> Do you also want me to move the "normal" `BigramIndices`?

Yes, please.
I moved

```rust
// if the same modifier appears consecutively, it is usually "hold" instead of repeatedly pressed
// --> remove
let trigrams = trigrams
    .into_iter()
    .filter(|((k1, k2, k3), _)| !(k2.is_modifier && (k1 == k2 || k2 == k3)))
    .collect();
```

into `OnDemandTrigramMapper::layerkeys(&trigram_key_indices, layout);`. I'd do the same thing for bigrams as well.
This would prevent another allocation of a Vec and make the code in `map_ngrams()` more uniform. See the two most recent commits. It seems more efficient in every way: less work is done, fewer vectors are filled.
The only question is whether you feel that this might be a misuse of `OnDemandTrigramMapper::layerkeys(&trigram_key_indices, layout);`. To me it seems fine. Many other methods (such as `bigram_mapper::add_secondary_bigrams_from_trigrams()` or `split_trigram_modifiers()`) also perform internal filtering.
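A hedged sketch of the relocation (signatures heavily simplified; the real `layerkeys` maps trigram indices to `LayerKey`s via the layout): the consecutive-modifier filter runs inside the mapping function instead of as a separate pass.

```rust
#[derive(Clone, Copy, PartialEq)]
struct Key {
    is_modifier: bool,
    ch: char,
}

// Simplified stand-in for OnDemandTrigramMapper::layerkeys: maps trigrams
// and drops those where the same modifier appears consecutively ("hold").
fn layerkeys(trigrams: &[((Key, Key, Key), f64)]) -> Vec<((Key, Key, Key), f64)> {
    trigrams
        .iter()
        .filter(|((k1, k2, k3), _)| !(k2.is_modifier && (k1 == k2 || k2 == k3)))
        .cloned()
        .collect()
}

fn main() {
    let shift = Key { is_modifier: true, ch: '⇧' };
    let a = Key { is_modifier: false, ch: 'a' };
    let b = Key { is_modifier: false, ch: 'b' };
    let trigrams = vec![((shift, shift, a), 1.0), ((a, shift, b), 2.0)];
    let mapped = layerkeys(&trigrams);
    // (shift, shift, a) is dropped: k2 is a modifier and k1 == k2.
    assert_eq!(mapped.len(), 1);
}
```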
It is certainly not beautiful. Optimally, all kinds of filtering or additions to the set of ngrams would be within functions that obviously do so. `add_secondary...` and `split...modifiers` obviously change the ngram set, so there it is okay.
But probably it's fine here.
Alright. Later on today I'll at least add a doc-comment to those functions, explaining that there's also some slight filtering going on. After that, I'll re-test and merge the PR unless anything new comes up.
The purpose of this PR is to improve performance while still keeping the script elegant and clean. I did my best to rename the affected functions to indicate their added functionality. If there's anything else you dislike, please tell me.
I've found and added a way to use `filter_map()` and `.collect()` without losing performance. The problem was that `filter_map()` (and the regular `filter()`) set the minimum of the iterator's `size_hint()` to 0, meaning that – as opposed to other circumstances – `.collect()` will not pre-allocate any memory.
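The `size_hint()` behavior described here can be observed directly (a minimal demonstration with a plain slice iterator):

```rust
fn main() {
    let v = vec![1, 2, 3, 4, 5];

    // A slice iterator knows its exact length.
    assert_eq!(v.iter().size_hint(), (5, Some(5)));

    // filter() cannot know how many items will pass, so the lower
    // bound drops to 0 — collect() cannot pre-allocate capacity.
    let filtered = v.iter().filter(|&&x| x % 2 == 0);
    assert_eq!(filtered.size_hint(), (0, Some(5)));
}
```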
A more elegant solution might be to write our own `collect_for_capacity()`-method, which roughly will look like this:

```rust
trait CapacityCollect: Iterator + Sized {
    fn collect_for_capacity(self, capacity: usize) -> Vec<Self::Item> {
        let mut result = Vec::with_capacity(capacity);
        result.extend(self);
        result
    }
}

impl<I: Iterator> CapacityCollect for I {}
```
The purpose of this extraction would be that we could use it in other contexts; not necessarily to improve speed, but to simplify code like this, where there's actually not much going on, but it could use some decluttering:

```rust
let mut metric_costs: Vec<MetricResult> = Vec::new();
for (weight, normalization, metric) in self.layout_metrics.iter() {
    let (cost, message) = metric.total_cost(layout);
    metric_costs.push(MetricResult {
        name: metric.name().to_string(),
        cost,
        weight: *weight,
        normalization: normalization.clone(),
        message,
    });
}
```
The only problem is that

1. I don't know how you feel about this idea (I myself think that it might be overkill)
2. I don't know where this implementation should be placed to be accessible in different modules

Finally, two questions:

1. What are your thoughts on implementing a simple validation of the (keyboard-)config? I could add it after this PR.
2. If that config was added, would you be down to keep the less redundant `PartialEq`-implementation, as there's no more possibility of the user to mess it up? (We would check for duplicate matrix-positions in the validation.)

1.
> The purpose of this PR is to improve performance while still keeping the script elegant and clean. I did my best to rename the affected functions to indicate their added functionality. If there's anything else you dislike, please tell me.

The naming is better now. I'm fine with it.
2.

> I've found and added a way to use `filter_map()` and `.collect()` without losing performance. The problem was that `filter_map()` (and the regular `filter()`) set the minimum of the iterator's `size_hint()` to 0, meaning that – as opposed to other circumstances – `.collect()` will not pre-allocate any memory.
>
> A more elegant solution might be to write our own `collect_for_capacity()`-method, which roughly will look like this:
>
> ```rust
> trait CapacityCollect: Iterator + Sized {
>     fn collect_for_capacity(self, capacity: usize) -> Vec<Self::Item> {
>         let mut result = Vec::with_capacity(capacity);
>         result.extend(self);
>         result
>     }
> }
>
> impl<I: Iterator> CapacityCollect for I {}
> ```
>
> The purpose of this extraction would be that we could use it in other contexts; not necessarily to improve speed, but to simplify code like this, where there's actually not much going on, but it could use some decluttering:
>
> ```rust
> let mut metric_costs: Vec<MetricResult> = Vec::new();
> for (weight, normalization, metric) in self.layout_metrics.iter() {
>     let (cost, message) = metric.total_cost(layout);
>     metric_costs.push(MetricResult {
>         name: metric.name().to_string(),
>         cost,
>         weight: *weight,
>         normalization: normalization.clone(),
>         message,
>     });
> }
> ```
>
> The only problem is that
>
> 1. I don't know how you feel about this idea (I myself think that it might be overkill)
> 2. I don't know where this implementation should be placed to be accessible in different modules
I like the solution that you found (`::with_capacity` and then `.extend`). But I feel that having it in a separate trait is unnecessary. Keep it as it is, I would say.
3.
> 1. What are your thoughts on implementing a simple validation of the (keyboard-)config? I could add it after this PR.
This is a good idea. Checking user-provided configs is good in any case.
> 2. If that config was added, would you be down to keep the less redundant `PartialEq`-implementation, as there's no more possibility of the user to mess it up? (We would check for duplicate matrix-positions in the validation.)
I still dislike the manual `PartialEq`-implementation. The "user" in the sense of "user of the application" may not, but the "user of the data structure" might still run into the confusions mentioned above.
PS: Sorry. I was logged into the wrong github-account.
Closes #29
Turns out the ideas proposed in #29 don't make that big of a difference. A short optimization run went from ~24.5s to ~23.0s, which is roughly a 6% speed increase. This is the command I used for testing: (with 201 iterations)
In addition to the ideas proposed in #29, I also manually implemented `PartialEq` for `LayerKey`, as deriving it means that all fields get compared. This used to result in redundant comparisons (assuming that the `LayerKey`'s `index` is unique to a particular `LayerKey`).

As a safety-check, I compared the result of an evaluation before and after this PR:
Before (no changes)
After PR