mw8 / white_keyboard_layout

An optimized personal keyboard layout and the tools to create your own
MIT License
100 stars 13 forks source link

Which keyboard layout is the best?

Often in our thinking we arrive at questions like this that have no real answer, and assuming that they do have an answer generally leads to wasted effort. The word "best" has no meaning without a context; "best" under what circumstances? As we learn from mathematics, the way you define a problem is as important as the answer itself. You state your axioms, and from there you can use mathematical methods to see what result those axioms imply.

To most people, keyboard layouts should be evaluated based upon efficiency. We want to use the layout that requires the least amount of effort to do what we want to do with a keyboard. Well, that sentence alone has several ambiguities in it. Not every person will be typing text with the same distribution of letters and numbers and punctuation. Programmers will type differently than transcribers, who will type differently than accountants. Also, "effort" is surprisingly hard to define when it comes to typing. There are many different metrics, such as finger travel distance, same hand utilization, same finger utilization, and even similarity to QWERTY. All of these considerations and more have to be combined somehow into one measure of efficiency, which we call an objective function.

So, to sum up, the real question at hand is really two questions.

  1. Which objective function represents my preferences?
  2. Which layout is the best as measured by that function?

We will explore how these questions are answered by some of the layouts developed so far.

The problems with existing layouts

QWERTY (1873)

QWERTY layout diagram

The objective function informally used in designing QWERTY is how well it can stop typewriters from jamming, while still being relatively efficient. Whether it is actually an optimal layout for this is debatable. As far as anyone can tell, the designer of QWERTY, Christopher Latham Sholes, a newspaper editor and printer from Milwaukee, took the letters of the alphabet, placed them in two rows, moved all the vowels into a new row above that, and then messed with this layout until it was good enough to stop jamming.

Even the physical design of keyboards today is based on the mechanical constraints of the original typewriter. The keys are staggered in that strange way to allow space underneath the keys for the metal levers suspending them. There are grid keyboards today which ignore this obsolete constraint, but they are not popular because they are nonstandard.

Similarly, alternative keyboard layouts besides QWERTY are not popular because they are nonstandard, a consequence of the network effect.

Dvorak (1936)

Dvorak layout diagram

Dr. August Dvorak was a professor of education at the University of Washington. He designed a keyboard layout with finger travel distance, letter frequency, bigram frequency (a bigram is a combination of two letters, like "th" or "st"), and same hand utilization in mind. So what's wrong with that?

Colemak (2006)

Colemak layout diagram

Created by Shai Coleman as an answer to Dvorak, this layout seeks to be easier to learn for people that already know QWERTY by keeping many of the same letter placements. This also serves to preserve many of the most common keyboard shortcuts like those used for undo (Z), cut (X), copy (C), paste (V), select all (A), quit (Q), and close window (W). It also replaces the caps lock key with a backspace key, because CAPS LOCK IS VERY RARELY USED and backspace is used constantly. Other than that, Colemak has many of the same design goals as Dvorak, such as minimizing finger travel distance and having high hand alternation. It also purports to have lower same finger utilization. Here are my criticisms:

Workman (2010)

Workman layout diagram

This layout was developed by a programmer, OJ Bucao, as part of his plan to deal with repetitive strain injury (RSI), along with switching to a grid-type keyboard, with the keys in straight columns. One insight that came with the design of this layout is that finger travel distance is not an important metric. Each finger has a different range of motion and overall strength. The index finger is short and strong; the middle finger, long and strong; the ring finger, long and weak; and the pinky finger, short and weak. Short fingers don't like to stretch to the top row, whereas long fingers don't like to curl to reach the bottom row. Strong fingers should do more work than weak ones. And in general, fingers move vertically (via extending or curling) faster than they do horizontally.

Workman key score diagram

He codified this design concept by creating a customized key score for each key press (shown above). Combining this key scoring procedure with a choice of input text will create an objective function, but the Workman layout has other goals as well. It strives to maintain common QWERTY shortcuts, to work well on common bigrams, to use both hands about equally, and to have high same hand utilization. My relatively minor problems with this layout are

Why a new layout?

I've explained my criticisms of some of the design decisions and metrics emphasized by existing layouts, but I also generally object to the idea that we should all use the same layout. If you use a computer almost every day, the amount of time you spend typing in your life will be much greater than the amount of time you spend learning to type using a new layout.

My preferences derive from the fact that I split my time between scientific writing and programming, and so the choice of input text used in evaluating the objective function should reflect that. I'm not going to be typing out all of "Alice in Wonderland" anytime soon. I'm mainly going to be writing code, emails, IMs, and research papers.

Another important point is that the amount of time I spend on a computer where I can customize the keyboard layout is much, much larger than the time I spend using a computer without a customizable layout. I am willing to fall back on QWERTY for those rare cases.

Lastly, I just find optimization problems like this fascinating, and I have a few new ideas to try out.

  1. Move the right hand home position one key to the right, as shown in the finger assignment diagram below.
  2. Only assume that lower-case letters should be paired with their upper-case letters; any other pairing of characters (the numbers and punctuation) to form a key should be permissible.
  3. Use all of the text I've typed over the last 5 years as the input text for the objective function. This is a good method of predicting what distribution of characters I will type in the future.
  4. Generalize the idea of custom single key scores (from Workman) and create custom scores for double and triple consecutive key presses as well.

Finger assignments diagram

The multibody expansion

That last list item is closely related to an idea from physics called the multibody expansion, which finds use in molecular simulations. The energy of large groups of particles is often very complicated and hard to calculate, due to quantum mechanics, but in many simulations it is simplified and written as a sum of different types of energies: one for single particles, one for pair interactions, one for 3-body interactions, etc. The single particle energy can represent a background potential field. The pair interaction often represents attraction or repulsion between pairs of particles based on the distance between them. The 3-body interaction can be used to characterize bond angles in molecules.

Multibody expansion formula

Similarly, when evaluating the cost of typing a word in a given layout, I want to break down that cost into a sum of single key scores, double key scores, and triples key scores. I am stopping at 3 mostly because quadruple key presses are well approximated by two consecutive triples, but also because there are 4,280,760 scores to consider (though admittedly most of them are zero).

The new objective function

The objective function for this new layout is based on a linear combination of the following scores:

  1. A score for each single key press (similar to that used in designing the Workman layout).
  2. A score for each two consecutive key presses, signifying the difficulty of the transition.
  3. A score for each three consecutive key presses, signifying the difficulty of the transition.
  4. A small penalty score for pressing consecutive keys with different hands.
  5. A small penalty score for having to hold shift while pressing a key.

This scoring method is then evaluated on a collection of text that is representative of all text I've typed in the last five years, which produces a number representing the objective function. Specifically, I take all of my code, IMs, emails, and articles over that time period, remove any non-ASCII characters, and create a long word frequency list to be used in scoring layouts. Lastly I throw out any words that were not used more than 20 times over 5 years, for the sake of efficiency.

Single key scores

The single key scores shown below are a modified version of the scoring system developed for the Workman layout. It addresses my concerns with symmetry and logical consistency and adds scores for all keys in the layout.

single key score diagram

Double key scores

The scores for consecutive pairs of keys are shown below. Each layout diagram shows the score associated with pressing the blue key and then subsequently pressing any other key in the diagram. Keys without a score shown have a score of 0. Finger assignments are indicated by alternating key borders of light grey and black.

Despite the apparent randomness in the scores, they follow several general principles.

double key score diagram

Triple key scores

The triple scores follow similar general principles.

We assign a number to each key, according to the following diagram.

Key number diagram

The triple key scores are then written as a series of 3 key numbers followed by their score.

keys score keys score keys score keys score
1, 3, 4 -0.5 3, 5, 17 -1.0 12, 10, 9 -1.0 47, 46, 44 -0.5
2, 3, 4 -1.0 15, 16, 17 -2.0 12, 11, 21 -1.5 13, 12, 11 -1.0
2, 3, 5 -0.5 15, 16, 18 -1.0 12, 10, 21 -1.0 13, 12, 10 -0.5
14, 3, 4 -1.5 15, 16, 30 -1.5 23, 22, 21 -2.0 24, 12, 11 -1.5
14, 3, 5 -1.0 28, 29, 30 -2.5 23, 22, 20 -1.0 24, 12, 10 -1.0
14, 15, 16 -2.0 28, 29, 31 -1.0 23, 22, 34 -1.5 24, 23, 22 -2.0
27, 15, 16 -1.5 28, 29, 40 -1.5 36, 35, 34 -2.5 25, 12, 11 -1.5
27, 28, 29 -2.0 28, 29, 41 -1.5 36, 35, 33 -1.0 25, 12, 10 -0.5
27, 28, 30 -0.5 28, 16, 30 -1.0 36, 35, 45 -1.5 25, 23, 22 -1.0
27, 29, 30 -0.5 28, 16, 17 -0.5 36, 35, 44 -1.5 37, 36, 35 -2.0
3, 4, 6 -1.0 38, 39, 40 -2.0 36, 22, 34 -1.0 37, 23, 22 -1.5
3, 5, 6 -1.0 38, 39, 41 -0.5 36, 22, 21 -0.5 37, 35, 34 -0.5
3, 4, 17 -1.5 12, 11, 9 -1.0 47, 46, 45 -2.0 37, 36, 34 -0.5

Also, note that every triple is written as an inward roll, moving towards the center of the keyboard with each successive key. Since I prefer inward rolls, I use the same scores as above for outward rolls but I add in a small penalty, which I refer to as the "reversed triple penalty" below.

Parameters

As previous mentioned, the objective function I will use in designing this new layout is made of a linear combination of the single, double, and triple key scores from the previous sections, as well as a shift-holding penalty (added each time you have to hold shift while pressing another key) and a hand-alternation penalty (added each time you alternate hands while typing a word). These parameters must be chosen, and ideally they'd be based on actual timing data from my typing.

So I collected timings between all key presses as an experiment to see how well I could make the objective function score match the timings by tuning the aforementioned parameters, but I ran into a problem. The timings were collected while using the QWERTY layout, so certain keys that form common sequences, such as "ing", always come out faster because I have trained my brain to anticipate them and speed them up in the QWERTY setting. The key scores are supposed to be layout agnostic; they assign scores to combinations of keys that don't depend on which layout is being used, so in future iterations of this work I will have to figure out a strategy to decorrelate the data so that I can have a better basis for choosing the parameters.

Until then, I have performed an informal parameter study and chosen the following parameters based on which optimal layouts are produced with the resulting objective function.

score type coefficient
single score 1.00
double score 1.00
triple score 1.00
shift-holding penalty 1.50
reversed triple penalty 0.25
hand-alternation penalty 0.20

Finding the optimal layout

In order to find the optimal layout for a given objective function, I wrote a program in the Rust programming language that uses a combination of simulated annealing and tabu search. The code is included in this repository so that you too can design your own custom keyboard layout using your own collected works as the text corpus. Just remove or replace any non-ASCII characters in the texts and place them as .txt files in the "texts" directory and type "cargo run --release" to search for an optimal layout.

The layout I originally found using the above parameters is shown below.

first prototype layout diagram

There are a few things to point out. I have forced the numbers to be in the positions they are, purely based on taste. If I didn't constrain them, then they would have been strewn about the layout in a way that helps optimize the objective function more, but I prefer them to be in order on the top row for the sake of memory, organization, and learning. The split between 5 and 6 is necessary because I have moved the right hand over one whole key to the right, relative to where it is in QWERTY, so hitting the 6 would have been much harder had they been kept the same.

For the same reasons that I want the numbers to be order, I want matching punctuation (parentheses, curly brackets, square brackets, and angle brackets) to be next to each other. So based on the previous layout, I looked at their prevalence in the text corpus, compared to other punctuation, and placed them in a way that is relatively optimal. I then forced them to stay where they are and re-ran the optimization program to get the final layout. I made some other minor adjustments, such as pairing '-' and '+'', since that only involved swapping '-' and '_', which have almost the same frequency in the text corpus.

The White keyboard layout

I'm not good at naming things. Plus, I figure that my last name, White, is so common both as a last name and as an English word that it barely counts as being eponymous. Here's the final layout.

White layout diagram

Note that this layout still acts like QWERTY for the purposes of keyboard shortcuts, as discussed above. Also, I have re-mapped my caps lock key to backspace.

Installation

To use the layout on Mac OS X, download the file named "White_layout_mac.keylayout" from this repository and place it in the /Library/Keyboard Layouts/ folder. Then add and select it in System Preferences > Keyboard > Input Sources.

To use the layout on Windows 7 (other versions not confirmed), download the file named "White_layout_win.zip", extract it, and run setup.exe. It will automatically install the correct DLL in the correct place. Or if you don't trust a precompiled version, you can download the Microsoft Keyboard Layout Creator, open the file "White_layout_win.klc" (which is just a plain text description of the layout that you can verify), and compile it for your system. Then add and select it in Control Panel > Region and Languages > Keyboards and Languages.

To use the layout on any other operating system, make it yourself, send it to me, and I'll update this for your system.

Happy typing.