pingtype / pingtype.github.io

15 stars 2 forks source link

Character Spacing lookup: Slowed down due to inefficient dictionaries #1

Open marcusmueller opened 6 years ago

marcusmueller commented 6 years ago

As far as I can tell, the lookup mechanism for words is to blame for the slow word spacing lookup:

pingtype uses length-separate dictionaries (e.g. fiveWords). Each of these dictionaries is a string, built by

"\n" + word1 + "\n" + word2 + "\n"

Lookup is done using the contains function:

contains(fiveWords, "\n" + thisWord + "\n");

This is inherently very slow. To explain:

contains has to find "\n" + thisWord + "\n" in fiveWords, which is just a linear string. So, it has to go through all characters of fiveWords, comparing them to the beginning of thisWord; if these match, compare on, else continue.

Now, since your needle always begins with a "\n", and that also happens to be most common character in the haystack, there's twice as many false alarms for a first character match as there are words in the dictionary. That's bad. Maybe it's not as bad, though, because probably modern JS runtimes are clever and compare e.g. 8B at once.

Replace the humongous string with an appropriate data structure.

An appropriate data structure would be something that

In lieu of a proper standard library, I'd recommend using JavaScript's Set built-in type. I'd assume it's backed by an elegant hash map that facilitates and speeds up such lookups enormously.

Note that I'm not a JavaScript developer at all; it's just that I read your source code after I saw your Hackaday comment saying that spacing was slow, and yes, going through all of the dictionary for all of the characters typed linearly will have that effect.

pingtype commented 6 years ago

Thanks for actually giving technical feedback, and being the only person except me to look at the source code! What data structure do you suggest? It's necessary to regenerate the data structure every time the page is refreshed, because dictionary edits happen all the time. Everything must run in browser in session storage. I can't spend any money on a server. Unfortunately, I can't use a Set, because it's only supported in Safari 8. I use Safari 7 on my Mac, and iOS 6 on my phone. If there's a tradeoff between speed or functionality, I choose code that works.