Profiling the current state of predictive text from 17.0 on a SM-T350:
Far and away, the correction search uses most of the predictive-text runtime when significant typos occur. `getBestMatches` is the primary entry point for correction searching.

A first-level breakdown for `getBestMatches`:

*(profiler screenshot)*
From there, the biggest costs are at the following points:
- `buildSubstitutionEdges` (`.children`)
- `buildDeletionEdges`
- `buildInsertionEdges` (`.children`)
- `buildDeletionEdges`
- `addInputChar` calls `enqueueAll` - priority-queue operations to pick the next check when searching
- `get mapKey` - which is probably what the original description was talking about

The thing is, the big, expensive operations there... boil down to iteration + O(1) conditionals. We're just iterating over that much data trying to find corrections that work. That iteration is done via synchronous generators (partly due to the trie's structure), and as best as I can tell, the back-and-forth between the generator method and the iteration driven by its results seems to be the biggest runtime contributor.
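For illustration, here's a minimal sketch of that iteration pattern, assuming a simplified traversal interface - the names below are stand-ins for the general shape, not the engine's actual types:

```typescript
// Illustrative only: approximates the pattern, not the engine's exact code.
interface TraversalLike {
  // Synchronous generator over child nodes, as with LexiconTraversal.children().
  children(): Generator<{ char: string; traversal: TraversalLike }>;
}

// The hot loop: every child resumes the generator (suspend/resume overhead)
// before a cheap O(1) check decides whether the edge is worth keeping.
function countViableEdges(
  node: TraversalLike,
  isViable: (char: string) => boolean
): number {
  let count = 0;
  for (const child of node.children()) { // resumes the generator per child
    if (isViable(child.char)) {          // cheap O(1) conditional
      count++;
    }
  }
  return count;
}
```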
The parts talked about in the base description appear small in comparison to this, so I believe optimizing on that basis is extremely low-priority. The two parts I can identify at this time:
- `get mapKey` (mostly within `buildSubstitutionEdges`)
- `get currentCost`
These two sum to about 100ms of the 2070ms total. They're not completely negligible, and are likely easy to update... but the returns are pretty marginal.
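As one hypothetical example of such an update, a computed getter like `mapKey` could cache its result so repeated reads during the search don't recompute it; this is a sketch under assumed names and structure, not the actual implementation:

```typescript
// Hypothetical sketch of caching a computed getter; the real `get mapKey`
// lives on the correction-search structures and its shape may differ.
class SearchNodeSketch {
  private cachedMapKey?: string;

  constructor(private readonly keyIds: string[]) {}

  get mapKey(): string {
    // Compute once, then reuse on every later lookup/comparison.
    if (this.cachedMapKey === undefined) {
      this.cachedMapKey = this.keyIds.join('');
    }
    return this.cachedMapKey;
  }
}
```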
If anything, finding a way to make iteration through the dictionary more efficient would yield better returns. It's rather surprising just how high the accumulated cost of that iteration is compared to the other components. We generally do iterate over all the children within the correction-search process, so building a full array and returning that may be notably better.
https://stackoverflow.com/a/70672133 suggests that simply stashing each returned value into an array, in place of each `yield`, could possibly double the speed of such sections.
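A hedged sketch of what that swap could look like, using a simplified stand-in for the trie's node type rather than the actual `LexiconTraversal` internals:

```typescript
interface TrieNodeSketch {
  entries: Map<string, TrieNodeSketch>;
}

// Generator form: each `yield` suspends and resumes the function per item,
// which adds overhead on hot paths.
function* childrenGen(node: TrieNodeSketch): Generator<[string, TrieNodeSketch]> {
  for (const pair of node.entries) {
    yield pair;
  }
}

// Array form, per the linked answer: push into an array and return it;
// consumers then iterate a plain array with no generator machinery.
function childrenArray(node: TrieNodeSketch): [string, TrieNodeSketch][] {
  const result: [string, TrieNodeSketch][] = [];
  for (const pair of node.entries) {
    result.push(pair);
  }
  return result;
}
```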
I took the time today to experiment with this and see if replacing the generator component of `LexiconTraversal` - its `.children()` member - might help. I didn't get significant results from the attempt, though; it doesn't appear to make much difference, likely because the generator overhead is small compared to certain aspects of Trie navigation.
If there were a noticeable effect from changing the `LexiconTraversal` setup, we'd expect to see strong differences in `buildInsertionEdges` and `buildSubstitutionEdges`. The former was notably smaller, but the latter was notably larger. It's possible that this was due to random variation, since each profiling run was only about 15-18 seconds. I noticed a much larger value in a higher-level "total time"... but that was largely due to a "V8.StackGuard" entry, which isn't something we're responsible for.
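For a quick sanity check of the generator-versus-array overhead outside a full device profile, a rough micro-benchmark along these lines (illustrative only, not part of the repo) can give a ballpark:

```typescript
// Rough, illustrative micro-benchmark: consume a generator vs. a prebuilt array.
// Real numbers depend heavily on the engine and the data's shape.
function* genRange(n: number): Generator<number> {
  for (let i = 0; i < n; i++) {
    yield i;
  }
}

function arrRange(n: number): number[] {
  const out: number[] = [];
  for (let i = 0; i < n; i++) {
    out.push(i);
  }
  return out;
}

function time(label: string, fn: () => void): void {
  const start = performance.now();
  fn();
  console.log(`${label}: ${(performance.now() - start).toFixed(1)}ms`);
}

const N = 5_000_000;
let sum = 0;
time('generator', () => { for (const v of genRange(N)) { sum += v; } });
time('array', () => { for (const v of arrRange(N)) { sum += v; } });
console.log(sum); // keep the loops from being optimized away
```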
I used the Android app on a SM-T350, aiming to type the following string both times: `tesring out predictions without the geneeator function`. (Note: the typos are deliberate, since correcting them is part of what predictive text is meant to do.)
Originally posted by @jahorton in https://github.com/keymanapp/keyman/issues/10127#issuecomment-1880586159
Reference this and other comments in #10127 for more details.