I had no idea Python dictionaries could be keyed by compiled regex patterns. Profile this structure against the current one; since either would benefit from frequency analysis, it seems like a fair fight. A trie probably doesn't work because it is deterministic: each lookup step follows exactly one edge, while a regex key can match many different inputs.
https://medium.com/@_bryceli/using-dictionaries-as-regex-in-python-de9033bb3e0f
This should help a lot if iterating over a dict is faster; at the very least, it should allow CodeQL scanning.
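For reference, a minimal sketch of the dict-with-regex-keys idea, assuming the approach is "compiled patterns as keys, iterate and test each one". The patterns, labels, and the `lookup_regex_dict` helper are made up for illustration and are not from the current code. Note that the dict does not do the regex matching itself; compiled patterns are just hashable, so they can serve as keys.

```python
import re

# Hypothetical patterns and labels, for illustration only.
PATTERNS = {
    re.compile(r"\d{4}-\d{2}-\d{2}"): "date",
    re.compile(r"[a-z]+@[a-z]+\.com"): "email",
}


def lookup_regex_dict(text, table=PATTERNS):
    """Return the value of the first pattern that fully matches text, else None."""
    # Dicts preserve insertion order, so putting the most frequent
    # patterns first (frequency analysis) shortens the average scan.
    for pattern, value in table.items():
        if pattern.fullmatch(text):
            return value
    return None
```

Lookup is a linear scan over the keys, so ordering them by how often they hit is where frequency analysis would pay off.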
A weighted trie-ish tree would still be fun to try: traverse by word, weight regex matches lower than literal matches, and, on leaf nodes ending in a regex, fall back to a literal string match in the dict.
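A rough sketch of what that weighted trie-ish tree could look like. Everything here is an assumption: the `Node` layout, the weight values, and especially the fallback, which is read as "if the traversal finds nothing, try an exact string lookup in a plain dict". Unlike a strict trie, the search explores every edge that matches a word and keeps the highest-scoring path.

```python
import re
from dataclasses import dataclass, field

# Assumed weights: a literal edge outranks a regex edge.
LITERAL_WEIGHT = 2
REGEX_WEIGHT = 1


@dataclass
class Node:
    literal: dict = field(default_factory=dict)  # word -> Node
    regex: list = field(default_factory=list)    # (compiled pattern, Node) pairs
    value: object = None                         # set on terminal nodes


def insert(root, tokens, value):
    """Add a path; each token is a literal word (str) or a compiled pattern."""
    node = root
    for tok in tokens:
        if isinstance(tok, re.Pattern):
            # Reuse an existing edge with the same pattern, else add one.
            for pat, child in node.regex:
                if pat.pattern == tok.pattern and pat.flags == tok.flags:
                    node = child
                    break
            else:
                child = Node()
                node.regex.append((tok, child))
                node = child
        else:
            node = node.literal.setdefault(tok, Node())
    node.value = value


def search(node, words, score=0):
    """Return (score, value) for the best-scoring full match, or None."""
    if not words:
        return (score, node.value) if node.value is not None else None
    head, rest = words[0], words[1:]
    candidates = []
    child = node.literal.get(head)
    if child is not None:
        found = search(child, rest, score + LITERAL_WEIGHT)
        if found is not None:
            candidates.append(found)
    for pat, child in node.regex:
        if pat.fullmatch(head):
            found = search(child, rest, score + REGEX_WEIGHT)
            if found is not None:
                candidates.append(found)
    return max(candidates, key=lambda c: c[0]) if candidates else None


def lookup(root, phrase, fallback):
    """Traverse by word; fall back to an exact match in a plain dict."""
    found = search(root, phrase.split())
    if found is not None:
        return found[1]
    return fallback.get(phrase)
```

With a path `["error", "code", "404"]` and a path `["error", "code", re.compile(r"\d+")]` both inserted, the phrase "error code 404" resolves to the literal path because it scores higher, while "error code 500" only matches the regex path.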