This works to leverage the capabilities of the hashtable as much as
possible, for example using the key context in the value stores.
However, there may still be opportunities to refactor more deeply and
exploit that functionality even better. Hopefully this is at least a
reasonable start and gets us a clean baseline.
On an Arm M1, this is a 15% improvement on my large lexing stress test,
but ends up a wash on my x86-64 server. This is a smaller benefit than
I expected, and it's because we're using a set-of-IDs and looking up
values with a key context for things like identifiers. This pattern has
a surprising tradeoff. The new hashtable uses significantly less memory,
a 10% peak RSS reduction just from the hashtable change. But indirecting
through the vector of values makes growing the hashtable dramatically
less cache-friendly: it causes growth to randomly access every key when
rehashing. On x86, everything gained by the faster hashtable is lost in
even slower growth. And even on Arm, this eats into the benefits.
But I have a plan to tweak how identifiers specifically work to avoid
most of the growth, so I suspect this is the right tradeoff on the
whole. It gives us a significant working-set-size reduction, and we can
likely avoid the regressed operation (growth with rehash) in most cases
through clever reserving and, if necessary, by adding a hash-caching
layer to the table infrastructure.