hiiamboris opened 2 years ago
At least up to 4 words (but maybe even up to 8-16), I see no reason why an object needs a hash table. Just scan all the words linearly; it will likely be faster than a hash lookup.
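The claim above can be sketched as follows. This is a hypothetical illustration (the names obj_ctx and ctx_find are not from the Red codebase): with the object's words stored as a flat array of integer symbol IDs, lookup is a linear scan that, for a handful of entries, touches one contiguous memory region and pays no hashing cost.

```c
#include <stddef.h>

/* Hypothetical sketch: an object's words as a flat array of integer
   symbol IDs; lookup is a plain linear scan. For a few entries this
   stays within a cache line or two and skips the hash computation. */
typedef struct {
    int   *symbols;  /* symbol IDs of the object's words */
    size_t count;    /* number of words */
} obj_ctx;

/* Return the slot index of `sym`, or -1 if absent. */
static ptrdiff_t ctx_find(const obj_ctx *ctx, int sym) {
    for (size_t i = 0; i < ctx->count; i++)
        if (ctx->symbols[i] == sym) return (ptrdiff_t)i;
    return -1;
}
```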
It's faster only if the number of words is small. The global context can easily contain millions of words, which caused a big issue in some real-world apps. We used linear search in the original implementation; some users reported very slow startup times in their apps, so we changed it to use a hashtable.
Agreed. The global context is a very special object, one of its kind.
One option is to have a context! datatype: context [] has a hashtable, object [] has none.
I propose two hashtable implementations: a full one, for maps and the global context, and a minimal one, for other objects. If both expose the same API, the object itself doesn't need to know which one it uses. And for small objects I propose using no hash table at all.
The minimal implementation doesn't need any modification support, and may use 16-bit buckets (enough for 65536 words in the object), with a bucket count of 2*N and linear probing.
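A minimal sketch of that proposal, under stated assumptions (the names mini_hash etc. and the multiplicative hash are mine, not from the Red sources): the table is built once at object creation and never modified, buckets are 16-bit 1-based slot indices (0 = empty), and the bucket count is 2*N rounded up to a power of two so the load factor stays at or below 50% and linear-probe chains stay short.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>

/* Hedged sketch of the proposed minimal table: built once, read-only,
   so no deletion or resizing logic is needed. */
typedef struct {
    uint16_t *buckets; /* 1-based word-slot index; 0 means empty */
    uint32_t  mask;    /* bucket count - 1 (bucket count is a power of two) */
} mini_hash;

static uint32_t next_pow2(uint32_t v) {
    uint32_t p = 1;
    while (p < v) p <<= 1;
    return p;
}

/* Build the table for n words identified by integer symbol IDs. */
static mini_hash mini_hash_build(const int *symbols, uint16_t n) {
    mini_hash h;
    uint32_t size = next_pow2(2u * (n ? n : 1)); /* >= 2*N buckets */
    h.mask    = size - 1;
    h.buckets = calloc(size, sizeof(uint16_t));
    for (uint16_t i = 0; i < n; i++) {
        uint32_t b = ((uint32_t)symbols[i] * 2654435761u) & h.mask;
        while (h.buckets[b] != 0)          /* linear probing */
            b = (b + 1) & h.mask;
        h.buckets[b] = (uint16_t)(i + 1);  /* store 1-based slot */
    }
    return h;
}

/* Return the word's slot index, or -1 if the symbol is absent.
   Termination: load factor <= 50%, so an empty bucket always exists. */
static ptrdiff_t mini_hash_find(const mini_hash *h,
                                const int *symbols, int sym) {
    uint32_t b = ((uint32_t)sym * 2654435761u) & h->mask;
    while (h->buckets[b] != 0) {
        uint16_t slot = h->buckets[b];
        if (symbols[slot - 1] == sym) return slot - 1;
        b = (b + 1) & h->mask;
    }
    return -1;
}
```

Since the table is immutable and at most half full, probe sequences are short on average, and the whole structure costs roughly 4 bytes per word (two 16-bit buckets), far below a general-purpose hashtable.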
One would expect object size (in memory) to be x + y*n, where n is the number of words, especially since an object cannot be extended, so there's no reason to allocate more than needed. My measurements however show the following (obtained using this tool; the 1-word object point is not visible, as 424 bytes per word doesn't fit the plot, so visible points start at 2):

[plot: measured bytes per word vs. number of words in the object]
My math may not take into account some implementation details, please correct me where I'm wrong:
We should consider RAM usage when we switch to 64 bits: 24 bytes vs. 300 bytes per word is a huge difference now, and it will likely double.