red / REP

Red Enhancement Process

Objects RAM usage and anomalies there #135

hiiamboris opened this issue 2 years ago (status: Open)

hiiamboris commented 2 years ago

One would expect object size (in memory) to be x + y*n, where n is the number of words. Especially since an object cannot be extended, there's no reason to allocate more than needed.

My measurements, however, show the following (obtained using this tool; the 1-word object point is not visible, as 424 bytes per word doesn't fit the plot, so visible points start at 2 words):

My math may not take into account some implementation details; please correct me where I'm wrong:

  1. At least up to 4 words (but maybe even up to 8-16), I see no reason why an object needs a hash table. Just scan all words linearly; it will likely be faster than a hash lookup (see the sketch after this list). If we take a 16-byte cell for the value slot and a 4-byte index into the symbol table, a word should only consume 20 bytes, or 32 bytes if it uses a full cell for alignment. Adding another 20 bytes for the underlying series buffer leaves 276 bytes occupied by the hash table, which is over half of a 4-word object (424 bytes) and a total waste for a 1-word object, which should occupy 52 bytes at most.
  2. After 6 words, the hash size seems to explode and starts to dominate the total size, which I can't explain, because a hash table should use powers of 2 for its bucket counts. Maybe some effect of the allocator?
  3. Ideally, object size should be roughly proportional to the number of words in it. Perhaps a lighter hash table implementation would help. 24 bytes per word should be optimal (4 for the symbol, 4 for the hash bucket slot, 16 for the full value slot).
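
To make point 1 concrete, here is a minimal C sketch of the idea (the runtime itself is written in Red/System, and the names below, such as small_ctx_t and small_ctx_find, are hypothetical, not Red's actual internals): a small context keeps only a symbol-index array plus a value-slot array and is searched linearly, so no hash table is allocated at all.

```c
#include <stdint.h>
#include <stddef.h>

#define SMALL_CTX_MAX 8            /* assumed cutoff; 4-16 per the discussion above */

typedef struct {                   /* 16-byte value slot, as in the estimate above  */
    uint32_t header;
    uint32_t data1, data2, data3;
} cell_t;

typedef struct {
    uint32_t  count;               /* number of words in the context                */
    uint32_t *symbols;             /* 4-byte symbol-table indexes, `count` of them  */
    cell_t   *values;              /* one 16-byte value slot per word               */
} small_ctx_t;

/* Linear lookup: for <= SMALL_CTX_MAX words this touches at most a few
   cache lines and avoids hashing the symbol and probing buckets. */
static cell_t *small_ctx_find(small_ctx_t *ctx, uint32_t symbol)
{
    for (uint32_t i = 0; i < ctx->count; i++)
        if (ctx->symbols[i] == symbol)
            return &ctx->values[i];
    return NULL;                   /* word not bound in this context */
}
```

With this layout a word costs 20 bytes (4 for the symbol index plus a 16-byte value slot), which is where the 52-byte figure for a 1-word object in point 1 comes from (20 bytes of buffer overhead plus one word, padded to a full cell).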

We should also consider RAM usage when we switch to 64 bits: 24 versus 300 bytes per word is a huge difference now, and it will likely double.

qtxie commented 2 years ago

At least up to 4 words (but maybe even up to 8-16), I see no reason why object needs a hash table. Just scan all words linearly, and it will likely be faster than hash lookup.

It's faster only if the number of words is small. The global context can easily contain millions of words, which caused a big issue in some real-world apps. We used linear search in the original implementation; some users reported super slow startup times in their apps, so we changed it to use a hashtable.

hiiamboris commented 2 years ago

Agreed. The global context is a very special object, one of its kind.

qtxie commented 2 years ago

One option is to have a context! datatype: context [] gets a hashtable, object [] gets no hashtable.

hiiamboris commented 2 years ago

I propose two hashtable implementations: a full one for maps and the global context, and a minimal one for other objects. If these implementations expose the same API, the object itself doesn't need to know which one it uses. And for small objects I propose using no hash table at all.
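
As an illustration of the shared-API idea, here is a small C sketch of what the common interface could look like (hypothetical names; this is not Red's actual API, and the runtime would express it in Red/System):

```c
#include <stdint.h>

/* Hypothetical common interface: the object holds only an opaque table
   pointer plus a pointer to one of these dispatch tables, so a full
   backend (maps, global context) and a minimal backend (other objects)
   are interchangeable without the object knowing which one it uses. */
typedef struct hash_api {
    int  (*find)  (void *table, uint32_t symbol);           /* slot index, or -1 if absent  */
    void (*insert)(void *table, uint32_t symbol, int slot); /* used by the full backend only */
    void (*free)  (void *table);
} hash_api_t;
```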

The minimal implementation doesn't need any modification support, and may use 16-bit buckets (enough for 65536 words in an object), with a 2*N bucket count and linear probing.
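
A sketch of such a minimal table in C, under the assumptions stated above (16-bit bucket entries storing 1-based word indexes with 0 meaning empty, a power-of-two bucket count of roughly 2*N, linear probing, no later modification); mini_hash_t and the hash callback are hypothetical names, not Red's internals:

```c
#include <stdint.h>
#include <stdlib.h>

/* Minimal, build-once table: 2 bytes per bucket, ~2*N buckets. */
typedef struct {
    uint32_t  nbuckets;            /* power of two >= 2 * word count      */
    uint16_t *buckets;             /* 0 = empty, else 1-based word index  */
} mini_hash_t;

static uint32_t next_pow2(uint32_t n) {
    uint32_t p = 1;
    while (p < n) p <<= 1;
    return p;
}

/* Build from the context's symbol array; `hash` stands in for whatever
   symbol hash the runtime already uses. Word count must fit the 16-bit
   index (0 is the empty sentinel, so indexes 1..65535 here). */
static void mini_hash_build(mini_hash_t *h, const uint32_t *symbols,
                            uint32_t count, uint32_t (*hash)(uint32_t))
{
    h->nbuckets = next_pow2(count * 2 < 4 ? 4 : count * 2);
    h->buckets  = calloc(h->nbuckets, sizeof(uint16_t));
    for (uint32_t i = 0; i < count; i++) {
        uint32_t b = hash(symbols[i]) & (h->nbuckets - 1);
        while (h->buckets[b] != 0)              /* linear probing          */
            b = (b + 1) & (h->nbuckets - 1);
        h->buckets[b] = (uint16_t)(i + 1);      /* store 1-based word index */
    }
}

static int mini_hash_find(const mini_hash_t *h, const uint32_t *symbols,
                          uint32_t symbol, uint32_t (*hash)(uint32_t))
{
    uint32_t b = hash(symbol) & (h->nbuckets - 1);
    while (h->buckets[b] != 0) {
        uint32_t i = h->buckets[b] - 1;
        if (symbols[i] == symbol) return (int)i;
        b = (b + 1) & (h->nbuckets - 1);        /* empty bucket ends probe  */
    }
    return -1;                                  /* word not in this context */
}
```

Since the bucket count stays at roughly twice the word count, the table adds about 4 bytes per word, which lines up with the 24-bytes-per-word estimate in point 3 above.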