wvwwvwwv / scalable-concurrent-containers

High performance containers and utilities for concurrent and asynchronous programming
Apache License 2.0
306 stars 15 forks source link

Implement `Clone` for HashMap, HashSet, HashIndex, and TreeIndex #57

Closed rsdy closed 2 years ago

rsdy commented 2 years ago

It would be handy to easily .clone() a HashMap or TreeIndex instance that contains a bunch of Arc's. I'm a bit hazy on the internals, but I'm assuming copy-on-write semantics could be possible for memory efficiency, at least in the TreeIndex case.

wvwwvwwv commented 2 years ago

Implementing Clone is not as simple as it seems. It's true that the data structures are full of ebr::Arc for reference counting, but the unit of memory allocation is not a single key-value entry, instead a fixed number of entries sharing the same memory chunk. That brings about a problematic situation where an entry is removed if we are to implement Clone. Currently, the vacancy of each entry slot in a memory chunk is checked using a bit-vector that's located at the head of the memory chunk. Removing an entry means marking a flag on the bit-vector, and if a cloned instance shares the same memory chunk, that also gets to see the removal; that is definitely undesirable. In order to fix this, the bit-vector field must be separated from the data area by either 1. the one that owns the ebr::Arc manages the bit-vector directly, or 2. use a separate heap memory region for the bit-vector. The latter is not quite a good solution as it doubles the average count of heap memory allocation. The former is also bad in terms of memory usage, for instance, HashMap is a two-level data structure that the first level is an array of ebr::Arc, and the second level is the aforementioned array of entries with a bit-vector field; the first level array embedding bit-vectors of the second-level array means almost 4-times larger than the current scheme, and it is really a waste of heap memory in case the HashMap is sparse.

So, in short, in order for HashMap or TreeIndex (the situation is similar, the leaf node is a single memory chunk of multiple entries embedding metadata) to implement Clone, we have to compromise the performance, or the memory usage efficiency. I'll think about what would be the best approach.

wvwwvwwv commented 2 years ago

Clone will be much faster under any circumstances than copy-on-iteration, however its semantics should be precisely defined for each struct first.

wvwwvwwv commented 2 years ago

A rudimentary version of Clone is now implemented for container types: 0.8.2. Will be optimized in 0.8.3.

wvwwvwwv commented 2 years ago

Will look into it in a few weeks (still, not optimized as of 0.10.2).

wvwwvwwv commented 2 years ago

It is possible to bypass some locking and atomic operations while cloning a container type instance, but the expected gain is small (~99% cache hit ~= near-zero atomic operation penalty). This feature will be addressed once a performance problem is newly reported.