Open ThomasWaldmann opened 7 years ago
One important distinction for the hashtable used in Borg is that the keys are already hashes with ~perfect distribution. Thus tree or trie schemes make no sense at all, and tables optimized to deal with poor distributions expend those optimizations needlessly.
Also no need for variable length keys or values.
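A hypothetical illustration of that point: because the keys are already uniformly distributed hashes (e.g. 256-bit chunk IDs), a table needs no extra hash function and can derive the bucket index straight from the first key bytes. The 4-byte prefix and modulo scheme below are illustrative, not Borg's actual code:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Sketch: keys are already ~perfectly distributed hashes, so the
 * bucket index can be read directly out of the key's first bytes
 * instead of hashing the key again. */
static uint32_t bucket_index(const unsigned char *key, uint32_t num_buckets)
{
    uint32_t h;
    memcpy(&h, key, sizeof h);   /* first 4 bytes of the (hash) key */
    return h % num_buckets;
}
```

This is also why tree/trie schemes and anti-clustering tricks buy nothing here: the input distribution is already as good as a hash function could make it.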
I'm not familiar with the internals of borg, but based on the requirements listed above, another option might be a cuckoo hash table. You can find a ready-to-use implementation in C++ here: libcuckoo. Disclaimer: I'm involved in the project, so I'm a bit partial. :)
It's fast: tens to hundreds of millions of operations per second on a modern multi-core machine with small keys and values (a lookup requires only 1-2 memory references). It's also memory efficient: the load factor is typically > 0.9, and per-entry overhead is around one byte.
That said, libcuckoo was designed as a concurrent cuckoo hash table, for applications that want to access the table from multiple threads. If borg is going to use libcuckoo from a single thread, it should still be lean and reasonably fast, but might not get all of libcuckoo's potential performance. Also, I don't know how the cython integration works, but I suspect if sparsehash is a viable option, libcuckoo would also work.
Thought I would throw it out there in case it's useful. Thanks for the nice work on borg.
@mkaminsky thanks for the hint. I didn't succeed with writing the C wrapper (that implements our _hashindex.c api), mostly because it gave me lots of strange compiler errors when trying to adapt it and I don't have much clue / practice with C++.
The idea was to have such a C wrapper, so we can just swap it in instead / as an alternative option to what we have now. Would it be easy for you to help with that (see _hashindex.c)?
LF > 0.9 without much slowdown and low memory overhead sounds good. Currently, we do not need concurrency (single-threaded borg 1.0 / 1.1), but in 1.2 we will have multithreading, so it might be useful.
Another thing we completely forgot to mention here is that the regular performance of the table isn't an issue at this time. The problem is the degenerate cases around tombstoning, which can happen even while bounded by the load factor and massively degrade performance.
In the past I considered that a plain-dumb way to solve it might be to track the probe length of lookups and, if it passes a certain threshold, just rebuild the table.
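A minimal sketch of that probe-length idea. The threshold value, slot states, and layout here are illustrative, not taken from Borg's `_hashindex.c`:

```c
#include <assert.h>

/* Sketch: count probes during lookup and flag a table rebuild once
 * any probe chain exceeds a threshold. EMPTY stops a probe chain,
 * TOMBSTONE does not -- which is exactly what lets chains degenerate
 * even while the load factor stays bounded. */
enum { EMPTY = 0, TOMBSTONE = 1, USED = 2 };

#define PROBE_LIMIT 8  /* illustrative threshold, not Borg's value */

/* Returns the index of key's slot, or -1 if absent; sets
 * *needs_rebuild when the probe chain got too long. */
static int probe_lookup(const int *state, const int *keys, int nbuckets,
                        int key, int *needs_rebuild)
{
    int idx = key % nbuckets;            /* keys stand in for hashes here */
    for (int probes = 0; probes < nbuckets; probes++) {
        int i = (idx + probes) % nbuckets;
        if (probes >= PROBE_LIMIT)
            *needs_rebuild = 1;          /* chain too long: flag a rebuild */
        if (state[i] == EMPTY)
            return -1;                   /* an empty slot ends the chain */
        if (state[i] == USED && keys[i] == key)
            return i;
        /* TOMBSTONE: keep probing past it */
    }
    return -1;
}
```

Rebuilding on that flag compacts all the tombstones away, restoring short probe chains at the cost of one O(n) pass.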
I spoke to @manugoyal, who is the main author of the current libcuckoo code. We think (not having done it yet) that adding a C wrapper should be doable. I believe he's going to take a closer look and maybe give it a shot; it could be something that's also useful for other C-based projects.
Regarding the performance: in cuckoo hashing, lookup performance is constant, regardless of the table size, the sequence of inserts/deletes, etc. All of the work happens on insert (which gets slower as the load factor increases). Lookups, though, only ever need to look in a fixed number of buckets. In most implementations (including libcuckoo) it's two buckets. Often you'll even find the key in the first bucket, but worst case, you only have to look in two places. For read-heavy workloads, this is particularly nice.
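To illustrate the two-bucket lookup described above, here is a hypothetical, stripped-down sketch. It uses single-slot buckets and a made-up second hash function; real libcuckoo uses multi-slot buckets and its own hashing, and also handles the insert-time displacement ("kicking") that is omitted here:

```c
#include <assert.h>
#include <stdint.h>

#define NBUCKETS 8

typedef struct { int used; uint32_t key; int value; } slot_t;

static uint32_t h1(uint32_t k) { return k % NBUCKETS; }
/* second hash: Knuth-style multiplicative hash, purely illustrative */
static uint32_t h2(uint32_t k) { return ((k * 2654435761u) >> 16) % NBUCKETS; }

/* The point of cuckoo hashing: a lookup touches at most the two
 * candidate buckets, regardless of table size or insert history. */
static int cuckoo_lookup(const slot_t *t, uint32_t key, int *value)
{
    uint32_t b1 = h1(key), b2 = h2(key);
    if (t[b1].used && t[b1].key == key) { *value = t[b1].value; return 1; }
    if (t[b2].used && t[b2].key == key) { *value = t[b2].value; return 1; }
    return 0;
}
```

All the variable-cost work (evicting entries to their alternate bucket to make room) happens on insert, which is why read-heavy workloads benefit most.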
Hi,
Michael and I have been iterating on a C port of libcuckoo, and we think it is looking fairly doable. The issue is that I don't think we can easily support the exact interface laid out in hashindex.pyx. There are a few functions that don't map well to the table design:
hashindex_init: It's hard to initialize the table with variable-sized key and value types, because the C++ interface expects you to use templated Key and Value types with compile-time-defined sizes.
hashindex_next_key: It's possible to implement, but not really efficient, because you essentially have to do a search to find the location of the given key in the hashtable, so this would require a search every single time. Using iterator objects is the more efficient solution.
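A toy illustration of that cost difference. The flat-array table and function names below are made up for the example; they are not libcuckoo's or Borg's API:

```c
#include <assert.h>
#include <stddef.h>

/* Sketch: next_key(key)-style traversal must re-locate `key` on
 * every call -- O(n) per step, O(n^2) for a full walk -- while an
 * iterator simply remembers its position. */
#define N 8
static const int used[N] = {0, 1, 0, 1, 1, 0, 0, 1};
static const int keys[N] = {0, 11, 0, 22, 33, 0, 0, 44};

/* next_key style: search for prev_key's slot, then scan onward */
static const int *next_key(const int *prev_key)
{
    int i = 0;
    if (prev_key != NULL) {
        while (i < N && !(used[i] && keys[i] == *prev_key))
            i++;                 /* re-locate the previous key: a search per step */
        i++;
    }
    for (; i < N; i++)
        if (used[i])
            return &keys[i];
    return NULL;
}

/* iterator style: O(1) amortized per step, no re-search */
typedef struct { int pos; } iter_t;

static const int *iter_next(iter_t *it)
{
    while (it->pos < N && !used[it->pos])
        it->pos++;
    return it->pos < N ? &keys[it->pos++] : NULL;
}
```

In an open-addressing table the re-search in `next_key` is cheap because the key's slot is found by probing, but in a cuckoo table the key gives no hint of traversal order, which is why the iterator form is the natural fit there.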
So the question is how important it is to match the exact interface of hashindex.pyx, since an exact match will not be possible without a non-trivial reworking of the C++ implementation.
Alternatively, we were thinking of porting the existing C++ interface (which isn't exactly the borg interface, but all of the same functionality will be available) to C, and allowing you to easily define multiple versions of the table with different key/value pair sizes. Matching the existing C++ interface would be ideal, so that we could also put this port into the official libcuckoo repo. If you instead need a port as close as possible to the hashindex interface, we can do that too, but some refactoring will still be necessary.
Any thoughts on the matter?
hashindex_init: It's hard to initialize the table with variable-sized key and value types, because the C++ interface expects you to use templated Key and Value types with compile-time-defined sizes.
Not a problem, there are only two or three fixed variants that can be compiled in.
hashindex_next_key: It's possible to implement, but not really efficient, because you essentially have to do a search to find the location of the given key in the hashtable, so this would require a search every single time. Using iterator objects is the more efficient solution.
One level up we use an iterator object so needing some other state / an iterator here would not be a problem.
Since there is a wrapper (hashindex.pyx) that abstracts the C interface away, the exact shape of the C interface doesn't matter that much, I think.
We just added a C port of libcuckoo to the repo. In the interest of making it a public feature, we ported the exact C++ interface to C, so it looks somewhat different than the borg interface. As mentioned before, the port requires you to instantiate different table libraries for the different key-value types you're interested in using, but of course each instantiation has the same interface, and multiple table libraries can be used simultaneously.
If I understand correctly, borg is using the hashtable in a single-threaded context, which means the locked_table interface of libcuckoo should provide all the functionality required by the borg interface (the locked_table interface exactly emulates the STL unordered_map interface). It will probably be necessary to make a wrapper around the interface, but roughly it could work as follows:
static HashIndex *hashindex_read(const char *path)
=> cuckoo_table_read
static int hashindex_write(HashIndex *index, const char *path)
=> cuckoo_table_locked_table_write
static HashIndex *hashindex_init(int capacity, int key_size, int value_size)
=> cuckoo_table_init (no key or value size since each cuckoo_table has pre-defined key and value types)
static const void *hashindex_get(HashIndex *index, const void *key)
=> cuckoo_table_locked_table_find (this uses a special iterator object to point to the kvpair)
static int hashindex_set(HashIndex *index, const void *key, const void *value)
=> cuckoo_table_locked_table_insert (the return value tells you whether a new element was inserted or not)
static int hashindex_delete(HashIndex *index, const void *key)
=> cuckoo_table_locked_table_erase
static void *hashindex_next_key(HashIndex *index, const void *key)
=> after running cuckoo_table_locked_table_find, the iterator can move forwards or backwards
The examples directory in the libcuckoo repo contains an example program called c_hash which demonstrates a lot of these features and explains how to create instances of the table.
Would love if somebody could give the port a spin, any feedback is much appreciated!
borg uses a big (custom made, in C) in-memory hashtable for the chunk index and repo index.
current implementation: open addressing == closed hashing (one big bucket table with all entries inside, no separately allocated chains) with linear probing when collisions happen. load factor max. 0.75.
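A minimal sketch of the scheme just described (one flat bucket table, linear probing, 0.75 max load factor). The fixed capacity and integer key/value types are illustrative; the real `_hashindex.c` grows the table, persists it to disk, and uses 256-bit chunk-ID keys:

```c
#include <assert.h>
#include <stdint.h>

/* Open addressing == closed hashing: all entries live in one flat
 * bucket array, no separately allocated chains. */
#define CAPACITY 16
#define MAX_LOAD_NUM 3   /* 3/4 == 0.75 max load factor */
#define MAX_LOAD_DEN 4

typedef struct { int used; uint32_t key; uint32_t value; } bucket_t;
typedef struct { bucket_t buckets[CAPACITY]; int num_entries; } table_t;

/* Keys are assumed to be pre-hashed, so key % CAPACITY is the index. */
static int table_set(table_t *t, uint32_t key, uint32_t value)
{
    uint32_t i = key % CAPACITY;
    while (t->buckets[i].used) {
        if (t->buckets[i].key == key) {       /* update in place */
            t->buckets[i].value = value;
            return 1;
        }
        i = (i + 1) % CAPACITY;               /* linear probing */
    }
    if ((t->num_entries + 1) * MAX_LOAD_DEN > CAPACITY * MAX_LOAD_NUM)
        return 0;                             /* real table would grow here */
    t->buckets[i] = (bucket_t){1, key, value};
    t->num_entries++;
    return 1;
}

static int table_get(const table_t *t, uint32_t key, uint32_t *value)
{
    uint32_t i = key % CAPACITY;
    while (t->buckets[i].used) {
        if (t->buckets[i].key == key) { *value = t->buckets[i].value; return 1; }
        i = (i + 1) % CAPACITY;
    }
    return 0;
}
```

Deletion (not shown) is where the tombstone issue discussed above comes in: a deleted slot cannot simply be marked EMPTY without breaking probe chains that run through it.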
alternatives:
ready-to-use implementations:
general problems:
we need something that is memory efficient:
Other notes: