TooBiased / growt

This is a header only library offering a variety of dynamically growing concurrent hash tables. That all work by dynamically migrating the current table once it gets too full.
Other
106 stars 13 forks source link

Using the supplied cmake build system you can compile an example file (~example/example.cpp~) and a multitude of different tests/benchmarks. These were also used to generate the plots displayed in our scientific publications => [[https://dl.acm.org/doi/10.1145/3309206][TOPC journal]] [[https://arxiv.org/abs/1601.04017][technical report (outdated)]].

** New Version (Changes)

All our hash tables support the following functionality:

called on global object

called through handles within the handles, we try to keep the interface as close to ~std::unordered_map~ as possible:

Using handles is not necessary for our non-growing tables.

called through iterators

called through a reference

** About UpdateFunctions Our update interface uses a user implemented update function. Any object given as an update function should have an ~operator()(uint64_t& cur_data, uint64_t key, uint64_t new_data)~ returning a ~uint64_t~. This function will usually be called by our update operations to change a stored elements data field. This operations does not have to be atomic: it is made atomic using CAS operations. When an atomic version of the method exists, then it can be implemented as an additional function called ~.atomic(uint64_t& cur_data, uint64_t key, uint64_t new_data)~, which is only called iff atomics can be called safely (mostly in synchronized growing variants ~usGrow~ and ~psGrow~). Presensce of the atomic variant is detected automatically. An example for a fully implemented ~UpdateFunction~ might look like this:

+BEGIN_SRC cpp

struct Increment { uint64_t operator()(uint64_t& lhs, const uint64_t, const uint64_t rhs) const { return lhs += rhs; }

// an atomic implementation can improve the performance of updates in .sGrow
// this will be detected automatically
uint64_t atomic    (uint64_t& lhs, const uint64_t, const uint64_t rhs) const
{
    __sync_fetch_and_add(&lhs, rhs);
}

};

+END_SRC

All our data structures and connected classes are declared in the ~growt~ namespace.

** Non-growing hash tables

** Growing hash tables Our growing variants use the above non-growing tables. They grow by migrating the entire hash table once it gets too full for the current size. Migration is done in the background without the user knowing about it. During the migration hash table accesses may be delayed until the table is migrated (usually the waiting thread will help with the migration).

Threads can only access our growing hash tables by creating a thread specific handle. These handles cannot be shared between threads.

** Our tests and Benchmarks All generated tests (~make~ recipes) have the same name structure.

~__~ => e.g. ~ins_full_uaGrow~

*** test_abbrv

*** full list of hash tables Some of the following tables have to be activated through cmake options.

Note: in the paper we have some additional third party hash tables. These depend on some additional wrappers and are not reproduced here. Wrappers for their libraries can be found in a branch called legacy_wrappers.

** Including our project To make it easy, you can include the header ~data-structures/table_config.hpp~, which includes all necessary files and offers an interface to choose the right hash table for your workload. Additional hash table modifications can be found in ~data-structures/hash_table_mods.hpp~

+BEGIN_SRC cpp

include "data-structures/table_config.hpp"

using table_type = typename growt::table_config<key_type, mapped_type, hash_function, allocator_type, // any number of hash mods hmod::growable, hmod::deletions>::table_type;

+END_SRC

** About our utility functions The utility functions are now placed in their own submodule [[https://github.com/TooBiased/utils_tm][github repository]]

** We use the following libraries: *** for utility:

*** as third party hash tables (for benchmarks):

** Easy build without third party code

+BEGIN_SRC bash

git clone https://github.com/TooBiased/growt.git cd growt mkdir build cd build cmake .. make

+END_SRC

** Building with third party libraries Third party libraries are either installed using your package manager or they are downloaded into the ~misc/submodules~ folder.

+BEGIN_SRC bash

git clone https://github.com/TooBiased/growt.git cd growt git submodule init mkdir build cd build cmake -D GROWT_BUILD_ALL_THIRD_PARTIES=ON .. make

+END_SRC

note that folly needs quite a lot of extern libraries (zstd, glog, ...) those have to be installed, to compile any test using folly (checkout their github [[https://github.com/facebook/folly]]).