Closed alphaRGB closed 2 years ago
Hi @alphaRGB, Bucketed cuckoo hash table (BCHT) insertion and find operations are performed in a tile-wide fashion. The two main goals are 1) achieving a coalesced memory access when reading the bucket, 2) avoiding branch divergence between threads. The code snippet that you showed builds a queue (within a tile) of find operations then performs each operation one at a time.
auto work_queue = tile.ballot(do_op); // build a queue bitmap of all threads where `do_op = true`
while (work_queue) { // while there are items in the queue
auto cur_rank = __ffs(work_queue) - 1; // find the first thread rank in the queue
auto cur_key = tile.shfl(find_key, cur_rank); // all tiled bucketed hash table operations expect the same key for all threads in a tile
typename HashMap::mapped_type find_result = map.find(cur_key, tile); // every thread in the tile calls find
if (tile.thread_rank() == cur_rank) {
result = find_result; // thread (in the tile) result
do_op = false; // pop operation
}
work_queue = tile.ballot(do_op); // rebuild the bitmap queue
}
Here is where we read a bucket from memory and where the coalesced memory access is ( the tile size is the same as bucket_size
).
The same idea is used for all bucketed hash tables. find_kernel
and insert_kernel
are only used by 1CHT (cuckoo hash table with a bucket size of one). Coalescing memory accesses is one of the main reasons why BCHT is much faster than 1CHT.
Below is a picture that also explains the code snippet for insertion (same for find) (see slides 36-42 for explanation with pictures):
Let me know if you have more questions!
Hello, I have found that there are two kinds find kernels:
this Kernel
find_kernel
is quit easy to understand, buttiled_find_kernel
seams little hard to understand. The bucketed cuckoo hash usetiled_find_kernel
, I am fused why BCHT needs tile, what's the meaning of these code oftiled_find_kernel
? Could you please help me explain the meaning of it? Thank you in advance