attractivechaos / khashl

Generic hash table library in C
19 stars 0 forks source link

kh_del does not empty the hash #2

Open slw287r opened 4 months ago

slw287r commented 4 months ago

Hi, Heng

I encounter the following issue when delete elements of a hash, I tried to decrease the iterator and it did the trick, just wonder if it is the correct way to go.

Thanks, Richard

KHASHL_MAP_INIT(KH_LOCAL, kh_t, kh, uint32_t, int, kh_hash_uint32, kh_eq_generic)

int main(void) { int i, absent; khint_t k; kh_t h = kh_init(); for (i = 0; i < 10; ++i) { k = kh_put(h, i, &absent); // get iterator to the new bucket kh_val(h, k) = i i; // set value } kh_foreach(h, k) printf("h[%u]=%d\n", kh_key(h, k), kh_val(h, k)); kh_foreach(h, k) kh_del(h, k); puts("-----------------------"); kh_foreach(h, k) printf("h[%u]=%d\n", kh_key(h, k), kh_val(h, k)); kh_destroy(h); // free return 0; }


* Output

h[4]=16 h[1]=1 h[3]=9 h[6]=36 h[7]=49 h[8]=64 h[0]=0 h[9]=81 h[2]=4 h[5]=25

h[6]=36 h[8]=64


* Decrease `k` after `kh_del` makes the hash empty:

include

include

include "khashl.h"

KHASHL_MAP_INIT(KH_LOCAL, kh_t, kh, uint32_t, int, kh_hash_uint32, kh_eq_generic)

int main(void) { int i, absent; khint_t k; kh_t h = kh_init(); for (i = 0; i < 10; ++i) { k = kh_put(h, i, &absent); // get iterator to the new bucket kh_val(h, k) = i i; // set value } kh_foreach(h, k) printf("h[%u]=%d\n", kh_key(h, k), kh_val(h, k)); kh_foreach(h, k) kh_del(h, k--); puts("-----------------------"); kh_foreach(h, k) printf("h[%u]=%d\n", kh_key(h, k), kh_val(h, k)); kh_destroy(h); // free return 0; }

h[4]=16 h[1]=1 h[3]=9 h[6]=36 h[7]=49 h[8]=64 h[0]=0 h[9]=81 h[2]=4 h[5]=25

attractivechaos commented 4 months ago

Unlike khash, khashl moves contents when you delete keys. As a result, you can't delete all keys in a foreach loop. You can only use foreach to free memory allocated to each bucket.

Decreasing k is an interesting trick I haven't thought of. For now, I believe this always works but it needs more testing. And even if this works, I would not recommend to ordinary users.

slw287r commented 4 months ago

Thanks for the clarification.

JacksonAllan commented 3 months ago

Have a look at I handle deletion during iteration here. Assuming I understand how khashl does back-shift deletion, then we can't just blindly decrease the iterator by one such that it points to the previous bucket because the key moved to the vacated bucket may have been moved from an earlier location in the buckets array (this could occur when a key is deleted near the end of the buckets array). In that case, the moved key will be hit twice during the iteration. Instead, only decrease the iterator if a key was moved from later in the buckets array.

If you want to add deletion-during-iteration functionality to the library, then in terms of API, there are two approaches: