efficient / libcuckoo

A high-performance, concurrent hash table

Who's freeing the locks? (Potential race condition) #107

Closed X-Ryl669 closed 5 years ago

X-Ryl669 commented 6 years ago

Let's say you want to insert into a hash map that's almost full.

  1. insert_loop fails to insert with table_full and calls cuckoo_fast_double<TABLE_MODE, automatic_resize>(hp) on line 1086.
  2. cuckoo_fast_double succeeds and calls maybe_resize_locks on line 1598.
  3. maybe_resize_locks creates a new lock array, locks every lock in it, appends it to the tail of the locks list, and returns (line 1701).
  4. cuckoo_fast_double exits, and the all_locks_manager destructor releases all the locks from the previous lock array (yet the new lock array is still completely locked).
  5. insert_loop retries on line 1087 and tries to take some locks from the new array. Since they are all locked and no one has released them, how could that work? (See the sketch after this list.)
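
To make the failure concrete, here's a minimal toy model of those steps; the types and the linked-list layout are my simplifications, not libcuckoo's actual code:

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Toy stand-ins for the real structures; lock arrays form a linked list.
struct LockArray {
  std::vector<std::mutex> locks;
  std::shared_ptr<LockArray> next;
  explicit LockArray(std::size_t n) : locks(n) {}
};

// Step 3: the bigger array is created with every slot already held, then
// appended to the tail of the list.
std::shared_ptr<LockArray> maybe_resize_locks(std::shared_ptr<LockArray> tail,
                                              std::size_t new_size) {
  auto fresh = std::make_shared<LockArray>(new_size);
  for (auto &m : fresh->locks) m.lock();
  tail->next = fresh;
  return fresh;
}

int main() {
  auto old_arr = std::make_shared<LockArray>(4);
  for (auto &m : old_arr->locks) m.lock();      // resize holds all old locks
  auto fresh = maybe_resize_locks(old_arr, 8);  // step 3
  for (auto &m : old_arr->locks) m.unlock();    // step 4: the manager only
                                                // releases the old array
  // Step 5: every lock in fresh is still held; a retrying insert_loop
  // would block forever on its first acquisition in the new array.
  for (auto &m : fresh->locks) m.unlock();  // keep the sketch well-defined
}
```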

I've seen the snapshot_and_lock_all at line 975, but as far as I understand, because it's called before the new lock array is created, the part that iterates to the next lock array runs before it can find the new array. So, as I understand it, when the AllLocksManager is destructed, it unlocks all past and future lock arrays (see the sketch below). Because of this, two threads that each created an AllLocksManager could interfere, because there is no ordering guarantee that thread 1's write of the new lock array pointer is visible to thread 2. Worse, the order of actions seems wrong...
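
A sketch of that destructor-style release loop, again with simplified stand-in types rather than the real code:

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

struct LockArray {
  std::vector<std::mutex> locks;
  std::shared_ptr<LockArray> next;
  explicit LockArray(std::size_t n) : locks(n) {}
};

// Release starts from the array snapshotted at lock time and follows next
// pointers to the current tail, so arrays appended *after* the snapshot
// get unlocked too.
void release_from(std::shared_ptr<LockArray> first) {
  for (auto node = first; node; node = node->next)
    for (auto &m : node->locks) m.unlock();
}

int main() {
  auto a = std::make_shared<LockArray>(4);
  for (auto &m : a->locks) m.lock();   // snapshot_and_lock_all saw only a
  auto b = std::make_shared<LockArray>(8);
  for (auto &m : b->locks) m.lock();   // a resize appends b, fully locked
  a->next = b;
  release_from(a);  // unlocks a *and* b, though b's locks were never ours
}
```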

For example, consider this case:

  1. Thread 1 takes all locks (from 0 to N).
  2. Thread 2 wants to resize the table, so it waits for any of the locks taken by thread 1.
  3. Thread 1 starts to release all locks (but is preempted at line 839).
  4. Thread 2 can now lock all locks and actually resize the table, creating a new lock array that is fully locked.
  5. Thread 2 starts to release all locks from the former array, but is preempted at line 1601.
  6. Thread 1 continues iterating, sees the new lock array in AllLocker's for loop, and happily starts to release the locks in this new array.
  7. Any other thread can now start using locks in the new array while the buckets are not set up yet, since thread 2 still hasn't executed the buckets' swap at line 1601 (at least it's not visible to them yet). This means they can take new locks for buckets that were present in the old array, but as soon as the swap is performed, there might no longer be any (valid) bucket at that position.
  8. The universe collapses. (Steps 4-7 are sketched in code after this list.)
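
Here are steps 4-7 as a tiny compilable sketch. I'm modeling a lock as an atomic_flag spinlock (as far as I can tell libcuckoo's locks are spinlocks too), since that's what makes "thread 1 releases a lock that thread 2 set" expressible at all; every name here is mine, not libcuckoo's:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct SpinLock {
  std::atomic_flag f = ATOMIC_FLAG_INIT;
  bool try_lock() { return !f.test_and_set(std::memory_order_acquire); }
  void unlock()   { f.clear(std::memory_order_release); }
};

SpinLock new_array_lock;           // stands in for the whole new lock array
std::atomic<bool> swapped{false};  // has the buckets' swap happened yet?

int main() {
  // Step 4: thread 2 creates the new lock array with every slot held,
  // then is preempted before the buckets' swap (step 5).
  new_array_lock.try_lock();

  std::thread t1([] {
    // Step 6: thread 1's release loop reaches the new array and unlocks
    // it, even though thread 2 set these locks.
    new_array_lock.unlock();
  });
  t1.join();

  std::thread t3([] {
    // Step 7: any other thread can now take a new-array lock while the
    // buckets' swap is still pending.
    bool got = new_array_lock.try_lock();
    assert(got && !swapped.load());
    (void)got;
    new_array_lock.unlock();
  });
  t3.join();

  swapped.store(true);  // step 5 finally completes, too late
}
```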

IMHO, the AllLocker is a mistake, since it does not guarantee that it unlocks only the locks that were locked beforehand. It should remember which lock array was in use when the AllLocksManager was created. maybe_resize_locks should return an AllLocksManager for the new lock array, so that the resize method holds both arrays' locks and releases them in this order: new, then old. Once the swap is done, the new array's locks are released first (no thread can still be using the old array, since it is still locked, but they can start using the new array right away), then the old array's.
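
A rough sketch of what I mean, with invented names rather than a patch against the real code:

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

struct LockArray {
  std::vector<std::mutex> locks;
  explicit LockArray(std::size_t n) : locks(n) {}
};

// The manager remembers exactly which array it locked and unlocks only
// that array, so a release can never spill into an array it never took.
class AllLocksManager {
  std::shared_ptr<LockArray> arr_;
public:
  explicit AllLocksManager(std::shared_ptr<LockArray> a) : arr_(std::move(a)) {
    for (auto &m : arr_->locks) m.lock();
  }
  ~AllLocksManager() {
    for (auto &m : arr_->locks) m.unlock();
  }
};

void resize(std::shared_ptr<LockArray> &live) {
  AllLocksManager old_guard(live);  // old array held for the whole resize
  auto fresh = std::make_shared<LockArray>(live->locks.size() * 2);
  {
    AllLocksManager new_guard(fresh);  // what maybe_resize_locks would return
    live = fresh;                      // the swap, done under both guards
  }  // new array released first: threads may start using it immediately
}    // old array released last: nobody can resume on the stale state

int main() {
  auto live = std::make_shared<LockArray>(4);
  resize(live);
}
```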

manugoyal-nuro commented 6 years ago

Hey there! Sorry for the delay, just getting to this. I'm reading through your counterexample, and the one thing I'm not sure about is around steps 3 and 4.

It is possible that while thread 1 is unlocking all of its locks, thread 2 can start taking them, which means that once thread 1 has completely unlocked the old lock array (but hasn't finished unlocking the new locks), thread 2 can finish taking all of the locks on the old lock array.

But I'm not sure thread 2 would then proceed to actually resize the table. Because if thread 1 has successfully completed a resize, then thread 2 should run check_resize_validity, and see that its hashpowers are out of date and abort the resize. It is not possible to shrink the table with cuckoo_fast_double, so it should always be the case that if thread 1 completes the resize, the hashpowers for any concurrently-executing resize will be out-of-date.
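
Roughly, the check works like this (a simplified sketch of the idea, not the actual libcuckoo code):

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// A resize records the hashpower it intends to grow *from* and re-checks
// it after acquiring the locks. Since cuckoo_fast_double only grows the
// table, any completed concurrent resize makes that snapshot stale.
std::atomic<std::size_t> hashpower{10};

bool try_resize(std::size_t expected_hp) {
  // ...all locks acquired here...
  if (hashpower.load() != expected_hp)
    return false;                    // someone else resized first: abort
  hashpower.store(expected_hp + 1);  // ...migrate buckets, etc...
  return true;
}

int main() {
  std::size_t hp = hashpower.load();
  bool first = try_resize(hp);   // 10 -> 11
  bool second = try_resize(hp);  // snapshot of 10 is now stale: aborted
  assert(first && !second);
  (void)first; (void)second;
}
```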

Furthermore, I think all updates made by thread 1 should become visible to thread 2 as soon as thread 2 takes any of the locks, because releasing and acquiring the locks will issue a memory barrier that should ensure that all of thread 1's updates are seen by thread 2.
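
Here's a minimal illustration of that argument, with a bare atomic_flag spinlock standing in for the table's locks (a sketch of the general release/acquire guarantee, not libcuckoo's actual lock type):

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Everything thread 1 writes before its release is visible to thread 2
// after thread 2's acquire of the same lock.
std::atomic_flag lock_flag = ATOMIC_FLAG_INIT;
int shared_data = 0;  // plain, non-atomic shared state

void lock()   { while (lock_flag.test_and_set(std::memory_order_acquire)) {} }
void unlock() { lock_flag.clear(std::memory_order_release); }

int main() {
  lock();  // held up front, so thread 2 cannot enter before thread 1's release

  std::thread t1([] {
    shared_data = 42;  // e.g. installing the new lock-array pointer
    unlock();          // release: publishes every write sequenced before it
  });

  std::thread t2([] {
    lock();  // acquire: synchronizes-with t1's release
    assert(shared_data == 42);  // guaranteed visible; this is not a race
    unlock();
  });

  t1.join();
  t2.join();
}
```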

I appreciate your thinking critically about the concurrency story here. It's definitely possible I'm still missing something, so please let me know if my explanation makes sense!

-Manu

manugoyal commented 5 years ago

Closing for now. Please feel free to reopen if you have more questions!