baotonglu / dash

Scalable Hashing on Persistent Memory
MIT License
187 stars 26 forks source link

Question about the concurrency of Directory update #5

Closed shiniaas closed 3 years ago

shiniaas commented 4 years ago

In ex_finger.h line 1948 to 1972

    old_sa = dir;
    dir_entry = old_sa->_;
    x = (key_hash >> (8 * sizeof(key_hash) - old_sa->global_depth));
    if (target->local_depth < old_sa->global_depth) {
      if (Test_Directory_Lock_Set()) {
        goto REINSERT;
      } // Thread 1 here hangs up
      ADD(&old_sa->counter, 1);
      Directory_Update(old_sa, x, new_b, target);
      SUB(&old_sa->counter, 1);
      /*after the update, need to recheck*/
      if (Test_Directory_Lock_Set() || old_sa->version != dir->version) {
        goto REINSERT;
      }
    } else {
      Lock_Directory();
      if (old_sa->version != dir->version) {
        Unlock_Directory();
        goto REINSERT;
      }
      while (old_sa->counter != 0) // thread 2 passes this before the counter increment
        ;
      Directory_Doubling(x, new_b, target); // now thread 1 in Directory_Update and thread 2 in Directory_Doubling
      Unlock_Directory();
    }

I think the counter of dir is something like a RWLock, but it may not run as expected. Consider a situation: Thread 1 Test_Directory_Lock_Set pass but hang up before ADD(&old_sa->counter, 1); Thread 2 Lock_Directory and test old_sa->counter is zero, so it begins Directory_Doubling. Then thread 1 and thread 2 may operate together on the dir.

shiniaas commented 4 years ago

Maybe it doesn't matter? Since thread 1 will go to retry.

baotonglu commented 3 years ago

Hi,

Sorry for the late reply. I don't think it has concurrency bugs but I agree it is hard to understand. Thus now I merge the counter with the directory lock as one RWLock, which simplifies the concurrency logic. Could you take a look?

Best, Baotong

shiniaas commented 3 years ago

Thanks for your reply. What I want to know is whether Directory_Update and Lock_Directory can run concurrently which may happen in such a situation described at the beginning. (https://github.com/baotonglu/dash/issues/5#issue-714591513). It seems in current implementation this situation won't happen.

baotonglu commented 3 years ago

In previous implementation, your proposed situation could happen but does not influence the correctness since thread 1 will retry. In current implementation, this situation won't happen.

shiniaas commented 3 years ago

Thanks!