concurrencykit / ck

Concurrency primitives, safe memory reclamation mechanisms and non-blocking (including lock-free) data structures designed to aid in the research, design and implementation of high performance concurrent systems developed in C99+.
http://concurrencykit.org/
Other
2.34k stars 312 forks source link

patch to ck_sequence_write_begin & ck_sequence_write_end broke my code #213

Closed ajschorr closed 9 months ago

ajschorr commented 9 months ago

Hi,

I have been using ck 0.4.5 for ages. I decided to try updating to 0.7.0, and my code is now broken. I have traced the problem to commit 3f48bc39fb128e2cda57920b86082719d3b8d004:

commit 3f48bc39fb128e2cda57920b86082719d3b8d004 Author: Emilio G. Cota cota@braap.org Date: Wed Jul 1 17:55:59 2015 -0400

ck_sequence: relax sequence increment from atomic to regular store

The atomicity of the sequence number's increment is unnecessary, since
there should be only one writer at any given time. Fix it by changing
it for a regular increment + store.

Signed-off-by: Emilio G. Cota <cota@braap.org>

diff --git a/include/ck_sequence.h b/include/ck_sequence.h index 46ceb3b..2aa0184 100644 --- a/include/ck_sequence.h +++ b/include/ck_sequence.h @@ -101,7 +101,7 @@ ck_sequence_write_begin(struct ck_sequence *sq)

It's not obvious to me yet why this breaks my code, but it does. If I retain the 0.7.0 ck distribution but replace only ck_sequence.h with the old version from 0.4.5, then my code works properly again.

Regards, Andy

ajschorr commented 9 months ago

To be honest, I suspect that this patch is fine and that there's a weird race condition in my code that this change somehow exposes. I need to work on a test case.

pkhuong commented 9 months ago

Right, the commit definitely still looks good to me. I can think of a few ways your program could be broken:

  1. There are actual concurrent writes to the sequence lock. In that case, the program was always broken, but it's more obvious now. We can easily add a spinlocked version of ck_sequence_write_begin.
  2. The atomic increment's implicit fence prevented loads from being executed before the sequence write block. Without that implicit fence, the program's lack of store/load fencing becomes evident (I'm assuming a TSO memory model, relaxed memory models can probably break in many more ways)
  3. The writes under the sequence lock didn't actually need the sequence lock logic (i.e., stores and loads can't tear), so the new code does highlight incorrect sequence lock usage, but one that's kind of irrelevant because the seqlock isn't useful and should simply be removed or replaced with a version counter.
ajschorr commented 9 months ago

Thanks for your thoughts. I think it's likely the 2nd issue you pointed out where the loss of an implicit fence is revealing a problem. Just trying to figure out the proper fix, although using the atomic increment does seem to fix the problem without much if any loss of performance. This is on x86_64.

ajschorr commented 9 months ago

So this is a fairly ignorant question, but was that atomic increment giving me the equivalent of a strict memory fence (ck_pr_fence_strict_memory)? It seems like adding one solves my issue, but I'm basically just stabbing in the dark, since I don't usually do this type of coding.

ajschorr commented 9 months ago

In any case, this does seem to be a problem elsewhere in my code. I can fix it with a call to ck_pr_fence_memory, but oddly the performance hit is pretty bad, whereas the old atomic increment solved the problem with negligible performance impact. But I guess it wasn't quite right. Sigh. Anyway, I'm closing this.