crossbeam-rs / rfcs

RFCs for changes to Crossbeam
Apache License 2.0
146 stars 13 forks source link

Explaining why Crossbeam is correct w.r.t. C11 memory model #6

Closed jeehoonkang closed 7 years ago

jeehoonkang commented 7 years ago

Here is the promised RFC on the concurrency memory model and correctness of Crossbeam.

Rendered

jeehoonkang commented 7 years ago

I addressed @stjepang 's comments. Thank you!

Argh, I omitted this comment:

Also, the proof could benefit from a drawing of some kind of timeline of atomic fences/operations (ASCII art would be just fine), since there's a lot of stuff going on.

I will do it this week.

jeehoonkang commented 7 years ago

I added an ascii art on an example timeline. Thank you!

arthurprs commented 7 years ago

Nice work! I'm just coming back from vacations and I plan to review this soon.

ghost commented 7 years ago

I gave the text another, more detailed read. As far as I'm concerned, it's almost good to go.

Let's just add another subsection to Alternatives and mention that it's possible to eliminate the SeqCst fence in unlink() by changing the invariant from G >= X+2 to G >= X+3.

schets commented 7 years ago

Let's just add another subsection to Alternatives and mention that it's possible to eliminate the SeqCst fence in unlink() by changing the invariant from G >= X+2 to G >= X+3.

I haven't had time to read this in detail yet, but if this statement is true, we should almost certainly do this. On x86, this is the only type of fence that results in a hardware fence and it's a performance disaster, and on ArmV8 and Power this fence is more expensive than acquire/release fences.

arthurprs commented 7 years ago

it's possible to eliminate the SeqCst fence in unlink() by changing the invariant from G >= X+2 to G >= X+3.

Very interesting. Did you toyed with that yet? This potentially makes pinning much cheaper on non x86 (as x86 can use a lock CAS).

ghost commented 7 years ago

Very interesting. Did you toyed with that yet? This potentially makes pinning much cheaper on non x86 (as x86 can use a lock CAS).

Did you mean unlinking instead of pinning?

I didn't toy with this much, but: in Coco garbage is tagged with the current epoch only when flushing thread-local bags into the global queue. The whole bag is tagged with the current epoch. Since tagging is so infrequent, its effect on performance is negligible.

In Crossbeam we might employ a slightly different strategy and tag garbage more often, but it's still probably unnecessary to tag absolutely every garbage object with the current epoch. For that reason eliminating the SeqCst fence possibly won't get us any noticeable performance wins...

arthurprs commented 7 years ago

Indeed, I got it all wrong. I need extra :coffee: before looking into this thread .

jeehoonkang commented 7 years ago

I mentioned @stjepang's proposal (to remove the SC fence from unlink()) in the "Unresolved Questions" section, because I am still not fully convinced that the proposal is correct. (I agree with @schets on the performance of the proposal, even considering the fact that unlink()'s SC fences are amortized.) I suggest we continue the discussion of https://github.com/crossbeam-rs/rfcs/pull/6#discussion_r130942541 in another RFC.

jeehoonkang commented 7 years ago

Thank you for all the inputs so far. @schets @Vtec234 may I ask if you have further comments?

jeehoonkang commented 7 years ago

it would also be great to show why a pointer that is currently in use cannot be freed.

@Vtec234 Thank you for the suggestion. I slightly revised the text so that it clearly explains that all the accesses to obj should happen before its deallocation.

jeehoonkang commented 7 years ago

I am merging it based on 4 approvals :) Thank you for all the comments!