Closed sdmaclea closed 3 years ago
I'll also ping the GC team separately to follow up on the question I raised re: how these APIs interact with the card table.
@GrabYourPitchforks These should be orthogonal to the write barriers required by the GC on heap allocations, heap reference writes, etc.
@sdmaclea For the most part I agree, but my concern was regarding the Exchange<T> and CompareExchange<T> APIs in particular. If the GC's going to force a particular memory ordering when updating the card table, then it could affect which memory orderings are valid for those two specific operations.
That may turn out to be an implementation detail, where the Pointer/Reference forms must use a more restrictive memory ordering but the user can request a weaker one. ExchangePtr and CompareExchangePtr are already handled differently, so your concern is very valid.
@stephentoub @CarolEidt Regarding the case where these methods are called with a non-immediate value for MemoryOrder, would it be feasible to say that they should just generate SequentiallyConsistent instead of switching on the non-immediate value at runtime? For example:
public static int Increment(ref int location, MemoryOrder mo) => Increment(ref location);

public void MyMethod(ref int foo)
{
    // Passing an immediate; the JIT can generate appropriate assembly.
    Interlocked.Increment(ref foo, MemoryOrder.Relaxed);

    // Passing a non-immediate; this is treated as a normal method call instead of an intrinsic,
    // so in the end it just turns into a standard Increment(ref int) call.
    MemoryOrder mo = (MemoryOrder)(new Random().Next());
    Interlocked.Increment(ref foo, mo);
}
I wonder if this would simplify the logic a bit. For reference, there was discussion of having a code analyzer that would flag non-immediates passed to some of the hardware intrinsic APIs. (See https://github.com/dotnet/coreclr/issues/15795#issuecomment-356431086.) That may be useful here as well.
Are we sure the enum members are sensibly named? The names do not make a lot of sense to us, but that might just be missing domain knowledge on our part. Also, @GrabYourPitchforks just noticed that Release is being deprecated. Before approving I'd like to get confirmation on the names.
@terrajobst I see no indication that release is deprecated. The https://en.cppreference.com/w/cpp/atomic/memory_order page could lead one to believe that, because in C++20 the enum defining the memory order is changed to an enum class plus constexpr constants.
@sdmaclea Got it, I misinterpreted the (until C++20) marker as applying specifically to memory_order_release instead of to the entire memory_order typedef. That's my mistake.
would it be feasible to say that they should just generate SequentiallyConsistent instead of switching on the non-immediate value at runtime? ... I wonder if this would simplify the logic a bit.
I don't see how this would simplify the logic. The existing intrinsic support in the JIT makes heavy use of the "non-immediate value falls back to the recursive case, which the JIT expands" approach. It works and is relatively straightforward.
Are we sure the enum members are sensibly named? The names do not make a lot of sense to us
They make sense to me - and I believe they are consistent with general usage, not just in C++, but in memory model discussions.
I don't see how this would simplify the logic. The existing intrinsic support in the JIT makes heavy use of the "non-immediate value falls back to the recursive case, which the JIT expands" approach. It works and is relatively straightforward.
What I meant is that the implementation could look like this:
[Intrinsic]
public static int Method(..., MemoryOrder order) => Method(..., MemoryOrder.SequentiallyConsistent);
You'd still have the recursive call, but this basically turns into "if the JIT can't determine the literal value of the MemoryOrder parameter, it says screw it and treats the call site as if it were sequentially consistent." That avoids us having to write a switch statement inside the method implementations.
I do not think MemoryOrder.Consume should necessarily be included; it does not seem to be well-defined in C++, and the standard recommends not using it:
memory_order::consume: a load operation performs a consume operation on the affected memory location. [Note: Prefer memory_order::acquire, which provides stronger guarantees than memory_order::consume. Implementations have found it infeasible to provide performance better than that of memory_order::acquire. Specification revisions are under consideration. —end note]
The C++ standard committee have also had problems defining memory_order::relaxed. Hans Boehm talks about both memory_order::relaxed and memory_order::consume here (he calls memory_order::consume a failed experiment): https://www.youtube.com/watch?v=M15UKpNlpeM&feature=youtu.be&t=1283
While trying to stabilize the thread pool for linux-arm64 during the release 2.1 effort, it became apparent that the safest thing would be to assume that existing code expected an interlocked operation to guarantee a barrier enforcing sequential consistency, at least with respect to the operations before and after the interlocked operation.
While this approach is likely to guarantee functional correctness in most legacy code, it comes at a significant cost on weakly ordered machines. It is also actually rare that an Interlocked operation genuinely needs to guarantee sequential consistency.
This proposal adds a MemoryOrder parameter to each atomic Interlocked operation. The proposal currently does not show the MemoryOrder parameter with a default (MemoryOrder memoryOrder = MemoryOrder.SequentiallyConsistent) because those APIs already exist and can be presumed to continue to exist in order to support .NET Standard 2.1 and earlier.