Closed maoif closed 9 months ago
The original intent when adding `cas!` was to expose the underlying machine operation, which means allowing spurious failure on architectures that have them. But I didn't document that choice clearly, and as the broken tests illustrate, it's easy to forget about the possibility of spurious failure.
It might be a good idea to make the `cas!` operations retry on some failures. The result of a `cas!` operation currently doesn't indicate whether the failure was due to a different value in place than the given one or whether it's due to a (possibly spurious) failure of atomicity. The machine code can make that distinction and retry only in the latter case. Making that distinction within `ftype-lock!` would make it different than `ftype-spin-lock!`, still.
I've never tried this change, though, because I'm not sure it's a good idea. An atomicity failure isn't necessarily spurious, and automatically retrying might cause `cas!` to loop when it should just return failure. Looping might take some choice away from applications. Maybe it could even cause a program to get stuck when it doesn't have to; I'm not sure. C++ provides both strong and weak variants, which suggests that the strong operation can be sensible, but also that the weak operation can be useful. Meanwhile, C compiler intrinsics tend to offer only the weak version of this operation; that's an issue, because there are places where those get used within the kernel or in the Chez Scheme C API. See "mkheader.ss" and `_InterlockedExchange64` for one example. (I cannot tell whether `_InterlockedExchange64` specifically is meant to allow spurious failure, though.)
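
For illustration only, a strong-style operation can be approximated in user code on top of the weak `box-cas!`. This is a minimal sketch, not a proposed change to the primitives: `box-cas!/retry` is a hypothetical name, and because the result of `box-cas!` doesn't say why it failed, the sketch re-reads the box and retries only while the contents are still `eq?` to the expected value.

```scheme
;; Hypothetical helper, not part of Chez Scheme.
;; Returns #t if the swap happened, #f if the box held a different value.
(define (box-cas!/retry bx old new)
  (let loop ()
    (cond
      [(box-cas! bx old new) #t]
      ;; Failed although the box still holds `old`: treat the failure as
      ;; (possibly) spurious and try again.
      [(eq? (unbox bx) old) (loop)]
      ;; The box holds something else: report a genuine failure.
      [else #f])))

(define bx (box 1))
(box-cas!/retry bx 1 4) ; the box held 1, so this retries until the swap succeeds
(box-cas!/retry bx 1 5) ; the box now holds 4, so this returns #f without looping
```

Under contention the loop can keep spinning for as long as other threads interfere, which is exactly the loss of choice for applications that the paragraph above worries about.
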
Either the docs and tests need to be fixed, or the implementation of the `cas!` and `ftype-lock!` operations needs to change, but I'm not sure which is the best path. I lean toward fixing the documentation and tests, because I think these operations are so low-level and rarely needed that it isn't worth the effort to provide easier-to-use variants.
I investigated the errors related to `{vector, box}-cas!` and `ftype-lock!` in the loongarch and riscv backends, and found that they also exist in Racket's arm64 backend. I ran the following code:
The session on loongarch is something like:
If we decrease the number of times, we may succeed:
The situation is similar on the other two architectures.
The generated machine code under optimization level 3 for `(box-cas! bx 1 4)` on riscv is:
Offsets are in bytes and all instructions are 4 bytes wide.
The code is run in a single thread, so how can it fail? I checked some sites (1, 2 and 3), and found that the above phenomenon is called the "spurious failure" of weak compare-and-swap, which is how our `cas!` is implemented. So even if we run the code in a single thread, the given value is equal to the boxed value, and no other core has touched the addresses reserved by the load-reserved, the paired store-conditional can still fail.
For weak CAS, if the store-conditional fails, it just quits and sets the status flag. In strong CAS, if the store-conditional fails, it loops back to the load-reserved/linked and retries until the store-conditional succeeds. On x86, CAS is implemented using a single `locked-cmpxchg`, hence no spurious failure.
In the mats tests 5_6.ms line 1268 and 5_8.ms line 41, the two tests fail/succeed randomly. I suppose the tests assume that `cas!` is strong, but the implementation is weak (also in arm32, powerpc and Racket's arm64).
In addition to `{vector, box}-cas!`, the tests at ftype.ms line 2790 and 2796 also assume that `ftype-lock!` acquires the lock "strongly", but in fact the implementation (using store-conditional without a retry loop) is "weak". Yet if we change the `asm-lock!` implementation to contain a retry loop, we are inventing another `ftype-spin-lock!`.
So maybe we should 1) make the CAS assembler primitives strong by adding loops, and 2) assume that `ftype-lock!` is weak and modify the tests.
Any other nice ideas?
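
As a sketch of what option 2) might look like in a test, under stated assumptions: the snippet below is not taken from ftype.ms. It acquires the lock with `ftype-spin-lock!`, which retries until it succeeds, and asserts on `ftype-lock!` only in the direction that also holds for a weak implementation, namely that it must return `#f` while the lock is held.

```scheme
;; Hypothetical test sketch, not the actual mats test.
;; Allocate one word of foreign memory to serve as the lock.
(define lock
  (make-ftype-pointer uptr
    (foreign-alloc (ftype-sizeof uptr))))
(ftype-init-lock! uptr () lock)

;; Do not assert that (ftype-lock! ...) returns #t on a free lock:
;; a weak implementation may fail spuriously. Spin instead.
(ftype-spin-lock! uptr () lock)

;; While the lock is held, ftype-lock! must report failure,
;; whether or not the store-conditional itself would have succeeded.
(assert (not (ftype-lock! uptr () lock)))

(ftype-unlock! uptr () lock)
```

The memory is deliberately left allocated here; a real test would release it with `foreign-free` once the lock is no longer used.
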