Open treeowl opened 4 years ago
A while ago, I wrote primitive-atomic, which just has wrapper for the atomic primops. It's not as sophisticated as atomic-primops
. The obvious lifted wrapper for casArray#
is just:
casArray :: PrimMonad m
=> MutableArray (PrimState m) a -- ^ prim array
-> Int -- ^ index
-> a -- ^ expected old value
-> a -- ^ new value
-> m (Bool,a)
{-# INLINE casArray #-}
casArray (MutableArray arr#) (I# i#) old new =
primitive $ \s0 -> case casArray# arr# i# old new s0 of
(# s1, n, r #) -> (# s1, (isTrue# (n ==# 0# ),r) #)
Although, as you point out, CAS on lifted objects is useless on types that have no notion of identity. If the elements in the array were MutVar#
or MutableByteArray#
though, then CAS is actually sound.
CAS primop casArray#
works well with boxed arrays, except for when compiled with -fhpc
ghc#18289
Here is a potential implementation:
atomicModifyArray# :: MonadPrim s m => MutableArray s e -> Int -> (e -> (# e, b #)) -> m b
atomicModifyArray# ma@(MutableArray ma#) i@(I# i#) f = do
current0 <- readArray ma i
primitive $
let go expected s =
case f expected of
(# new, artifact #) ->
case casArray# ma# i# expected new s of
(# s', 0#, _ #) -> (# s', artifact #)
(# s', _, actual #) -> go actual s'
in go current0
So, I wouldn't say it is useless.
CAS on lifted objects is not useless. It can, however, be subtle, as the evaluate
or lazy
in my sample code above hints at. The Ticket
approach of atomic-primops
doesn't seem terrible if it's done with a bit more (typically erased) boxing and (especially) by using lazy
instead of all that unsafe coercion nonsense. Something like
data Ticket a = Ticket a
peekTicket :: Ticket a -> a
peekTicket (Ticket a) = lazy a
peekTicket# :: Ticket a -> (# a #)
peekTicket# (Ticket a) = (# lazy a #)
CAS primop
casArray#
works well with boxed arrays, except for when compiled with-fhpc
ghc#18289Here is a potential implementation:
atomicModifyArray# :: MonadPrim s m => MutableArray s e -> Int -> (e -> (# e, b #)) -> m b atomicModifyArray# ma@(MutableArray ma#) i@(I# i#) f = do current0 <- readArray ma i primitive $ let go expected s = case f expected of (# new, artifact #) -> case casArray# ma# i# expected new s of (# s', 0#, _ #) -> (# s', artifact #) (# s', _, actual #) -> go actual s' in go current0
So, I wouldn't say it is useless.
Yes, that's an alternative implementation, similar to one in atomic-primops
. It's more efficient than the one I gave under low contention, but under high contention it allocates a bunch of thunks that get thrown away on CAS failure.
Yeah, I don't think atomic modification is a good use case with high contention, even atomicModifyIORef'
performs rather poorly when too many threads try to modify it at the same time. On the other hand with low contention atomicity guarantees are quite valuable, so it is a bit surprising that primitive
does not provide those out of the box.
@lehins, what goes wrong with atomicModifyIORef'
under contention? Are there a lot of CAS misses for some reason I can't think of? Or do thunks start to build up because multiple modifications succeed before the result of the first one gets forced?
@treeowl Remember this SO question, which I think is somehow related to this and also I've done some benchmarks at some point before and I found that with increasing number of threads performance drops pretty significantly. Not sure why that is as I did not investigate it much further. However I know I will come back to this at some point in the future because I have some interesting work-in-progress for an API that amongst other things provides a unified interface for atomic modification of various mutable types.
For some reason, this package doesn't offer basic access to atomic CAS and such. I see no good reason for this. The
atomic-primops
package has some good ideas for this, but its implementation is flaky under optimization and its author does not seem responsive to suggestions for fixing that.An interesting (?) thought: I think we can write a somewhat-inefficient
atomicModifyIORef
-like function for array elements using the same sort of implementation trickery but in user-land. Very roughly speaking: