The function you've described is not atomic. If that function exists in Haskell, it isn't atomic either, because MVar is described by put, take and read. And this makes it dangerous.
Wouldn't such a use-case be better for Ref?
Yes, it's not atomic in the presence of other producers (as stated in Haskell's documentation); however, it is widely used for synchronizing access to mutable structures. In particular, when multiple writers want to update a single variable they are only allowed to do so via modifyMVar; no standalone take and put operations are allowed.
A practical example can be found here. As you can see, all operations are synchronized using withMVar/modifyMVar, with no standalone take/put calls.
Right, but we have Ref, which is a powered-up version of atomicModifyIORef, and that's a better fit for that use case I think.
But Ref is not suitable for protecting mutable resources, while MVar allows it.
If you mean cases where you actually need a mutex, you can use Semaphore.withPermit, or build something with Ref + Deferred like this one:
https://github.com/ovotech/fs2-kafka/blob/master/src/main/scala/fs2/kafka/internal/Synchronized.scala
If you mean cases where you actually need a mutex, you can use Semaphore.withPermit
With a mutex I can build any synchronization primitive; by this logic I do not need MVar at all :) The idea was to express other primitives using MVars, because it is easy to do.
or build something with Ref + Deferred like this one
It's so complicated that I can't even understand how exactly it works. Using MVar it could be:
import cats.effect.{Bracket, Concurrent}
import cats.effect.concurrent.MVar
import cats.implicits._

def withMVar[F[_], A, B](mvar: MVar[F, A])(f: A => F[B])(implicit F: Bracket[F, Throwable]): F[B] =
  F.bracket(mvar.take)(f)(mvar.put)

object Synchronized {
  def of[F[_]: Concurrent, A](a: A): F[Synchronized[F, A]] =
    MVar.of[F, A](a).map { mv =>
      new Synchronized[F, A] {
        override def use[B](f: A => F[B]): F[B] = withMVar(mv)(f)
      }
    }
}
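For illustration, a hypothetical usage of that sketch (assuming the Synchronized trait from the linked file and cats-effect's IO; not library code) would look like:

import cats.effect.{ContextShift, IO}
import scala.concurrent.ExecutionContext

implicit val cs: ContextShift[IO] = IO.contextShift(ExecutionContext.global)

// every use call gets exclusive access to the protected value
val program: IO[Int] =
  Synchronized.of[IO, Int](0).flatMap { sync =>
    sync.use(n => IO(println(s"exclusive access to $n")).map(_ => n + 1))
  }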
by this logic I do not need MVar at all
In fact my opinion is exactly that you don't need MVar at all, and using Ref + Deferred is better (Semaphore itself is done with Ref + Deferred, for example).
It's so complicated that I can't even understand how exactly it works.
Ref + Deferred is a way simpler model than MVar, because there are fewer state transitions and fewer places you can block, therefore fewer places where you can deadlock, which is quite easy to do with MVar.
The idea was to express other primitives using MVars, because it is easy to do.
I quite strongly disagree here, and this is evident when you try to build actual complicated structures like Queues and Topics, which is why I introduced Ref + Promise in fs2 instead of going with MVar (they then made it into cats-effect, and Promise was renamed to Deferred).
Scalaz IO used to have MVar as well, and they also moved to a model based on Ref + Promise because it's way simpler. You just need to learn how to design things with it, just like you have to learn how to design things with MVar. I have done both, and that's why I don't like MVar anymore :)
(although obviously there are always going to be cases where one thing is better than the other)
It's very easy to get a deadlock with withMVar: withMVar(mvar)(_ => mvar.take). This mvar.take can be somewhere deep in your f.
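Spelled out, assuming the withMVar helper sketched above (illustrative only, not a cats-effect API):

import cats.effect.Concurrent
import cats.effect.concurrent.MVar
import cats.implicits._

def deadlocks[F[_]: Concurrent]: F[Unit] =
  MVar.of[F, Int](0).flatMap { mvar =>
    // withMVar has already emptied the MVar, so the inner take
    // waits (semantically) forever for a put that can never happen
    withMVar(mvar)(_ => mvar.take.void)
  }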
It's just very easy to get a deadlock with MVar in general:
An MVar is actually quite a complicated primitive, because it has transitions in both directions (empty to full and full to empty), an unlimited number of times, and with (semantic) blocking in both directions.
If you consider that most primitives require more than one MVar, you quickly need to keep in mind a pretty big state space of potentially blocking transitions. This is made even more complex by the fact that some things that in theory don't require waiting, like concurrent mutation, are done via semantically blocking primitives in MVar, case in point modifyMVar done with put and take.
At the heart of the issue is the fact that MVar conflates two different things: concurrent mutation and synchronisation.
The Ref + Deferred model instead is very orthogonal: you have the simplest primitive possible for concurrent mutation (Ref) and the simplest possible primitive for synchronisation (Deferred).
Complex behaviour is then built compositionally using those two, and the state space is reduced: Deferred only has one state transition (empty to full, and only once), and one place you can block (get on empty). If you need more transitions, you manage them compositionally using Ref, which offers a simpler model for concurrent mutation: atomic modify with no waiting.
So basically everything becomes Ref[F, DataStructureWithDeferreds], which gets a lot more manageable, both in theory (given the argument about orthogonality I just gave) and in practice (given in-depth anecdotal experience designing and implementing primitives with both in fs2 and Haskell).
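To make that concrete, here is a minimal illustrative sketch of the pattern: a toy lock whose whole state lives in a single Ref and whose only blocking point is a Deferred.get (an assumption for illustration, not the actual Semaphore or Synchronized implementation; cancelation of waiters is ignored):

import cats.effect.Concurrent
import cats.effect.concurrent.{Deferred, Ref}
import cats.implicits._

object SimpleLock {
  // the whole lock state lives in one Ref: locked or not, plus the waiting gates
  final case class State[F[_]](locked: Boolean, waiting: List[Deferred[F, Unit]])

  def of[F[_]](implicit F: Concurrent[F]): F[Ref[F, State[F]]] =
    Ref.of[F, State[F]](State(locked = false, waiting = Nil))

  def acquire[F[_]](ref: Ref[F, State[F]])(implicit F: Concurrent[F]): F[Unit] =
    Deferred[F, Unit].flatMap { gate =>
      ref.modify {
        case State(false, waiting) => (State(true, waiting), F.unit)          // free: grab it
        case State(true, waiting)  => (State(true, waiting :+ gate), gate.get) // taken: enqueue and wait
      }.flatten // gate.get is the only place we (semantically) block
    }

  def release[F[_]](ref: Ref[F, State[F]])(implicit F: Concurrent[F]): F[Unit] =
    ref.modify {
      case State(_, next :: rest) => (State(true, rest), next.complete(())) // hand the lock to the next waiter
      case State(_, Nil)          => (State(false, Nil), F.unit)            // nobody waiting: unlock
    }.flatten
}

acquire either grabs the lock immediately or enqueues a gate and waits on it; release hands the lock over by completing the next gate.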
@SystemFw So, is your point not to introduce additional utility functions, to discourage people from using MVars? (btw thanks for the Ref + Deferred pattern, I've finally got it)
@wedens MVars are not deadlock-free by design. One can easily deadlock with mvar.take >>= (_ => mvar.take) :)
Another thing about your Synchronisation implementation with MVar compared to the one I linked you to is the behaviour wrt cancelation.
In the MVar version you cannot try to acquire the Synchronisation with a timeout (because bracket's acquire is uncancellable, and you take in there), whereas the Ref + Deferred version allows you to do that.
Semaphore had the same problem, and it required an internal fix.
is your point not to introduce additional utility functions, to discourage people from using MVars
No, I'm not that strongly against them (even though fs2 doesn't use them and I wouldn't use them), but if the utility functions actually encourage problematic behaviour (like they do in this case) and we have alternatives (like we do in this case), then I'd rather not have them, so that I don't have to explain to users not to use them. Other people might have a different opinion, which is why MVar is in cats-effect in the first place.
But basically I'm just answering the question:
So, why not in the API, is it an intentional decision?
to which the answer is yes :)
and we have alternatives (like we do in this case)
So, what about bringing these alternatives into the library? Something like the mentioned Synchronized with use[B](f: A => F[B]): F[B] implemented with Ref + Deferred? Or even modify[B](f: A => F[(A, B)]): F[B]?
Because the first thing a newcomer will look at if they need a "functional mutex" is MVar, because many people know about it from Haskell. Not finding bracketed functions there, they will probably reimplement them or ask here as I did.
Something like the mentioned Synchronized with
Well, but in this case you can just do Semaphore.withPermit
Btw, I'm looking at the given function, and I was initially saying that it isn't atomic:
def modifyMVar_[F[_], A](mv: MVar[F, A])(f: A => F[A])(implicit F: Bracket[F, Throwable]): F[Unit] =
  F.bracketCase(mv.take)(a => f(a) >>= mv.put)((prev, e) => e match {
    case ExitCase.Error(_) | ExitCase.Canceled => mv.put(prev) // restore the previous value on error/cancelation
    case ExitCase.Completed                    => ().pure[F]   // the new value was already put back by the use step
  })
However, if take succeeds, then other actors are blocked from doing take. Other actors are only allowed to put. So this function is safe only as long as take >> put is the sequence all actors are using.
In other words, maybe it's not a bad function to have in some contexts. Maybe with big warnings in the ScalaDoc.
@SystemFw
Don't you think that a specialized implementation of modify[B](f: A => F[(A, B)]): F[B] on Ref + Deferred could be faster and simpler? I would also need a separate variable to store mutable state with a semaphore, and I don't need more than one permit. A few minutes ago you criticized MVars, but now you suggest a primitive which has even more state transitions :))
Ah, related to what I said: F.bracket(mvar.take)(...)(mvar.put) will not behave like Semaphore's withPermit, because this will make mvar.take uninterruptible, and this can be a problem, because take also waits for a value to be available.
A few minutes ago you criticized MVars, but now you suggest a primitive which has even more state transitions :))
More variables ≠ more state transitions.
If you have a Ref plus a Semaphore.withPermit there are no state transitions visible to you (other usage patterns of Semaphore are worse than MVar though, agreed on that).
That being said, my preference is to build things with Ref + Deferred directly, which is why that Synchronisation thing looks the way it looks (I helped design it) and doesn't use Semaphore.
I guess the question is: do we want a Synchronisation thing in cats-effect?
Note that often you can use Ref + Deferred directly to build whatever you need to build with Synchronisation instead of using that. The use case for that is literally synchronised access to a mutable thing (like a Kafka consumer), which isn't that common of a use case, and if you need to do mutation on an immutable thing, then it's Ref again :)
The use case for that is literally synchronised access to a mutable thing (like a Kafka consumer), which isn't that common of a use case
Why do you think it's uncommon? I see the following use cases:
It just captures the idea of acquiring a lock, performing an effectful action on the locked resource, and releasing the lock regardless of exception/cancelation. Just like Resource does with disposable resources.
Well, it's not like use cases don't exist, and in fact we do have Semaphore.withPermit, which is simple to use and behaves well wrt cancelation. Note that both your use cases are covered by it, although I'd argue case 1 is not that common (either you use Ref, or you don't have concurrency, or you have a data structure that already supports concurrency like ConcurrentMap).
The final question. MVars were introduced into Haskell as a basic building block for constructing complex synchronization structures (simulating mutexes, building chans, qsems; even I-Var implementations use MVars internally 1 2).
Since cats-effect provides Ref and Deferred natively, and most basic MVar patterns will not handle cancelation well (for instance, anything performing blocking operations in bracket's acquire/release blocks), is there any use for MVars at all? If there are no synchronization patterns which can be expressed by MVar but not by Ref + Deferred (I guess it's quite easy to prove this by expressing MVar itself with Ref and Deferred), shouldn't MVars be deprecated?
P.S. Digging into the origins of MVars, I've found the ID Language Reference Manual, where they seem to have first been introduced. The original M-structures have slightly different semantics (they don't block on put; putting into a non-empty M-structure results in a runtime error). But the most interesting thing is:
In typical usage of M-structure components, several concurrent computations share a resource. Each computation takes it, computes with it, and puts it back. The semantics guarantees that each computation has exclusive access to the resource. Note: for correct operation, every take must be followed by a put, i.e., it is the programmer's responsibility to make sure that these operations are balanced.
So, bracketing MVars with take/put operations to guarantee that every taken MVar will be put back is exactly how MVars were initially supposed to be used. And since everyone here considers that these
functions actually encourage problematic behaviour
probably MVars are just not suitable for the cats-effect computation model?
@schernichkin What you want is ZIO's RefM, which looks almost exactly like Ref, but the update/modify functions can be effectful. So you can use mutable data in the Ref or just effectfully change immutable data.
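For illustration, a rough sketch of what such an effectful-modify Ref could look like in cats-effect terms, built from Ref plus a one-permit Semaphore (an assumption for the example; ZIO's actual RefM is implemented differently, closer to an actor with a queue, as noted below):

import cats.effect.Concurrent
import cats.effect.concurrent.{Ref, Semaphore}
import cats.implicits._

trait EffectfulRef[F[_], A] {
  def modify[B](f: A => F[(A, B)]): F[B]
}

object EffectfulRef {
  def of[F[_]: Concurrent, A](a: A): F[EffectfulRef[F, A]] =
    (Ref.of[F, A](a), Semaphore[F](1)).mapN { (ref, sem) =>
      new EffectfulRef[F, A] {
        def modify[B](f: A => F[(A, B)]): F[B] =
          sem.withPermit {                      // one effectful update at a time
            ref.get.flatMap(f).flatMap {        // run the effectful function on the current value
              case (a2, b) => ref.set(a2).as(b) // publish the new value, return the result
            }
          }
      }
    }
}

The obvious tradeoff of this naive version is that every modify call is serialized behind the single permit.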
@schernichkin
most basic MVar patterns will not handle cancelation well (for instance, anything performing blocking operations in bracket's acquire/release blocks), is there any use for MVars at all?
Of course there are; at the end of the day MVar is like a BlockingQueue(size=1) and you can model all sorts of producer/consumer pipelines with it.
As for the problems we are having with bracket, that's actually a problem with bracket that we'll have to solve either by introducing a version of bracket that doesn't execute acquire uninterruptibly, or by introducing versions of put/take that work well with the current bracket (which is possible if you change their signatures). This latter approach is what we did in Semaphore's withPermit, but it's for the moment internal.
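To make the BlockingQueue(size=1) analogy concrete, a tiny illustrative producer/consumer sketch (assuming Concurrent and cats syntax; not library code):

import cats.effect.Concurrent
import cats.effect.concurrent.MVar
import cats.implicits._

def producer[F[_]: Concurrent](ch: MVar[F, Int], n: Int): F[Unit] =
  (0 until n).toList.traverse_(i => ch.put(i)) // put blocks (semantically) while the slot is full

def consumer[F[_]: Concurrent](ch: MVar[F, Int], n: Int): F[List[Int]] =
  (1 to n).toList.traverse(_ => ch.take)       // take blocks (semantically) while the slot is empty

def pipeline[F[_]](implicit F: Concurrent[F]): F[List[Int]] =
  for {
    ch    <- MVar.empty[F, Int]
    fiber <- F.start(producer(ch, 10)) // run the producer concurrently
    out   <- consumer(ch, 10)
    _     <- fiber.join
  } yield out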
If there are no synchronization patterns which can be expressed by MVar but not by Ref + Deferred (I guess it's quite easy to prove this by expressing MVar itself with Ref and Deferred), shouldn't MVars be deprecated?
So first of all, you don't even need Deferred, you just need Ref. Stop thinking for a second about Haskell and think of mainstream languages instead. A Ref is nothing more than an AtomicReference, which in itself is nothing more than a variable that's being read with volatile semantics or modified via compareAndSet or getAndSet, which get translated to platform intrinsics.
This is the primitive building block for doing synchronization on top of the JVM and most platforms similar to the JVM, being linked directly to the JVM's memory model (the JMM). You can build locks and semaphores with atomic references. You can do software transactional memory in terms of atomic references. You can do pretty much everything.
Forget Deferred, because in fact Deferred too can be expressed with Ref.
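As a very simplified sketch of that claim, assuming only Ref plus Concurrent's asyncF, and ignoring cancelation of get and repeated completes (so not the real cats-effect Deferred):

import cats.effect.Concurrent
import cats.effect.concurrent.Ref
import cats.implicits._

trait OneShot[F[_], A] {
  def get: F[A]               // semantically blocks until complete has been called
  def complete(a: A): F[Unit] // wakes up all current and future gets
}

object OneShot {
  // state is either the list of waiting callbacks, or the completed value
  def apply[F[_], A](implicit F: Concurrent[F]): F[OneShot[F, A]] =
    Ref.of[F, Either[List[A => Unit], A]](Left(Nil)).map { state =>
      new OneShot[F, A] {
        def get: F[A] =
          F.asyncF[A] { cb =>
            state.modify {
              case Right(a)        => (Right(a), F.delay(cb(Right(a))))                     // already done: answer now
              case Left(listeners) => (Left(((a: A) => cb(Right(a))) :: listeners), F.unit) // register and suspend
            }.flatten
          }

        def complete(a: A): F[Unit] =
          state.modify {
            case Right(old)      => (Right(old), F.unit) // already completed; ignored for brevity
            case Left(listeners) => (Right(a), listeners.traverse_(l => F.delay(l(a))))
          }.flatten
      }
    }
}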
But the problem is that Ref is often too low level, plus solutions built on Ref can suffer from performance issues too. Building a concurrent queue on top of Ref, for example, is possible, but has poor performance characteristics compared with something built to take advantage of Java's memory model directly.
So no, we should not deprecate MVar. It's an awesome tool imo.
@jdegoes
@schernichkin What you want is ZIO's RefM, which looks almost exactly like Ref but the update / modify functions can be effectful. So you can use mutable data in the Ref or just effectfully change immutable data.
We are in the context of the Cats-Effect project, so if you'd like to propose RefM for Cats-Effect, I'd be happy to hear it.
I think we need to distinguish between implementation concerns and what sort of model we'd like to propose to our users, for most of whom our abstractions really appear as "primitives".
Implementation-wise, yeah, all you need is Ref (AtomicReference) and async/asyncF. Everything else can be implemented on top of it. In terms of the programming model, however, I actually disagree with
So no, we should not deprecate MVar. It's an awesome tool imo.
I think it's just as low level as the alternative Ref + Deferred model, and way more error prone (although obviously there are going to be some exceptions in select use cases). So, if it was entirely up to me and we didn't have any concern about breaking code for our users, I wouldn't have any issues deprecating MVar. Fortunately, both these conditions don't hold :)
but has poor performance characteristics compared with something built to take advantage of Java's memory model directly.
I think this point is fairly orthogonal to this discussion, and also various caveats about "poor" apply, although I'd definitely agree with "poorer".
We are in the context of the Cats-Effect project, so if you'd like to propose RefM for Cats-Effect, I'd be happy to hear it.
Agreed. Also, I think I like Synchronization's implementation strategy better; it seems lighter weight (RefM is basically an Actor, so you have a queue and a consumer fiber).
probably MVars are just not suitable for the cats-effect computation model?
@schernichkin Note that the recommendation in Haskell is not to use MVar by default, but STM, exactly because coordinating several MVars gets hard really quickly. MVar should be preferred for lower level things, especially when you need fairness, which is its main selling point (according to Simon Marlow).
Of course there are, at the end of the day MVar is like a BlockingQueue(size=1) and you can model all sorts of producer/consumer pipelines with it.
That's why MVar should be deleted IMO. It's a special case of a blocking queue. And people wouldn't think to (ab)use a "queue" to do some of the things they are tempted to do with MVar (e.g. modifyMVar).
We are in the context of the Cats-Effect project, so if you'd like to propose RefM for Cats-Effect, I'd be happy to hear it.
Maybe after cats-concurrent exists.
@jdegoes Thanks, ZIO looks promising.
@alexandru
that's actually a problem with bracket that we'll have to solve either by introducing a version of bracket that doesn't execute acquire uninterruptibly
Haskell took this approach, that's why withMVar works fine in Haskell. But it ended up with mask, which masks asynchronous exceptions but can still be interrupted if a blocking function is invoked, and uninterruptibleMask, which unconditionally masks all exceptions. Not the best choice in my opinion: you may never know if a computation uses blocking functions somewhere internally.
Stop thinking for a second about Haskell and think of mainstream languages instead.
Every programming language relies on synchronization primitives provided by the OS to implement non-busy waiting, including Haskell. You could express locking primitives with a sole CAS and sleep(), but sleep() itself has to be provided by the OS. And I'm talking about something practical. From this point of view a sole Ref is not enough.
and sleep() but sleep() itself should be provided by OS
That's not true. Green-threading systems like the ones in Haskell or cats-effect can express sleep, and semantic waiting in general, provided you have:
1. a threadpool (an ExecutionContext, at the end of the day)
2. async/asyncF (which you can build once you have your own call stack, which is what any IO runloop is)
Once you have Ref and asyncF/cancelableF, you can build Deferred, and from there you can build the rest.
EDIT: actually, strictly speaking, you don't even need a threadpool, since cats-effect concurrency works on scala.js too
How are you going to pass control to the OS when the application has nothing to do? sleep() (or any other non-busy wait mechanism) has to be hidden somewhere, possibly in the thread scheduler, and at the OS level it should call something like HLT. Otherwise you will end up in spinlocks, boiling the processor.
How are you going to pass control to the OS when the application has nothing to do? sleep() (or any other non-busy wait mechanism) has to be hidden somewhere, possibly in the thread scheduler, and at the OS level it should call something like HLT. Otherwise you will end up in spinlocks, boiling the processor.
Again, no. You are describing Thread.sleep, i.e. actual thread blocking, whereas IO.sleep or MVar.take or Deferred.get don't block a thread. The key idea is that what you see as blocking on one level is actually suspending on the level below.
So when your fiber "sleeps" it's literally giving control back to the thread pool after encountering an Async node in its runloop, giving a chance to other fibers to run. Maybe I misunderstood your question: you do need one thing that can schedule things (for example a ScheduledThreadExecutor), but you do not need a Thread.sleep.
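To illustrate, a minimal sketch (an assumption for the example, not cats-effect's actual Timer implementation) of a sleep that suspends the fiber and blocks no thread:

import java.util.concurrent.Executors
import scala.concurrent.duration.FiniteDuration
import cats.effect.Async

object NonBlockingSleep {
  // a single scheduler thread is enough to drive any number of sleeping fibers
  private val scheduler = Executors.newSingleThreadScheduledExecutor()

  def sleep[F[_]](d: FiniteDuration)(implicit F: Async[F]): F[Unit] =
    F.async[Unit] { cb =>
      // nothing blocks here: we only register a callback that the scheduler
      // will fire later; the fiber suspends until then
      scheduler.schedule(new Runnable { def run(): Unit = cb(Right(())) }, d.length, d.unit)
      ()
    }
}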
Ah, I see.
@SystemFw
Agreed. Also, I think I like Synchronization's implementation strategy better; it seems lighter weight (RefM is basically an Actor, so you have a queue and a consumer fiber).
The Synchronized sample escaped me. So, 2 things:
Point number 1 is important because it doesn't really matter how you implement a mutex, in terms of Ref or in terms of MVar; it's still going to represent a bottleneck.
Point number 2 is important because it shows that Ref is not all you need for building Synchronized ;-)
Just to be clear, I'm very much against deleting MVar. If nothing else, MVar is important for people like @schernichkin who want to migrate logic from Haskell. And Ref is cool, but you can barely go lower level than Ref on top of the JVM, implementations based on Ref being anything but elegant or simple.
So that said, our discussion does not address the main problem of the described function. We need to establish in Cats-Effect what we have to do for this particular pattern:
1. a bracket with a cancelable acquire?
2. take and put versions with signatures that can work with the current bracket (what we did in Semaphore's withPermit internally)?
I have a feeling the solution is probably going to be number 2, because number 1 is a can of worms.
@alexandru
Mutexes are in general bad, a solution of last resort
Oh, I definitely agree with that. And in fact that Synchronisation is used only because there's no other way (that I know of) of dealing with the Kafka API in that situation (there's a ThreadId check).
Just to be clear, I'm very much against deleting MVar
As I said, I would prefer if MVar was not there, but I appreciate there's no consensus, and it's not a battle worth fighting, it can stay there :)
I have a feeling the solution is probably going to be number 2, because number 1 is a can of worms.
I agree. That being said, that Synchronisation does avoid that problem without requiring changes to the primitives, which is another point in favour of Ref + Deferred imho.
@alexandru
Before introducing any "MVar-friendly" bracket, I think you should consider Haskellers' experience with mixing resource handling and cancelation: https://haskell-lang.org/tutorial/exception-safety Yep, it seems to be implementable (http://hackage.haskell.org/package/unliftio-0.2.10/docs/UnliftIO-Exception.html), but it definitely doesn't look simple.
@schernichkin thanks, that article looks useful.
Mutating MVars properly is a common task in multithreaded programming. In Haskell such functions are absolutely necessary, because in the presence of asynchronous exceptions you have to be extremely careful with masking and unmasking. But even in cats-effect I see modifyMVar_ as necessary to handle cancelation and synchronous exceptions correctly. So, why not in the API, is it an intentional decision?