Consider a Type class for resource safety

LukaJCB commented 6 years ago

I raised this issue on the gitter channel and got a lot of positive feedback. We removed IO#ensuring some time ago and realized MonadError is not powerful enough to implement it (see here). I'd like to propose something in the vein of MonadBracket described in this article. It might also help with providing a resource-safe Parallel experience.

Relevant gitter discussion:

LukaJCB @LukaJCB 14:22 Hey everyone, I kind of miss IO#ensuring and I realize why it’s gone, but maybe we should add something to the cats-effect type class hierarchy to ensure resource safety. It can’t be added to MonadError, so maybe something like the MonadBracket described here: https://www.fpcomplete.com/blog/2017/02/monadmask-vs-monadbracket I can’t really claim to be super knowledgeable about this, so I’d love to hear what you think!

Fabio Labella @SystemFw 14:23 that's kind of a big issue. fwiw, I kind of prefer a resource safe F a la scalaz IO as well. I can see reasons to have a simple F, and delegate this aspect to Stream.

LukaJCB @LukaJCB 14:24 Right now, without ensuring it’s pretty difficult to do

Fabio Labella @SystemFw 14:24 with ensuring as well

Michael Pilquist @mpilquist 14:25 +1 from me for MonadBracket -- @djspiewak may have objections

Fabio Labella @SystemFw 14:25 you need to change the internals of your IO type to support this, which I'd agree with. I suspect Daniel doesn't. I think our life in fs2 would be a tad easier if some things (interruption, resource safety) were at the F level.

Michael Pilquist @mpilquist 14:26 Note that stream libraries still need their own, separate definition of bracketing as MonadBracket doesn't distribute over Stream

Fabio Labella @SystemFw 14:26 sure

LukaJCB @LukaJCB 14:26 I wouldn’t personally mind if IO didn’t have it, but I’d love to see a type class that supports it. I’m sure something like Monix Task could make use of it.

Fabio Labella @SystemFw 14:26 well, then you'd have IO not implementing one of the cats effect typeclasses so it isn't a reference implementation anymore

Michael Pilquist @mpilquist 14:27 Yeah, I'd definitely want IO to implement it. BTW, we're about to add an AtomicReference-backed version of Ref to fs2 (called SyncRef), but maybe we should consider moving that to cats-effect and using it to implement the Parallel instance.

LukaJCB @LukaJCB 14:29 I’d be +1 on that. I think @alexandru said he had a proof of concept of a Parallel instance using Java atomic refs as well. I’ll create a ticket.

Michael Pilquist @mpilquist 14:32 fs2 needs SyncRef for 0.10 final, but otherwise I think this stuff could be post cats-effect 1.0. Maybe with the exception of the Parallel instance.

Fabio Labella @SystemFw 14:33 btw, interruption is crucial here, from what I know from Haskell + reading scalaz IO + using my imagination. I'd like to know if this assumption is wrong. Crucial to implement resource safety, that is. Or rather, the two are closely linked.

LukaJCB @LukaJCB 14:36 Interruption as in, cancelling a running computation?

Fabio Labella @SystemFw 14:38 yes. Think about race in the why-not-both example (and some way of storing finalisers as well). In any case, I'd like to hear what Daniel and Alexandru think.

LukaJCB @LukaJCB 14:40 Yeah, but we’d likely run into the same problem using Parallel, no?

Fabio Labella @SystemFw 14:40 that's kind of what I'm saying: interruption, resource safety and concurrency are linked.

jdegoes commented 6 years ago

There can be no principled, leak-free, monadic concurrency in functional programming without MonadBracket and MonadFork, or equivalents, including interruption semantics analogous to Scalaz 8 / Haskell / PureScript.

Interruption is fundamental to composability, and must be baked into the lowest layer of the stack, which is the effect monad that drives the application.

Not only is such a thing possible to do in a lawful fashion with precise semantics, but it has been done, in Scalaz 8 IO and elsewhere. The fact that existing libraries do not support these semantics is irrelevant because (a) existing libraries can always be improved, or alternatively (b) type class laws can be weakened so as to permit "no op" implementations.

I happily donate the following type classes to the project:

trait Forked[F[_], A] {
  def interrupt(t: Throwable): F[Unit]
  def join: F[A]
}

trait MonadBracket[F[_]] extends MonadError[F, Throwable] {
  def bracket[A, B](acquire: F[A])(use: A => F[B])(release: (A, Either[Throwable, B]) => F[Unit]): F[B]

  def never[A]: F[A]
}

trait MonadFork[F[_]] extends MonadBracket[F] {
  def fork[A](fa: F[A]): F[Forked[F, A]]

  def raceWith[A, B, C](l: F[A], r: F[B])(
    finish: Either[(A, Forked[F, B]), (B, Forked[F, A])] => F[C]): F[C]
}

These are very small and flexible type classes, while providing just enough power to construct correct, composable, and leak-free software. All methods have low-cost implementations which may not have the full capabilities of more extensive implementations but which can lessen author burden.
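To make the low-cost end of that spectrum concrete, here is a minimal sketch of bracket for a purely synchronous effect. Thunk, BracketSketch and BracketDemo are hypothetical names introduced for illustration and are not part of cats-effect, Scalaz or Monix. The sketch does not extend MonadError and has no notion of interruption or never, so it only shows the acquire / use / release ordering.

```scala
import scala.collection.mutable.ListBuffer
import scala.util.Try

// Hypothetical stand-in effect: a lazily evaluated, synchronous thunk.
final case class Thunk[A](run: () => A)

object BracketSketch {
  // Acquire, use, then always release. The release function sees both the
  // resource and the outcome of `use`. Try only captures non-fatal errors.
  def bracket[A, B](acquire: Thunk[A])(use: A => Thunk[B])(
      release: (A, Either[Throwable, B]) => Thunk[Unit]): Thunk[B] =
    Thunk { () =>
      val a = acquire.run()
      val outcome: Either[Throwable, B] = Try(use(a).run()).toEither
      release(a, outcome).run()
      outcome match {
        case Right(b) => b
        case Left(e)  => throw e
      }
    }
}

object BracketDemo {
  import BracketSketch.bracket

  // Records the order in which resources are opened and closed.
  val log = ListBuffer.empty[String]

  private def open(name: String): Thunk[String] =
    Thunk(() => { log += s"open $name"; name })

  private def close(name: String): Thunk[Unit] =
    Thunk(() => { log += s"close $name"; () })

  // Nested brackets: the inner resource is released before the outer one.
  val nested: Thunk[Int] =
    bracket(open("outer"))(_ =>
      bracket(open("inner"))(_ => Thunk(() => 42))((r, _) => close(r))
    )((r, _) => close(r))

  // A failing `use` still triggers release, then the error is rethrown.
  val failing: Thunk[Int] =
    bracket(open("file"))(_ =>
      Thunk[Int](() => throw new RuntimeException("Boo")))((r, _) => close(r))
}
```

Running BracketDemo.nested.run() records open outer, open inner, close inner, close outer, in that order. An implementation with real interruption support would additionally have to decide what release observes when use is cancelled midway, which this sketch cannot express.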

MonadBracket must be a super class of Sync. That is to say, it does not make any sense to have a Sync without the ability to bracket (bracket gives meaning to the notion of monadic operations on foreign effectful code). Separately, I'd also argue that Async and Sync should be unified because there is nothing intrinsically useful which is a Sync and not Async.

MonadFork is necessary for safe, leak-free concurrency. That is, any F[_] which does not have a MonadFork should not be used for concurrency. In no case should concurrency be implemented on top of an F[_] that does not support MonadFork because it will be broken by construction.

Of course, a concurrent F[_] could support more than just MonadFork, but MonadFork provides the bare essentials necessary to implement higher-level, composable, safe combinators on top (parMap2, concurrently, etc.).

Concurrent libraries like FS2 and http4s must be able to rely on the existence of MonadFork. Simpler libraries that do not have concurrent needs do not have to use MonadFork and will therefore benefit from much simpler IO implementations.

@alexandru

Scalaz 8 IO fully linearizes interruption / finalization. Finalization will never occur out of order or concurrently, but rather, it will be done in the correct order and fully sequentially, post-successful interruption, after user-defined effects have returned control to the runtime. This ensures implementation details are not leaked (the logical model of a linear fiber is maintained) and provides a simple reasoning model that makes it easy to write correct code.

alexandru commented 6 years ago

@jdegoes I feel that we aren't communicating well, I don't understand why, maybe we are not using the same language. I have some gaps in my education, I'm actually trying to finish college right now (12 years later); but to me "the logical model of a linear fiber is maintained" sounds like technobabble.

You have to give me some credit though, because the Monix Task was born 2 years ago and it has a very similar cancellation and evaluation model, so if we are to collaborate, which we should because we can do awesome things apparently, we need to pay more attention to each other 😀

"Scalaz 8 IO fully linearizes interruption / finalization. Finalization will never occur out of order or concurrently, but rather, it will be done in the correct order and fully sequentially, post-successful interruption"

You can drive several trucks through that statement, because it's carefully worded to ignore the elephant in the room that I mentioned in my samples above.

"provides a simple reasoning model that makes it easy to write correct code."

Not true:

Exhibit A: the synchronous signature of Canceler which does not admit an async acknowledgement, hence ordering cannot be preserved
Exhibit B: an issue on the reactive streams spec that confirms your canceler signature is actually the best you can do
Exhibit C: Scalaz 8 IO code that creates a race condition on a file handle by using bracket and cancellation, which couldn't happen without cancellation — https://gist.github.com/alexandru/f30b0c8b3920e7d8a8a6ecf018c0aaec

That Scalaz 8 IO code is actually behaving more or less like I expected, since I've lost nights over this for some time now — I did not even have to run your code to see it, because it's all in its signatures, but there:

Started!
Thrown! java.io.IOException: Stream closed

(run-main-0) java.lang.RuntimeException: Boo
java.lang.RuntimeException: Boo
    at scalaz.effect.Sample$.$anonfun$run$5(Playground.scala:15)
    at scalaz.effect.RTS$.nextInstr(RTS.scala:143)
    at scalaz.effect.RTS$FiberContext.evaluate(RTS.scala:417)
    at scalaz.effect.RTS$FiberContext.continueWithValue$1(RTS.scala:690)
    at scalaz.effect.RTS$FiberContext.resumeEvaluate(RTS.scala:696)
    at scalaz.effect.RTS$FiberContext.resumeAsync(RTS.scala:729)
    at scalaz.effect.RTS$FiberContext.$anonfun$evaluate$4(RTS.scala:496)
    at scalaz.effect.RTS$FiberContext.$anonfun$evaluate$4$adapted(RTS.scala:496)
    at scalaz.effect.RTS$FiberContext.$anonfun$evaluate$19(RTS.scala:603)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
    at scalaz.effect.RTS$$anon$1.run(RTS.scala:95)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)

Can you see the IOException: Stream closed?

That's data corruption right there, and no, it's not simple and it's not intuitive. Backed by the actual experience of having users to support, I can argue that this behavior is precisely what users do not expect 😉

At the same time, this is more or less the best we can do (minus some design decisions of yours that I don't like), but we need to call a spade a spade.

"MonadBracket must be a super class of Sync. That is to say, it does not make any sense to have a Sync without the ability to bracket"

I agree.

This is nice, but it leaks your implementation details, for which I have reasons to disagree:

trait Forked[F[_], A] {
  def interrupt(t: Throwable): F[Unit]
  def join: F[A]
}

Here's Monix's Task as of 3.0.0-M3:

def cancel[A](fa: Task[A]): Task[Unit]

// Yes, this is our join
def flatten[A](fa: Task[Task[A]]): Task[A]

Some problems:

The Forked interface is OOP and would need to be inherited, which makes it incompatible with the type classes that we are trying to promote. This is relevant because in usage it leads to loss of fidelity in the returned types; and if we introduce it as a parameter, it's not feasible to pass it around in addition to IO.
I disagree with passing a Throwable to kill a task, for reasons that I can't get into right now. It's enough to say that I believe a cancelled task should be non-terminating.

For bracket this is insufficient:

trait MonadBracket[F[_]] extends MonadError[F, Throwable] {
  def bracket[A, B](acquire: F[A])(use: A => F[B])(release: (A, Either[Throwable, B]) => F[Unit]): F[B]
}

I already explained why above: we need to distinguish between interruption and normal finalization, and even your own code confirms it.
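One way to picture that distinction is an outcome type with a separate case for interruption, instead of folding everything into Either[Throwable, B]. The following is only a sketch under that assumption. Exit, ReleaseSketch and describeRelease are hypothetical names, not an API from Monix, Scalaz or cats-effect, and the returned strings are placeholders for whatever a real release action would do.

```scala
// Hypothetical outcome type: interruption is a case of its own rather than
// being encoded as a Throwable on the Left of an Either.
sealed trait Exit[+B]
object Exit {
  final case class Completed[B](value: B) extends Exit[B]
  final case class Failed(error: Throwable) extends Exit[Nothing]
  case object Interrupted extends Exit[Nothing]
}

object ReleaseSketch {
  import Exit._

  // A release handler that can react differently to each way `use` ended.
  def describeRelease[B](resource: String, exit: Exit[B]): String =
    exit match {
      case Completed(_) => s"normal finalization of $resource"
      case Failed(e)    => s"finalization of $resource after error: ${e.getMessage}"
      case Interrupted  => s"interruption: $resource may still be in use"
    }
}
```

With a signature like this, a release function for a file handle could at least tell that it was triggered by cancellation and that the use action may not have finished with the resource yet.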

For MonadFork this also leaks your implementation details:

trait MonadFork[F[_]] extends MonadBracket[F] {
  def fork[A](fa: F[A]): F[Forked[F, A]]

  def raceWith[A, B, C](l: F[A], r: F[B])(
    finish: Either[(A, Forked[F, B]), (B, Forked[F, A])] => F[C]): F[C]
}

Compare with Monix's Task as of 3.0.0-M3:

def start[A](fa: Task[A]): Task[Task[A]]

def racePair[A, B](fa: Task[A], fb: Task[B]): Task[Either[(A, Task[B]), (Task[A], B)]]
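To show the shape of such a racePair without depending on Monix, here is a rough analogue written against the standard library's Future. RacePairSketch, Raced and RacePairDemo are hypothetical names. Unlike Task, a Future is eager and cannot be cancelled, so this sketch only illustrates how the winner's value is paired with a handle to the still-running loser. The result is itself delivered inside a Future, since the winner is only known asynchronously.

```scala
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._
import scala.util.{Failure, Success}

object RacePairSketch {
  // Left: `fa` won, paired with the still-running `fb`; Right: the reverse.
  type Raced[A, B] = Either[(A, Future[B]), (Future[A], B)]

  def racePair[A, B](fa: Future[A], fb: Future[B])(
      implicit ec: ExecutionContext): Future[Raced[A, B]] = {
    val winner = Promise[Raced[A, B]]()
    // Whichever side completes first fills the promise; the later
    // trySuccess / tryFailure call is a no-op.
    fa.onComplete {
      case Success(a) => winner.trySuccess(Left((a, fb)))
      case Failure(e) => winner.tryFailure(e)
    }
    fb.onComplete {
      case Success(b) => winner.trySuccess(Right((fa, b)))
      case Failure(e) => winner.tryFailure(e)
    }
    winner.future
  }
}

object RacePairDemo {
  import ExecutionContext.Implicits.global

  // The right side is already complete, so it wins; the left side is handed
  // back as a pending Future that the caller can still wait for.
  def run(): (String, Int) = {
    val slow = Promise[Int]()
    val raced = Await.result(
      RacePairSketch.racePair(slow.future, Future.successful("fast")), 1.second)
    raced match {
      case Right((pending, b)) =>
        slow.success(7)
        (b, Await.result(pending, 1.second))
      case Left((a, _)) =>
        (a.toString, -1) // not reached in this demo
    }
  }
}
```

In a Task-based version the loser handle would also be the thing to cancel, which is where the interplay between racing, cancellation and finalizers discussed above comes in.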

typelevel / cats-effect

Consider a Type class for resource safety #88