typelevel / cats

Lightweight, modular, and extensible library for functional programming.
https://typelevel.org/cats/
Other
5.25k stars 1.21k forks source link

A cats asynchronous monad (eg Task) #32

Closed bryce-anderson closed 7 years ago

bryce-anderson commented 9 years ago

As discussed in #21, an asynchronous monad like the scalaz Task would be highly desirable for a variety of reasons including any possible streaming solution. The scalaz Task is very nice in many ways, but @tpolecat and I think it could possibly be improved. This would seem to not be a high priority at the moment, but would be nice to get on the radar.

One such issue with scalaz Task is stack overflows using Task.async.

stephenjudkins commented 9 years ago

One issue that should be considered is whether error handling should be conflated with the other aspects of asynchronous computation. The Future implementations in Scala stdlib and Finagle both do this, while Scalaz provides a Task that provides error handling over a Future that doesn't. The Scalaz approach has the obvious appeal of composition of smaller, independent things. On the other hand, I've never seen Scalaz Future used anywhere by anyone, except in the implementation of Task itself. Further, Task is specialized on Future and isn't used anywhere else. So it's worth asking whether it's worthwhile to provide a public Future that doesn't do any error handling.

Alternatively, to standardize nomenclature between cats and other projects, we could call non-error-handling Future NonErrorCatchingFuture or something and call the error handling layer Future instead of Task. I don't know if that's a good idea or what people think of it, but I'm just throwing things out there.

tpolecat commented 9 years ago

@stephenjudkins Yes, I think effect capture and exceptions tend to go hand in hand. I have been thinking about this a lot and I am close to recommending the following typeclass as an abstraction for all effect-capturing data types. This conflates scalaz.Catchable and the yet-unmerged scalaz.effect.Capture:

trait Effect[F[_]] {
  def capture[A](a: => A): F[A]               // primitive effect capture
  def unsafeRun[A](fa: F[A]): A               // unsafe extract
  def fail[A](t: Throwable): F[A]             // exceptiom propagation
  def attempt[A](fa: F[A]): F[Throwable \/ A] // exception trap
}

Still not totally convinced that effect capture and exception management need to be the same type, but so far in my experience they always appear together.

mandubian commented 9 years ago

And what would be the difference between a Task and a potential nice IO Monad as the one proposed by Runar?

tpolecat commented 9 years ago

@mandubian none. I think Rúnar's outline is what we want (i.e., an IO monad that understands asynchrony).

mandubian commented 9 years ago

So this is just to find what's not so good about Task :) I must say I've never really understood the need of underlying Future for Task...

aloiscochard commented 9 years ago

You might be interested in that experiment: https://github.com/aloiscochard/matterhorn

Not sure how much that differ from what Runar proposed, my approach is surely more specific.

Anyway, there is some outgoing effort in that domain among scalaz folk, it seems that it would be nice to federate effort to create a sort of "Scalaz RTS".

If you guys are interested about that, please get in touch with me or @mpilquist.

tpolecat commented 9 years ago

Yep, it's worth a look. There is still something in matterhorn that is causing nontermination when I use it with scalaz-stream in doobie, so there's something tricky going on that I haven't figured out yet.

aloiscochard commented 9 years ago

I have done some more experiment here: https://github.com/scalaz/scalaz/tree/series/8.0.x

Trying to split the IO spec from the impl. of matterhorn, the idea is that the scalaz-io model should be independent (no dependencies on other scalaz modules) and provide a common interface for runtime system implementation.

I suppose that would imply splitting the typeclass instances from the data type in the IO module, I might not do that at first but it's surely doable.

@tpolecat yeah, I'll look into that issue now that thanks to your help I can reproduce it

mandubian commented 9 years ago

@aloiscochard as you have an AST for all Thunk types (Map, Apply, FlatMap), does it mean we could imagine some natural transformation optimizing the Thunk sequence such as Map Fusion?

(This is something I wanted to experiment in a custom implementation of IO with Free)

aloiscochard commented 9 years ago

@mandubian yes definitely!

I would love to play with that, but it's really optimization so for now I have not spend time on it.

The thing is that the interpreter is quite fast, so I'm not sure we would get much benefit from fusing Map, but there is probably other neat optimizations that I'm not aware of.

It would be interesting though to do some benchmark to see the gain of Map fusion, the natural transformation should be relatively trivial to implement and there is benchmark already available that could exercise it... so if you are interested, please give it a go! :-)

mandubian commented 9 years ago

That's really cool! Exactly what I wanted to implement :) I'm going to dive into your code and play a bit :)

On Wed, Feb 4, 2015 at 5:33 PM, Alois Cochard notifications@github.com wrote:

@mandubian https://github.com/mandubian yes definitely!

I would love to play with that, but it's really optimization so for now I have not spend time on it.

The thing is that the interpreter is quite fast, so I'm not sure we would get much benefit from fusing Map, but there is probably other neat optimizations that I'm not aware of.

It would be interesting though to do some benchmark to see the gain of Map fusion, the natural transformation should be relatively trivial to implement and there is benchmark already available that could exercise it... so if you are interested, please give it a go! :-)

— Reply to this email directly or view it on GitHub https://github.com/non/cats/issues/32#issuecomment-72887556.

adelbertc commented 9 years ago

Anyone working on this/what's the latest on this? I think having an IO/Task-y thing (to keep everything pure) in Cats is the last piece I would want to using it in my projects.

As far as I know there's been talk of possibly seeing if scalaz-stream uses it's own Task-y thing and then perhaps just defining Cats type class instances for that? Seems a bit weird, very possible I mis-remembered/mis-interpreted.

lukiano commented 9 years ago

Hi, I've been working on a typeclass for asynchronous operations. Basically a way to extract some methods from Future and Task into a generic F. Right now it lives in https://bitbucket.org/lleggieri/monadasync/, and there's a cats branch. Since cats doesn't provide a Future, I made one that uses cats' Free implementation. Hopefully you find it useful!

non commented 9 years ago

@lukiano Thanks! I'm sorry there hasn't been a ton of activity on this ticket recently. Currently, I think our plan is to try to create a concrete data type that works in JS and JVM, but if we need to fall back to a type class we'll want to use an approach very similar to yours I think.

non commented 9 years ago

As a more general update about the status here:

  1. @stew and I started working on a minimal port of a Scalaz-like Task, which currently lives in feature/task.
  2. As @InTheNow and others have pointed out, a strategy based around threads and blocking will never work for the JS platform. So we can't simply do a port of Scalaz' Task. Thus, feature/task (in its current state) is not eligible to be merged.
  3. The current hope is that we can create a concrete data type in Cats which supports asynchronous operations, and then also support a blocking interface for JVM via extension methods in the cats.jvm package.
  4. Failing that, we may need to provide an abstract interface in Cats and concrete implementations in cats.js and cats.jvm. This could also take the shape of a type class (possibly similar to @lukiano's MonadAsync).
  5. We want to do this in a way that is both efficient for the JVM, but also platform-independent and sensible for JS. This may require us to build (or import) a bit of machinery from other projects. @InTheNow has suggested using using some of @alexandru's Monifu library (https://github.com/monifu/monifu) to do this.

Anyway, that's the current status of things. There has been a lot of conversation on Gitter about task, but I just wanted to document where I see this issue right now. I think using this issue to brainstorm, ask questions, and help try to flesh out the requirements of a task in Cats would definitely be helpful.

ghost commented 9 years ago

A similar issue was Add future{Comonad, Eq, Order, PartialOrder} to js, where the JVM version blocked with Await. Hence I fully agree with @non's summary (that I've also summarised)

it does seem like we need a future/promise/scheduler abstraction for Cats (if only to be clear on what reasonable platform-independent semantics are possible) and i think putting those in catalysts makes a lot of sense.

if we can write platform-independent code that is correct and adequate then i'd rather leave that in cats if possible.

FTR, there is a also an open SLIP proposal related to this and I see no reason why any work done here could not be incorporated.

ceedubs commented 8 years ago

I'm beginning to think that maybe Task isn't something that should live in Cats. At least not something that should block a 1.0 release.

Concurrency is pretty tough. I know @stew ran into quite a bit of trouble as he was working on implementing a version for Cats. Even though scalaz's Task has been around for years, it still has issues (see here for an issue with stack-safety that I recently ran into).

FS2, the in-progress rewrite of scalaz-stream (which seems to be advancing pretty rapidly these days) has its own Task that I'm thinking may be a good one to encourage using. I think this sort of thing needs to live in a dedicated concurrency library, and I don't think that's really what Cats intends to be.

I think there is some really low-hanging fruit in making it easy to use Eval as an IO (and I think @non even mentioned that he has some of this in a branch somewhere). That seems like a reasonable thing to move forward on.

I think that if Cats were to gain a Task in the future, that it should live in a separate module (not core). And if that's the case, then this could always be added in a completely compatible way after 1.0, so I don't think it would really be a blocker for 1.0.

adelbertc commented 8 years ago

So my one concern about that FS2 is a completely separate library from Cats, and I've had a couple uses of the Task monad without doing anything related to scalaz-stream - I reach for it whenever I want something s.c.Future but don't want to use s.c.Future, and pulling in the entirety of FS2 just for Task seems a bit wonky.

That being said, Scalaz does have IO[s.c.Future[A]] things, like in https://github.com/scalaz/scalaz/blob/122e5f9dfc3fbe9189d1f817e703c78047fbb3ef/effect/src/main/scala/scalaz/std/effect/Future.scala , though that does leave a lot to be desired.

ceedubs commented 8 years ago

@adelbertc that's a fair point. I started reaching for Task long before I started using scalaz-stream.

A partial solution might be if we can work with the fs2 authors (of which there is some overlap with Cats authors) to try to get Task into its own submodule separate from the stream components.

I'm not completely opposed to a Cats Task. I just don't want us to underestimate the complexity involved or to delay a stable 1.0 release waiting for it.

lukiano commented 8 years ago

May I curiously ask why the need for a Task (both here and in scalaz-stream redesign) instead of focusing on a well implemented Future and leave exception handling for a different layer (i.e. EitherT[Future, Throwable, A]) ? Task could be just a type alias. Stuff like "return the first of these two tasks that successfully completes" or "cancel the task" can still be done I think. I ask for scenarios like a left side that isn't a Throwable but a business-logic error, or adding a Writer layer in between (having a log for both sides).

tpolecat commented 8 years ago

Lifting F[A] to F[Throwable \/ A] is a primitive operation in the implementation of F, so I think they must be considered together.

mandubian commented 8 years ago

If FS2 is developing a Task and cats wants a Task and guys can work together, it would be quite nice to gather efforts and make this Task a submodule of cats... @lukiano we need a Task to be lazy (not like scala Future), to be specifically designed for task scheduling with error + (non-)deterministic aspects + etc... & to be optimised because when dealing with task management, you don't want to pay the price of generic functional structure wrapping, you want it pure outside but optimised inside.

ceedubs commented 8 years ago

@mandubian fs2 has gone to great lengths to avoid both a scalaz and a cats dependency, so I don't think the authors (as a whole) would be very inclined to add a cats dependency for Task. I'd be happy to find out that I'm wrong, though.

mandubian commented 8 years ago

Would it be possible to make it an independent project outside of cats/scalaz scope and be able to plug it on cats in some way?

hakuch commented 8 years ago

I absolutely see value in some of the stuff from scalaz-concurrent like Task and Actor as alternatives to s.c.Future and their ilk. Like @mandubian suggested, perhaps there could be a standalone implementation of these without any type-class instances which cats could then provide.

A problem that occurs to me with this approach is that the interface of Task would be restricted to "lowest-common-denominator" types such as Either and pass-by-name (versus the Eval type in cats).

alexandru commented 8 years ago

Hello. I'm thinking of adding a Task to Monifu and I'm writing a proposal, maybe you find it useful:

https://gist.github.com/alexandru/55a6038c2fe61025d555

alexandru commented 8 years ago

Dropping a line :-) Made edits according to latest changes. Of note is a bigger section of why the Scalaz Task could be improved: https://gist.github.com/alexandru/55a6038c2fe61025d555#so-whats-wrong-with-scalaz-task

TL;DR - the implementation and the concept of the Scalaz Task is very good for what it does, but it leaks implementation details in its API, given its usecase of modeling async stuff that can also trigger side-effects (e.g. I/O). For Javascript @InTheNow was saying that blocking threads doesn't work, which is true. But another problem of the Scalaz Task is one of liveliness. Given that Task is essentially a trampolined computation and assuming that recursive loops are important (hence the stack overflow concerns), in Javascript it's pretty bad if you block the run-loop indefinitely. In the browser that can mean you freeze the UI as well.

As a strategy, when streaming values (like in streams implementations), or when processing stuff (like in loops), it's indeed a good idea to keep things on a single thread (cache locality, etc, etc.), but there are diminishing returns to it and it's better to do processing in batches as to not block any single thread indefinitely. And then between those batches you force an asynchronous boundary. And Scalaz Task does allow you to do this, but manually and that's not cool. Also note Scala's standard Future, as inefficient as it is in its reliance on a thread-pool, doesn't bring liveliness issues on top of Javascript.

mandubian commented 8 years ago

nice review, is your current impl in Monifu following those ideas?

alexandru commented 8 years ago

@mandubian yes it does (you can see the current state here). This actually started as an experiment for https://github.com/typelevel/general/issues/4 to show how you can have non-FP and well encapsulated internals while exposing a good API :-) though it might not be in the spirit of Cats.

mandubian commented 8 years ago

Nice, I need to take some time to dig in all of that... I'm specially interested in it because i'm working with Daniel Spiewak on his Emm "monadic stack lib" for cats and we still depend on Scalaz Task for demo...

alexandru commented 8 years ago

I've added another API concern that I've fixed in my implementation: the issue of canceling expensive tasks. See: https://gist.github.com/alexandru/55a6038c2fe61025d555#5-unreliable-for-canceling-running-tasks

Basically my implementation now has this signature for runAsync and for create (which is roughly like Task.async in Scalaz):

sealed abstract class Task[+T] {
  def runAsync(f: Try[T] => Unit)(implicit s: Scheduler): Cancelable
  def runAsync(implicit s: Scheduler): CancelableFuture[T]
}

object Task {
  def create[T](register: (Try[T] => Unit) => Cancelable): Task[T] 
}
//...
val result = task.runAsync
// nope, changed my mind
result.cancel()
wedens commented 7 years ago

Shouldn't it be closed as there is cats-effect now?