Closed alexandru closed 7 months ago
I agree with your main statement. However my understanding is that "direct" Scala will move away from using monad types for concurrency.
Hey, thanks for the very detailed comment! It is true that async/await-style futures are error-prone for the reasons above, but we are not directly trying to build the same thing. In fact, gears is currently just trying out possible abstractions for asynchronous computations, of which Future-style async/await is just something higher on the ladder, as one could say.
Our current "base" level asynchronous/suspendable function has the following form:
def fetchUser()(using Async): User = ???
def fetchOrder()(using Async): Order = ???
def handle(using Async): Response = {
  val user = fetchUser()
  val order = fetchOrder()
  Response(user, order)
}
where Async is an asynchronous context; whether it is backed by a virtual thread, a continuation, ... is an implementation detail.
This context also introduces a scope: currently you can create async contexts out of thin air with Async.blocking or with Async.group(using parent: Async), both requiring all asynchronous computations inside the context to complete before returning (blocking the thread in the Async.blocking case and suspending the parent context in the Async.group case). This is pretty much our approach to structured concurrency.
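As a rough illustration of the scoping rule described above (a scope only returns once everything started inside it has finished), here is a minimal sketch using plain JVM threads. The names Scope, spawn, joinAll, group and runDemo are hypothetical and chosen for this example; this is not the gears API, and it blocks a thread instead of suspending.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for a scope capability; this is NOT the gears API.
final class Scope:
  private val children = ListBuffer.empty[Thread]

  // Start a child computation on its own thread and remember it.
  def spawn(body: => Unit): Unit = synchronized {
    val t = new Thread(() => body)
    children += t
    t.start()
  }

  // Wait until every child registered so far has finished.
  def joinAll(): Unit =
    val snapshot = synchronized(children.toList)
    snapshot.foreach(_.join())

// Runs `body` with a fresh scope and returns only after all children are done.
def group[T](body: Scope ?=> T): T =
  val scope = new Scope
  try body(using scope)
  finally scope.joinAll()

def runDemo(): Int =
  val completed = new AtomicInteger(0)
  group {
    summon[Scope].spawn {
      Thread.sleep(50)
      completed.incrementAndGet()
      ()
    }
    summon[Scope].spawn {
      Thread.sleep(10)
      completed.incrementAndGet()
      ()
    }
  }
  completed.get() // read only after group has waited for both children
```

One simplification to keep in mind: children that spawn further children after joinAll has taken its snapshot are not awaited here, which a real implementation would have to handle.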
Here Future.apply merely doubles as both an Async.group (note that it still takes an implicit Async context to create a Future) and a one-value Async.Source (i.e. something async computations are allowed to poll for a value) that can be polled multiple times. Perhaps it should be made clearer from the example, but you do not have to go through Futures at all.
Btw, I recently raised the question of whether or not we should encourage using Futures in some of our discussions about gears, and this would be a very valuable input to consider. Thanks so much again for the detailed write-up!
P.S.: pretty much like Kotlin, to get the user and order concurrently one can wrap them in Futures:
def fetchUser()(using Async): User = ???
def fetchOrder()(using Async): Order = ???
def handle(using Async): Response = {
  val user = Future { fetchUser() }
  val order = Future { fetchOrder() }
  Response(user.value, order.value)
}
@natsukagami could you please give further details on Response and Async in your example?
Response is just something that constructs a response, I guess. It's not related to async at all.
As for Async, perhaps the README outline explains it in some detail:
Async Context: An async context is a capability that allows a computation to suspend while waiting for the result of an async source. This capability is encapsulated in the Async trait. Code that has access to a (usually implicit) parameter of type Async is said to be in an async context. The bodies of futures are in such a context, so they can suspend.
@natsukagami thanks for the response!
I wrote this issue because Future is emphasized in Martin's presentations, and also in this project's README and implementation. And a sample like this is almost 100% the async/await pattern, except for the nascent cancellation model.
The design is a little different from Kotlin's Coroutines. In Kotlin, if you write this, the compiler will trigger an error, in spite of this being marked with suspend:
suspend fun handle(): Response {
    val user = async { findUser() }
    val order = async { fetchOrder() }
    return Response(user.await(), order.await())
}
The issue is that the async function isn't available, because it needs a CoroutineScope. The IDE won't even suggest it, because it's an extension method. In my sample, it is made available by the coroutineScope utility. To expose the equivalent of a Future, one might be tempted to do this:
suspend fun foo(): Deferred<Unit> =
    coroutineScope {
        async {
            runInterruptible {
                Thread.sleep(2000)
                println("Hello!")
            }
        }
    }
fun main(args: Array<String>) =
    runBlocking {
        val d = foo()
        println("Started foo")
        d.await()
        println("Finished foo")
    }
Except that this will always return a completed Deferred. And note how this is a small mistake that most of the time has an impact on performance, not correctness. Thus, the program's output would be:
Hello!
Started foo
Finished foo
Kotlin's equivalent of implicit parameters is the "context receivers". The coroutineScope function has this signature:
suspend fun <R> coroutineScope(block: suspend CoroutineScope.() -> R): R
You can view that CoroutineScope as an implicit that a function like async needs to work. Without it, no concurrent execution for you! And it's actually hard to execute tasks in a "fire and forget" fashion (hard, as in, you have to search the web for the answer), as things have to execute in the GlobalScope, and it's interesting because the first answer I got turns my suspended function into a regular one:
fun foo(): Deferred<Unit> =
    GlobalScope.async {
        runInterruptible {
            Thread.sleep(2000)
            println("Hello!")
        }
    }
Basically, Kotlin makes it hard to leak unfinished Deferred (aka Future) results in function signatures. To compare with Gears, if it had similar behavior, these functions would return Future values that are already complete, unless the implementation tried really hard for that to not happen (in which case the using Async should be unnecessary):
def fetchUser(using Async): Future[User] = ???
def fetchOrder(using Async): Future[Order] = ???
Speaking of the cancellation model, the current design isn't adequate because it cannot wait for the completion of that cancellable. This can lead to leaked resources. I think this was already mentioned in issue #12, but it's worth repeating it with Kotlin's equivalent:
public interface Job {
    public fun cancel(cause: CancellationException? = null)
    public fun invokeOnCompletion(handler: CompletionHandler): DisposableHandle
}
In other words, the cancel method isn't blocking, but it allows you to install a "completion handler" which will signal when cancellation is complete. This is important, because coroutine scopes end when all children are complete, even in the case of cancellation. This function, for example, will always complete in 10 seconds instead of just 1 second:
suspend fun bar() =
    withTimeout(1.seconds) {
        // This isn't cancellable; to make it cancellable,
        // we could wrap it in a runInterruptible block.
        withContext(Dispatchers.IO) {
            Thread.sleep(10000) // sleep for 10 seconds
        }
    }
Similarly, this will also complete in 10 seconds, even though it could complete sooner if we cooperated with Java's interruption protocol by using runInterruptible:
suspend fun baz() =
    coroutineScope {
        // Not cancelable
        launch(Dispatchers.IO) {
            Thread.sleep(10000)
            println("Done!")
        }
        // Not cancelable
        launch(Dispatchers.IO) {
            Thread.sleep(1000)
            throw RuntimeException("Boom!")
        }
    }
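The cancellation protocol discussed in this comment (cancel only requests cancellation, and the caller separately waits for completion so that cleanup is guaranteed to have run) can be sketched in Scala with plain threads. This is only a sketch under assumptions: the Job class and the cancelAndWait helper below are hypothetical names for this example, not the gears or kotlinx.coroutines API, and cancellation relies on Java's interruption protocol.

```scala
import java.util.concurrent.CountDownLatch
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical minimal job; NOT the gears or kotlinx.coroutines API.
final class Job(body: () => Unit):
  private val finished = new CountDownLatch(1)
  private val thread = new Thread(() => runBody())
  thread.start()

  private def runBody(): Unit =
    try body()
    catch case _: InterruptedException => () // cancellation request observed
    finally finished.countDown()             // signalled only after the body's cleanup ran

  // Non-blocking: only requests cancellation by interrupting the thread.
  def cancel(): Unit = thread.interrupt()

  // Blocks until the body, including its finally blocks, has completed.
  def join(): Unit = finished.await()

def cancelAndWait(): Boolean =
  val released = new AtomicBoolean(false)
  val work: () => Unit = () =>
    try Thread.sleep(10000)      // stands in for interruptible work
    finally released.set(true)   // resource cleanup
  val job = new Job(work)
  job.cancel()
  job.join()
  released.get()
```

Without the join step, the caller could observe released as false right after cancel returns, which is the resource-leak hazard described above.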
P.S.: pretty much like Kotlin, to get the user and order concurrently one can wrap them in Futures:
def fetchUser()(using Async): User = ???
def fetchOrder()(using Async): Order = ???
def handle(using Async): Response = {
  val user = Future { fetchUser() }
  val order = Future { fetchOrder() }
  Response(user.value, order.value)
}
I'm really tired of writing Future everywhere; I think the syntax below is better for an end user.
Refs: Zig's suspend functions: https://github.com/ziglang/zig/issues/6025. Refs: Dart's cancellable Future.
Typing Future[X] requires me to press Shift + [, then the type, and then Shift + ] again. Please make it fluent, without any Future[XYZ] leaking into the user's codebase. I really love Go's go blablabla.
You can argue it's a bad idea, but Go has many users just because of its go keyword.
def fetchUser()(using Async): User = ???
def fetchOrder()(using Async): Order = ???
def handle(using Async): Response = {
  async val user: User = fetchUser()    // starts now, triggered by the definition
  async val order: Order = fetchOrder() // starts now, triggered by the definition
  Response(user, order)
}
and
def fetchUser()(using Async): User = ???
def fetchOrder()(using Async): Order = ???
def handle(using Async): Response = {
  lazy async val user: User = fetchUser()
  lazy async val order: Order = fetchOrder()
  Response(user, order) // starts when the values are polled, triggered by pulling
}
The (using Async) should not be there, and the compiler should generate it for me.
@natsukagami Both Zig and Dart are still in the design process, so maybe you have some design ideas to share.
@alexandru Thanks for your comments! I am of course well aware of the history of futures and async/await, from the origins by Liskov to the implementations in a dozen languages. I am also aware of the notes on structured concurrency you linked to and think the concepts make a lot of sense. Like you, I don't think Scala's current futures should have much of a future, so it's fine if the industry really moves away from them. My argument against scala.concurrent.Future was mostly an esthetic one: Futures are concurrently running computations yet they are composed monadically as if they are values. That's an awkward and error-prone fit, forced on us because the JVM did not have a cheap native way to suspend a computation without blocking a thread. Your post gives lots of good examples of possible errors this can lead to, so I can now back up my unease about scala.concurrent.Future much better. I also plan to teach this to a large class later this term, and your counter examples will be really useful for that.
That said, the connection with our current effort in gears is much more nuanced than you state.
First, I should state that gears is very much an explorative project, undertaken by some students and myself. It has nothing to do with Scala 3 and there are no plans to put it in the standard library or something like that. We want to find out how our ideas for a simple and straightforward system dealing with functional concurrency shape up as we venture into this further.
Second, I note that some of our differences might come from a difference in mindset. Your stated concerns are all about imperative effects. You rightly worry there might be a side effect in an unforced concurrent computation happening at unpredictable times that could be overlooked, and that would be really bad. I agree with the concerns, but I think that the stated problems with old-style futures will not apply to what we explore here (more on that below).
My vision for futures in gears is much more functional in nature. Futures embody the principle of parallel computation. They are the most logical building block of lenient evaluation. We know about strict vs lazy: Strict evaluates once something is defined, whereas lazy evaluates once it is used. Lenient is the logical generalization when you have parallelism. It means a computation can start when it is defined and must finish when it is used. In between these two time points it can be scheduled at will. That's what a future is. And the first use case for a future is indeed some purely functional computation that we want to spawn out. We do not want to use a coroutine or process for that since we still want to observe termination (either normally or by an exception). So we need to sync on a result. This is the essence of dataflow computation (see also Morgan Stanley's Optimus Cirrus platform for a concrete realization of these ideas at scale). That said, one can discuss the details whether we want to explicitly wait on a future's result or whether this should be done implicitly. We are currently opting for explicit as the most straightforward conservative choice.
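To make the strict / lazy / lenient distinction above concrete, here is a small sketch. It uses scala.concurrent.Future from the standard library purely as a stand-in for "a computation that may start when defined and must have finished when used"; it is not a gears future, and the function names strictVsLazy and lenient are made up for this example.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.*

// Strict vs lazy: record when each computation actually runs.
def strictVsLazy(): List[String] =
  val log = ArrayBuffer.empty[String]
  def compute(tag: String): Int =
    log += s"compute $tag"
    21
  val strict = compute("strict")       // evaluated here, when defined
  lazy val deferred = compute("lazy")  // evaluated at first use
  log += "both defined"
  val sum = strict + deferred
  log += s"sum = $sum"
  log.toList

// Lenient: may start once defined, must have finished by the time it is used.
def lenient(): Int =
  val answer = Future(21 * 2)          // can be scheduled any time from here on
  // ... unrelated work could overlap with the computation here ...
  Await.result(answer, 5.seconds)      // explicit wait at the point of use
```

The explicit Await.result mirrors the "explicitly wait on a future's result" choice mentioned above; an implicit variant would hide that call.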
Third, I believe the comments about unwanted concurrency are addressed in gears through the Async capability: you can't suspend without it. In fact, gears disentangles async computation from futures. Futures are the primary (in fact, the only) means to start a concurrent computation. But if you just want to compute a result which entails waiting for some other events (such as futures, or channels, or i/o, or timers) you don't want to return a future. @natsukagami already showed how we would code the Kotlin example you gave in gears.
def fetchUser()(using Async): User = ???
def fetchOrder()(using Async): Order = ???
def handle(using Async): Response = {
  val user = fetchUser()
  val order = fetchOrder()
  Response(user, order)
}
No futures here. In fact, the closest analogue to a suspend keyword would be the (using Async) clause. This states that running fetchUser needs an Async context, which will allow it to suspend.
With scala.concurrent.Future that did not work, since you communicated possible suspensions by returning a Future result. In gears you use using Async instead. It's a nice demonstration of the duality of capabilities and effects.
This is also reflected operationally. Scala 2 style async/await emulates stackless coroutines, so on every function call you have to snap back to Future results. gears requires continuations or full coroutines/fibers, so you can wait for an event from within a deep call stack.
And that also addresses your saveStateInDB concern. You would not refactor
def saveStateInDB(state: State): Unit
to
def saveStateInDB(state: State): Future[Unit]
If you want to suspend, you would write instead
def saveStateInDB(state: State)(using Async): Unit
Even if you tried to change erroneously to
def saveStateInDB(state: State): Future[Unit]
the type system would most likely catch you out, since you can't implement a Future without an Async capability and you are not given one here. Aside: there is a way to circumvent this using Async.blocking, by creating an Async capability by blocking a thread, but that's probably going to be heavily discouraged.
Fourth, about structured concurrency: Yes that's built in.
Finally, about function coloring: Capabilities are actually much better here than other languages' proposals such as suspend or async, which feel clunky in comparison. This becomes obvious when you consider higher order functions. Capabilities let us define a single map (with no change in signature compared to now!) that works for sync as well as async function arguments. That's the real breakthrough here, which will make everything work so much smoother. I have talked about this elsewhere and this response is already very long, so I will leave it at that.
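A minimal sketch of the higher-order-function point: the standard List.map is used unchanged with both a plain function and a function that needs a capability, because the closure passed to map simply captures the capability from its enclosing scope. The Async trait, Async.blocking, fetchName, double and syncAndAsync below are hypothetical stand-ins, not the gears API, and nothing here actually suspends.

```scala
// Hypothetical stand-ins; NOT the gears API.
trait Async
object Async:
  def blocking[T](body: Async ?=> T): T = body(using new Async {})

// A "suspendable" function: the capability is its only colour.
def fetchName(id: Int)(using Async): String = s"user-$id"

// A plain function with no capability requirement.
def double(n: Int): Int = n * 2

def syncAndAsync(): (List[Int], List[String]) =
  val ids = List(1, 2, 3)
  // Ordinary use of List.map with a synchronous function.
  val doubled = ids.map(n => double(n))
  // The very same List.map; the lambda captures the Async in scope.
  val names = Async.blocking { ids.map(id => fetchName(id)) }
  (doubled, names)
```

With a keyword-based colouring such as suspend, the second call would typically need a separate, suspend-aware variant of map; here the signature of map does not change.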
In summary: these are not your old style scala.concurrent.Futures by any means. If that confusion persists, we might want to choose a different name for them in the end. On the other hand, gears futures are very much in the spirit of the original future designs by Liskov et al. more than 40 years ago. So it seems a shame to give up on the name just because the concept was perverted for a while by the need to flatMap futures for lack of alternatives.
Link to the "Direct Style Scala" talk: https://youtu.be/0Fm0y4K4YO8?si=O4F7Qs2bERdgplAQ
I am very curious what you think of rust's model, especially how tokio uses it, and how it compares to your suggestions.
Futures are a type in Rust that encapsulates executable code that is launched and then polled for status by a scheduler. They can be passed around and stored, and they do not run until something executes them, such as tokio::spawn or a call to .await inside an async function. This presents other limitations (async closures w/ variables are not a stable Rust feature yet, for example, and tokio::spawn can be hard to tease errors out of, similarly to goroutines).
From my reading of your discussion I would suggest that yes, Rust's use of async adds function coloring, but concurrent execution doesn't happen at all, and Rust's borrow checker and thread-safe memory guarantees eliminate number 3 without it even being a core focus (in fact, not being able to share memory easily between async routines is a humongous bugbear in Rust). The function coloring is annoying but, with rare exceptions (closures, traits and recursion being the big ones), does not pose limitations that need to be overcome; nothing is lost.
I'd encourage you to read about it, as it might give you different perspective on how this may be tackled.
(see also Morgan Stanley's Optimus Cirrus platform for a concrete realization of these ideas at scale)
For reference: https://github.com/morganstanley/optimus-cirrus; for technical details https://skillsmatter.com/skillscasts/13108-monad-i-love-you-now-get-out-of-my-type-system is probably the best entry point.
Aka context receivers. Don't people find code like this weird?
(block: suspend CoroutineScope.() -> R)
I did read an intro to context receivers but code like this just baffles me. Why should a concept as simple as an implicitly passed parameter be mangled up like this? I would assume it's the same as
(CoroutineScope ?=> R)
which has a clear semantics (given in our Simplicitly paper). Why not do that? What's the point in warping the meaning of this to mean "some implicitly passed argument"? But maybe that's just me. Do others find this natural?
@erikh It looks like tasks are the closest things in Gears that correspond to Rust futures. A task is simply a lambda that takes an Async context and returns a Future. It can start running only when instantiated with an actual async context.
Yes that sounds very similar. In rust's case, tokio more or less ends up providing the context externally, and rust provides the language "stapling" for it to function by way of std::future::Future's trait definition.
------- Original Message ------- On Monday, September 25th, 2023 at 8:00 AM, odersky @.***> wrote:
@erikh It looks like tasks are the closest things in Gears that correspond to Rust futures. A task is simply a lambda that takes an Async context and returns a Future. It can start running only when instantiated with an actual async context.
Aka context receivers. Don't people find code like this weird?
(block: suspend CoroutineScope.() -> R)
I did read an intro to context receivers but code like this just baffles me. Why should a concept as simple as an implicitly passed parameter be mangled up like this? I would assume it's the same as
(CoroutineScope ?=> R)
which has a clear semantics (given in our Simplicitly paper). Why not do that? What's the point in warping the meaning of this to mean "some implicitly passed argument"? But maybe that's just me. Do others find this natural?
Because you can then call the methods defined on CoroutineScope inside the lambda; with Scala you have to import the methods into scope.
Just my two cents: with JDK 21, I think the line between async and sync is blurry. And in future releases, the virtual thread's carrier thread may be user-defined ad hoc.
So what I would like to see is gears as a high-performance async runtime; with that, I expect neither Future nor Async in method parameter lists or return type signatures.
With Caprese, I think this can be done.
Just my two cents: with JDK 21, I think the line between async and sync is blurry. And in future releases, the virtual thread's carrier thread may be user-defined ad hoc.
quoting @deanwampler here:
Scala now runs on three platforms: JVM, JavaScript, and native, so even on the JVM, using the new lightweight fibers coming in Project Loom would only be available to users on the most recent JVMs.
I'm unsure if anybody mentioned this yet, and I don't know where to open this discussion, but Future and async/await aren't a good programming model, and I hope that, for whatever new effect system gets developed for Scala, this model gets left behind.
I'm adapting here the sample based on this project's README:
Equivalent TypeScript/JavaScript code:
This model is the async/await pattern. It was implemented most prominently in C# and TypeScript/JavaScript. And it's error-prone, despite its popularity. The reasons are:
The problem is that function calls require the await keyword (or .value in the Scala sample). Its absence leads to concurrent execution, and this is counterintuitive, always. In the sample above, the 2 methods could be using the same database connection, and that connection may not be safe for concurrent use.
To make matters worse, "coloring" in this case isn't very type-safe if we don't care about the returned value. For example, consider that we have this function doing blocking I/O:
And we change it to an async one, changing its color:
In our code, the compiler will not warn us at the call-site when this change happens. At best, it may warn us that an unused Future[Unit] value gets ignored, but that's an unreliable warning in my experience. So code like this actually changes behavior in a dangerous way, as the call-site becomes "fire and forget":
Kotlin's Coroutines
Compare with Kotlin's Coroutines, which should be the current benchmark for UX, instead of the old async/await:
Note, there is no await keyword. That code looks just like blocking I/O code from Java, except for the suspend keyword, which introduces "function coloring". This coloring is actually useful, though, because going from this:
To this:
Will make this code break if it's not in a coroutine scope, or otherwise the behavior of the program doesn't change. In other words, the compiler goes to greater lengths to ensure that the "color" matches, in a way that you can't do it with plain types:
And executing stuff asynchronously requires extra effort:
This function does more than meets the eye, because Kotlin also introduced "structured concurrency". In this case, we have 2 async tasks, and to ensure that no resources get leaked:
the handle() function cannot return with a result or an exception until all async children complete, even if we don't await their result (e.g., using launch instead of async from Kotlin's library).
For a good introduction, see: Roman Elizarov, "Structured concurrency" (YouTube).
Java's Virtual Threads
Kotlin's Coroutines are very similar in spirit to Java's Project Loom (JEP 425).
Java preferred to return to blocking I/O, because it's a language that was built for blocking I/O (e.g., checked exceptions, try-with-resources, etc.). And in JEP 428: Structured Concurrency they are even introducing utilities for the "structured concurrency" approach (with more boilerplate, but it does the job):
Again, notice how findUser() and fetchOrder() are blocking functions, and if we want to execute them in parallel, we need extra effort. And while a Future data type is used here, that's only for awaiting on it in the same function, otherwise its use is completely encapsulated in the implementation. And if one of the tasks fails, the StructuredTaskScope.ShutdownOnFailure ensures that the other task gets cancelled, and that the handle() function doesn't return until all tasks finish or get cancelled.
Scala's Cats-Effect
Even in Scala, we actually have all the benefits from above via our current monadic effect systems, in addition to tracking side-effects:
No concurrent execution here, no threads got hurt in the process. This separation between expressions and their interpretation (referential transparency) is the main benefit of monadic effects. Note that you can get some of those benefits back, in normal Java code, by just working with blocking I/O, because here the real problem has been Future all along, and not necessarily imperative programming 😉
If we wanted concurrent execution, the direct translation is error-prone, due to start being a "primitive":
This is error-prone because the parent fiber doesn't wait for its children. I hear that ZIO does that, but haven't checked. This means that if one of the fibers ends in error, the other doesn't get cancelled automatically. But start is known to be low-level, so one might specify parallelism explicitly:
Or, a closer translation to the "structured concurrency" approach is to use Resource and background. This ensures that if one fiber fails, the other gets cancelled:
Others
A popular blog post that advocates for "structured concurrency" is: Notes on structured concurrency, or: Go statement considered harmful.
What's interesting about it is that it attacks the very existence of a go statement in the Go programming language:
This starts a "goroutine", meaning that the function gets executed asynchronously. For our purposes, this is like Future.apply in Scala, scope.fork in Java's JEP 428, or launch in Kotlin's coroutines. It's considered "harmful" because it's "fire and forget", which leads to leaked resources and complexity, the author comparing it to the GOTO statement.
Interestingly, the Go programming language does not have an await keyword, even if it does M:N scheduling. Alongside Kotlin, it's one of the first languages to go that route.
TLDR: I hope that Scala does not implement async/await and moves away from Future, because the industry is moving away from it.