Open RaasAhsan opened 3 years ago
I've put quite a bit of thought into this! I'll try to brain dump here when I can. I think this would be a great thing to get started on. It's in a similar vein to the fiber-aware work-stealing scheduler concepts, since a Loom scheduler would be Thread-aware. It also has the potential to make blocking things much less of a hassle for users, though we can't make them go away entirely since native blocking exists.
Anyway, more thoughts when I have time.
First, as a bit of table-setting, let's be clear about what Loom is and what it currently offers.

Loom converts `java.lang.Thread` into an abstraction (well, more of an abstraction). Whereas at present a `Thread` corresponds to exactly one underlying pthread, which is itself a real kernel thread, Loom divorces `Thread` from pthreads: a running `Thread` will have an underlying pthread, but that pthread is not guaranteed to be stable across the `Thread`'s lifetime, and many other `Thread`s may also share that pthread if the first one yields.

That's a long-winded way of saying that Loom converts `Thread`s into `Fiber`s backed by an `Executor`. Optionally. You need to create `Thread`s using a slightly different mechanism in order to get these benefits, but beyond that it's just a normal `java.lang.Thread`.
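To make the "slightly different mechanism" concrete, here is a minimal sketch using the builder API that eventually shipped in JDK 21 (an assumption on my part that you're running a Loom-enabled JDK; the API was still in flux during the original discussion):

```scala
// Sketch (JDK 21+): creating a virtual Thread. The builder replaces
// `new Thread(...)`, but the result is still a normal java.lang.Thread.
object VirtualThreadDemo {
  def main(args: Array[String]): Unit = {
    val vt: Thread = Thread
      .ofVirtual()                  // virtual-thread builder
      .name("demo-virtual-thread")
      .start(() => println(s"virtual? ${Thread.currentThread().isVirtual}"))
    vt.join() // prints "virtual? true"
  }
}
```

Everything else about the resulting `Thread` (joining, naming, thread locals) behaves like the `java.lang.Thread` you already know.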
At any rate, this has some subtle impacts on existing APIs. Most notably, `Thread#yield` actually does the right thing (it's basically equivalent to a `cede`), and `Unsafe#park` (most commonly accessed via other APIs like `Object#wait` or `Thread.sleep`) will deschedule the `Thread` rather than hard-blocking the kernel thread. This descheduling is represented explicitly to the `Executor`, which receives the continuation as an object it can stick back into a task queue.

Terminology: real kernel threads are referred to as "carrier threads", while fiber-like threads are referred to as "virtual threads". We sometimes use the "carrier thread" terminology in CE's implementation as well.
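The descheduling behavior is easy to observe (a sketch assuming a Loom-enabled JDK, 21+): when a virtual thread parks via `Thread.sleep`, only the virtual thread is descheduled, so thousands of concurrent sleeps complete in roughly one sleep duration of wall-clock time:

```scala
import java.util.concurrent.Executors

// Sketch (JDK 21+): Thread.sleep on a virtual thread parks the *virtual*
// thread; the carrier is handed back to the scheduler. 10,000 concurrent
// 100ms sleeps therefore finish in roughly 100ms of wall-clock time,
// without 10,000 kernel threads ever existing.
object ParkingDemo {
  def main(args: Array[String]): Unit = {
    val exec  = Executors.newVirtualThreadPerTaskExecutor()
    val start = System.nanoTime()
    val futures = (1 to 10000).map { _ =>
      exec.submit(new Runnable { def run(): Unit = Thread.sleep(100) })
    }
    futures.foreach(_.get())
    exec.shutdown()
    println(s"elapsed: ${(System.nanoTime() - start) / 1000000}ms")
  }
}
```

Running the same program with platform threads would either exhaust kernel resources or be throttled by the pool size.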
Since carrier threads are never blocked by `Unsafe#park`, it's tempting to say that this entirely removes the need for explicit asynchrony and CPS. There are some very important practical caveats though. Running down the list:

- Native blocking (anything that blocks inside an FFI call rather than via `Unsafe#park`) will still block the carrier thread. `new URL` is probably the most famous example of this, since it does DNS lookups using the operating system's DNS client, all of which are synchronous.
- The same concern applies to `java.io` primitives. At the very least, behavior will be very platform-specific and unreliable, not words you want to hear in relation to your thread pools.
- While Loom converts certain synchronous constructs into asynchronous ones (a `CountDownLatch` is effectively just an async suspend with Loom), it doesn't directly help you, since you don't have access to the underlying continuation.
- Loom was expected to preserve `Thread` identity across continuations and carrier threads, which could in turn make it easier to implement things like tracing without high wizardry. This does not appear to have come to fruition, which is disappointing. Virtual `Thread`s do maintain `ThreadLocal`s and other `Thread`-affine attributes, but that's basically just in service of legacy Java APIs. There does not appear to be very much present which can help higher-level runtimes like `IO`.
- `Thread#interrupt` is a trainwreck and Loom doesn't fix it, though it does localize the interrupt bit to the virtual rather than the carrier thread.

In some sense, you can actually just see virtual threads as a fancy way of interacting with an `Executor`, and you wouldn't be wrong. The only thing added above and beyond fancy ways of interacting with an `Executor` is the magic support for `Unsafe#park`, which unlocks two potential benefits for Cats Effect:
- Today, users must wrap blocking calls in `blocking` or `interruptible` so as to get them off of the compute pool. With Loom, this overhead (cognitive and runtime) can be avoided… sometimes. And this is the dangerous thing: we can't just tell users "don't worry about blocking things", because some blocking things will still block. An interesting experiment would be whether Loom gives us enough machinery to reasonably detect a real blockage of the carrier thread, at which point we could detach it from the compute pool, move it transparently to the blocking pool, and allocate a new carrier. That's still more expensive than `blocking` though.
- A much simpler `IO.Async`. This is where the real fun is, I suspect. The implementation of `async` could, in theory, just reduce to a `var` and an `Object#wait` call. Literally block until notified, then read from the `var` (which would be of type `Either[Throwable, A]`). The blockage will unattach the virtual thread, and we can just reattach it and wake up the `wait` when the callback is invoked. This avoids extra shifting, keeps pool affinity, and could potentially be much simpler overall.

There's also a third potential benefit, if some of the low-level APIs are more accessible than I think they are: we could get better metrics and introspection over fiber evaluation, which could in turn give us better mechanisms for tracing and other goodies for users.
Overall, I think all of this is worth experimenting with. If we can get useful implementations out of it, gated by a static version check, then we could potentially commit them into the series/3.x branch and spin up a matrix build that uses one of the Loom-enabled OpenJDK builds. That would be pretty cool.
For posterity, @vasilmkd has done some experimentation on this branch: https://github.com/vasilmkd/cats-effect/tree/loom
To avoid setting high expectations, the branch that Daniel has linked to above is a very stupid implementation where every "schedule a fiber on a thread pool" operation (including resuming from async points, not just when starting new fibers) is executed on a new virtual thread, which has huge overhead.
There is no 1:1 mapping between fibers and virtual threads on that branch, like a real Loom implementation would.
@vasilmkd The Spring folks figured it out; maybe it's worth back-porting something similar?
Cross-linking Daniel's Reddit post, which goes into detail on why Loom doesn't matter so much for Cats Effect: https://www.reddit.com/r/scala/comments/sa927v/comment/htsoydn/
Thank you @armanbilge, you're very helpful, as always...
Cross-linking this Loom-related PR for anyone following along here :)
Honestly I feel inclined to close this issue. At the moment I personally cannot foresee any other Loom-related changes, but of course we are always open to experiments!
I'm not really sure how feasible it is to run Scala on the Loom Early Access JDK builds, but we should start thinking about what `IO` would look like in that world. Some questions I'm interested in:

- How relevant is `Async` going to be?
- Will each `Fiber` correspond to a Loom virtual thread?
- Can `IO` work seamlessly across different kinds of schedulers?
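On the scheduler question, one hypothetical direction (a sketch assuming JDK 21+, not any actual CE API) is to treat Loom's virtual-thread-per-task executor as just another `ExecutionContext`, so fiber-style code gets a fresh virtual thread per task:

```scala
import java.util.concurrent.Executors
import scala.concurrent.duration._
import scala.concurrent.{Await, ExecutionContext, Future}

// Sketch: wrapping the virtual-thread-per-task executor (JDK 21+) as a
// Scala ExecutionContext, so Future (or IO via evalOn) runs each task
// on its own virtual thread.
object VirtualScheduler {
  def main(args: Array[String]): Unit = {
    val exec = Executors.newVirtualThreadPerTaskExecutor()
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(exec)
    val f = Future(Thread.currentThread().isVirtual)
    println(Await.result(f, 5.seconds)) // prints "true" on JDK 21+
    exec.shutdown()
  }
}
```

Whether a one-`Fiber`-per-virtual-thread mapping pays for itself is exactly the open question above; this just shows the two worlds compose mechanically.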