Open swapdonkey opened 7 months ago
It seems there are multiple things going on here. The essential thing is that when a fatal exception is detected (which InterruptedException is), we're shutting down the whole world (meaning all the IORuntimes in the current JVM).
There are incorrect things I see in the behavior of CE with the code above:
- The IORuntime doesn't seem to shut down cleanly.
- Another IORuntime is created, which also doesn't shut down cleanly.

These should be fixed, but fixing them won't make the code example "work", at least I don't think so: the idea is that if there is a fatal exception, the most we can hope for is a clean shutdown.
What would you consider "correct behavior" for this code example?
So the tricky thing here is that InterruptedException, like all fatal errors, fully torpedoes the runtime. By the time you get to the second iteration, the threads are all shut down and gone, so it's impossible to use global to execute anything. The design expectation is that you're not going to try to execute anything after this point, and you'll basically just shut down the VM.
Of course, that's not what you want here since you're trying to run new things after we torpedo the runtime. The answer is probably for you to not use global and instead make your own IORuntime (probably using the builder). Whenever you catch and recover from a fatal error, build a new runtime for yourself and discard the old one.
> So the tricky thing here is that InterruptedException, like all fatal errors, fully torpedoes the runtime. By the time you get to the second iteration, the threads are all shut down and gone, so it's impossible to use global to execute anything
Hmm, but maybe we can fix this. We already made some changes so that when the global runtime shuts down, it removes itself so that a new one may be installed.
Not entirely sure why that's not working in this instance.
> Not entirely sure why that's not working in this instance.

It's working. The second iteration of the while loop executes the IO on a different IORuntime. But then the fatal failure handling code doesn't entirely run, because globalFatalFailureHandled is already true.
As per @durban's comment, creating the runtime on every iteration still causes the second iteration to not throw an exception outside of the IO.
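import java.util.concurrent.ExecutionException
import scala.concurrent.Await
import scala.concurrent.duration.Duration
import cats.effect.IO
import cats.effect.unsafe.IORuntime

while (true) {
  // Build a fresh runtime on every iteration instead of relying on the global one.
  val (compute, _) = IORuntime.createWorkStealingComputeThreadPool()
  val (blocking, _) = IORuntime.createDefaultBlockingExecutionContext()
  val (scheduler, _) = IORuntime.createDefaultScheduler()
  val ioRuntime = IORuntime.builder()
    .setCompute(compute, () => ())
    .setBlocking(blocking, () => ())
    .setScheduler(scheduler, () => ())
    .build()
  val f = IO {
    throw new InterruptedException("Test")
  }.attempt.unsafeToFuture()(ioRuntime)
  try {
    Await.result(f, Duration.Inf)
  } catch {
    case t: ExecutionException => println(s"Underlying exception ${t.getCause}")
  }
}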
To give some context to the implementation: we are migrating from Cats Effect 2 to 3. Our current implementation runs the IO in separate threads which are managed via JMX, so an InterruptedException from the IO is caught in the thread, the interrupt is reset, and the thread then continues. Though it's not an ideal mechanism, I was hoping that with Cats Effect 3 we would be able to keep the same handling to minimise the changes: a separate IORuntime for each thread. When the interrupt is thrown, it's caught in the main thread, which resets the interrupt, recreates the IORuntime, and continues.
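The legacy pattern described above can be sketched with plain JVM threads, independent of Cats Effect. This is only an illustration of the catch, reset, and continue loop; `InterruptLoop` and `run` are hypothetical names, not part of our code base or of any library.

```scala
object InterruptLoop {
  // Runs `task` once per iteration on the current thread. An InterruptedException
  // is caught, the thread's interrupt flag is cleared, and the loop continues.
  // Returns how many iterations were recovered from an interrupt.
  def run(iterations: Int)(task: Int => Unit): Int = {
    var recovered = 0
    var i = 0
    while (i < iterations) {
      try task(i)
      catch {
        case _: InterruptedException =>
          Thread.interrupted() // reads and clears the current thread's interrupt flag
          recovered += 1
      }
      i += 1
    }
    recovered
  }
}
```

For example, `InterruptLoop.run(3)(i => if (i == 0) throw new InterruptedException("Test"))` recovers from the failure on the first iteration and still runs the remaining two.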
> There are incorrect things I see in the behavior of CE with the code above:
> - The IORuntime doesn't seem to shut down cleanly.

This seems to be a bug, see #4066 and #4067.
The code below catches an ExecutionException which wraps the InterruptedException on the first iteration. On the second iteration no ExecutionException is thrown and the main thread blocks on the Await.
I think the issue is in IOFiber.onFatalFailure, specifically the line if (IORuntime.globalFatalFailureHandled.compareAndSet(false, true)).
On the first iteration globalFatalFailureHandled is initialised to false; on the second iteration it's now true, so the code block is never executed and the exception isn't raised.

If that is the case, could a function like resetFatalFailureHandled be added to IORuntime to reset it if the future returned from unsafeToFuture has its ExecutionException handled and retried in a loop?
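To illustrate why the second iteration never reaches the handler, here is a minimal sketch of a one-shot compareAndSet guard using only java.util.concurrent. It is not the Cats Effect implementation: `FatalGuard`, `onFatal`, and `reset` are hypothetical names, and `reset` only mirrors the resetFatalFailureHandled idea proposed above.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical stand-in for a "fatal failure already handled" flag.
final class FatalGuard {
  private val handled = new AtomicBoolean(false)

  // Runs `handler` only for the first caller after construction or reset.
  // Returns true if the handler ran, false if the guard was already set.
  def onFatal(handler: () => Unit): Boolean =
    if (handled.compareAndSet(false, true)) { handler(); true }
    else false

  // Hypothetical reset hook: re-arms the guard so the next failure is handled again.
  def reset(): Unit = handled.set(false)
}

object GuardDemo {
  def run(): List[Boolean] = {
    val guard = new FatalGuard
    val first = guard.onFatal(() => ())  // flag goes false -> true, handler runs
    val second = guard.onFatal(() => ()) // flag is already true, handler is skipped
    guard.reset()
    val third = guard.onFatal(() => ())  // guard re-armed, handler runs again
    List(first, second, third)
  }
}
```

In this sketch the second call is skipped for the same reason described above: the flag is still true from the first failure, and nothing sets it back to false unless something like `reset` is called.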