Open ViralBShah opened 4 years ago
Ctrl-C doesn't properly work in single threaded code either ;)
Works better, I think.
Seems like structured concurrency would help here, although whenever there's a @sync (explicit or implicit) it should be possible to make this work to the extent that threads can be interrupted successfully (so not 100%, but somewhat).
We just really need to stop having Ctrl-C throw regular exceptions. It's extremely surprising that everything can suddenly also throw interrupt exceptions (not to mention it not being modeled in the compiler).
Actually, with 1.4 (maybe even 1.3?) I notice that killing single-threaded Julia processes with Ctrl-C is cumbersome too. @timholy's explanation was helpful for understanding this.
I find that Ctrl-C is less well-behaved in 1.4 than it was pre-1.3, even for single-threaded code. You have to keep it pressed for a while, and then you get the big Julia stack trace.
I mentioned it in the other issue https://github.com/JuliaLang/julia/issues/25790#issuecomment-623163972 but it'd be nice to solve this with structured concurrency #33248.
When Ctrl-C'ing multi-threaded code, it crashes Julia altogether.
julia> function fib(n::Int)
           if n < 2
               return n
           end
           t = Threads.@spawn fib(n - 2)
           return fib(n - 1) + fetch(t)
       end^C

julia> fib(50)
^C^C^C^C^Cfatal: error thrown and no exception handler available.
InterruptException()
sigatomic_end at ./c.jl:425 [inlined]
task_done_hook at ./task.jl:442
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2144 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2322
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1692 [inlined]
jl_finish_task at /buildworker/worker/package_linux64/build/src/task.c:198
start_task at /buildworker/worker/package_linux64/build/src/task.c:697
unknown function (ip: (nil))
I recently bumped into a similar issue in some code I wrote. I had to quit Julia to stop the threads from running.
Ctrl-C is really hard because a large proportion of code is unsafe for async / non-cooperative cancellation by default. For example, there are all sorts of places in Base where we might be holding locks or other resources which aren't precisely protected by a try-catch when the "impossible" happens and a signal is received between creating the resource and "immediately" protecting it. I'm thinking of code like
lk = lock(obj)
# < what happens if we're interrupted here ?
try
    f()
finally
    unlock(lk)
end
cf. the Java Thread.stop() debacle.
Cancellation can be made safe by having a small number of well-defined and documented cooperative cancellation points (e.g., IO). This is what pthreads do (see man pthreads, "Cancellation points"). But this can result in Ctrl-C not actually cancelling the task for quite some time. Which isn't what you really want.
Structured concurrency helps a bit because it gives a systematic way for cleanup to propagate during cancellation. But in itself I don't think it helps resolve the Ctrl-C now-or-later, unsafe-or-safe conundrum.
Yep. We even actually already use cancellation points for this, it's just also not sufficient and causes other problems (such as, in the pthreads case, being unable to close file descriptors). Refs https://github.com/JuliaLang/julia/issues/6283
> We just really need to stop having Ctrl-C throw regular exceptions
One way to do this is to have Ctrl-C set a flag which is checked at cancellation points. That's well and good, but it does mean Ctrl-C won't cancel things right now, but rather at some later time. Possibly much later, or never if you happen to have written a tight infinite loop!
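To make the flag idea concrete, here is a minimal sketch, assuming a hand-rolled flag rather than anything Base provides. The names `checkpoint` and `busy_sum` are hypothetical, and the "Ctrl-C handler" is simulated by setting the flag directly. It also shows the downside described above: nothing happens until the loop reaches its next check.

```julia
# Hypothetical helper (not Base API): a cooperative cancellation point.
# It throws only where the caller explicitly asked for a check.
function checkpoint(flag::Threads.Atomic{Bool})
    flag[] && throw(InterruptException())
    return nothing
end

# Hypothetical workload: a tight loop that polls the flag only occasionally,
# so the hot path stays cheap.
function busy_sum(n::Int, flag::Threads.Atomic{Bool}; check_every::Int = 1_000)
    total = 0
    for i in 1:n
        total += i
        i % check_every == 0 && checkpoint(flag)
    end
    return total
end

flag = Threads.Atomic{Bool}(false)

# Normal run: the flag is never set, so the loop completes.
result = busy_sum(10_000, flag)

# Simulate a signal handler having set the flag before the next run.
flag[] = true
interrupted = try
    busy_sum(10_000, flag)
    false
catch err
    err isa InterruptException
finally
    flag[] = false   # reset so later work is not cancelled by a stale request
end
```

With `check_every` set very high (or no `checkpoint` call at all), the loop would never notice the request, which is the tight-infinite-loop case mentioned above.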
Any thoughts on how we could handle this? One option might be to extend our existing double-Ctrl-C handling. Currently I recall we avoid delivering InterruptException in ccall'd code which is expected to be unsafe for Julia exceptions. But even normal Julia code is actually unsafe for InterruptException! It's delivered asynchronously in a way that can't be easily modeled by programmers (or by the compiler?).
Yeah, I agree that there are problems outside of what structured concurrency can do. But my point is that, even if you can magically solve the problems you mentioned, there are a bunch of problems that are hard to solve without structured concurrency.
> lk = lock(obj)
> # < what happens if we're interrupted here ?
> try
I think this is why we should be recommending lock(...) do instead of manual try-finally. Inside the lock(f, ...) implementation, each lock can use some very low-level compiler machinery to ask that no cancellation point be inserted within the critical region.
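For readers unfamiliar with the do-block form, a small sketch follows. It uses only `ReentrantLock`, `lock(f, lk)`, and `islocked` from Base; the `increment!` helper and the counter are hypothetical. Note that it does not by itself close the interrupt window discussed in this thread. The point is only that the acquire/release pairing lives in one place, so any future fix inside `lock(f, lk)` would apply to every caller.

```julia
lk = ReentrantLock()
counter = Ref(0)

# Hypothetical helper: the do-block form keeps lock/unlock paired in one call.
function increment!(counter::Ref{Int}, lk::ReentrantLock)
    lock(lk) do
        counter[] += 1
    end
end

# The lock is released even when the body throws.
try
    lock(lk) do
        error("simulated failure inside the critical section")
    end
catch
    # Swallow the simulated error; we only care about the lock state.
end
released_after_error = !islocked(lk)

# Concurrent increments stay consistent because each one holds the lock.
@sync for _ in 1:100
    Threads.@spawn increment!(counter, lk)
end
```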
> Which isn't what you really want.
I think it's unavoidable in a performance-oriented language like Julia. Surely nobody wants random cancellation points in their carefully written tight loops. Using only I/O operations as cancellation points and letting people manually opt in via yield or something similar sounds like a good compromise.
> I think this is why we should be recommending lock(...) do instead of manual try-finally.
Absolutely! (The Base implementation of lock(f, lk) is exactly the code I quoted, but of course that could be fixed ;-) )
> When Ctrl-C'ing multi-threaded code, it crashes Julia altogether.