Open algunion opened 1 year ago
sleep has no upper bound on timeout (unless you are running a realtime kernel on custom hardware)
> sleep has no upper bound on timeout (unless you are running a realtime kernel on custom hardware)
We can agree on that - but the mere absence of an upper bound on the timeout doesn't seem to explain the behavior pointed out in the MWE I provided.

Please note that I am not questioning the equality `@time sleep(x) == x seconds` in some extreme or naive sense (e.g., that `sleep(x)` should always take at most `x` seconds). I suggest looking at the output of `task11` vs. `task22` and `task33` in the provided MWE.

From what I am observing, the issue I raised might have nothing to do with the actual duration of `sleep(x)` - it might very well be related to whether `sleep` calls are blocked/delayed if issued from tasks running on different threads while the main thread is busy (as you can see in the MWE - `task11` with an idle main thread performs as expected, but not `task22` and `task33`, which run while the main thread is busy).

I also acknowledge that your comment might have some technical implications I am unaware of.

P.S. I updated the initial issue body to decrease the chance of confusing this with some naive "why doesn't `sleep(5)` take exactly 5 seconds" issue.
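For concreteness, a minimal sketch of the effect described above (my construction, not the full MWE; assumes Julia is started with multiple threads, e.g. `julia -t 4`):

```julia
using Base.Threads

# sleep(1) on a worker thread should report ~1 s; while the main thread
# is stuck in Libc.systemsleep it cannot drive the libuv event loop,
# so the timer backing sleep() fires late
t = @spawn @time sleep(1)
sleep(0.1)            # let the task start and register its timer
Libc.systemsleep(5)   # busy main thread: no yields into the scheduler
wait(t)               # @time typically reports ~5 s instead of ~1 s
```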
Julia uses a libuv-based event loop under the hood. Processing certain things like Timers/IO depends on the event loop being run regularly. (`sleep` uses a libuv timer under the hood.)
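For illustration, `sleep(sec)` is roughly a wait on a libuv-backed `Timer` (a simplified sketch, not Base's exact definition):

```julia
# roughly what sleep(sec) does: register a libuv timer and block the
# current task until the event loop fires it
function sleep_via_timer(sec)
    t = Timer(sec)   # registers a timer with the libuv event loop
    try
        wait(t)      # returns only once the event loop processes the timer
    finally
        close(t)
    end
end
```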
Looking at `jl_process_events`, it seems like the event loop is only run from `tid == 0` or when `_threadedregion` is set. `jl_enter_threaded_region` is only called from `threading_run` (which is the base function for `@threads`). I am unsure why we still have this mechanism instead of allowing any thread to run the libuv event loop.
I really hope we can run the libuv event loop from any thread soon.

Especially for the `:interactive` threadpool, this issue can be really annoying: for a non-blocking `HTTP.jl` server running on the `:interactive` threadpool, small bursts of work on the main thread can delay or even deny requests (depending on the timeout).
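An illustrative sketch of that scenario (hypothetical port and handler; `HTTP.serve!` spawns its listen loop onto the `:interactive` threadpool):

```julia
using HTTP

# trivial handler; what we probe here is responsiveness, not the payload
server = HTTP.serve!("127.0.0.1", 8081) do req
    return HTTP.Response(200, "ok")
end

# a burst of synchronous work on the main thread; requests arriving in
# this window can stall or time out because the event loop is not driven
Libc.systemsleep(5)

close(server)
```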
But even if no work is done, spawning lots of tasks (as `HTTP.jl` does for new connections) can cause concurrent access to the libuv event loop from multiple threads - so this issue is pretty big regardless of being careful not to run non-trivial work on the main thread.
But more generally, interactive tasks are heavily impacted by this if there is some Timer/IO work involved (and it is awkward to always be careful not to run anything on the main thread if responsiveness of the `:interactive` threadpool is desired - well, that is always desired, because that is the point of the `:interactive` addition). But as I said - in a multithreading setup, if a large number of tasks are using Timers, this issue becomes visible even if no work is done on the main thread.

Let me know if more MWEs are needed in the above context.
On master, with the suggested snippet from https://github.com/JuliaLang/julia/pull/53422#issue-2147679408:

```julia
function make_io_thread()
    tid = UInt[0]
    threadwork = @cfunction function(arg::Ptr{Cvoid})
            @ccall jl_set_io_loop_tid((Threads.threadid() - 1)::Int16)::Cvoid
            wait() # spin uv_run as long as needed
            nothing
        end Cvoid (Ptr{Cvoid},)
    err = @ccall uv_thread_create(tid::Ptr{UInt}, threadwork::Ptr{Cvoid}, C_NULL::Ptr{Cvoid})::Cint
    err == 0 || Base.uv_error("uv_thread_create", err)
    err = @ccall uv_thread_detach(tid::Ptr{UInt})::Cint  # assign err so the check below tests the detach result
    err == 0 || Base.uv_error("uv_thread_detach", err)
    # n.b. this does not wait for the thread to start or to take ownership of the event loop
    nothing
end

make_io_thread()
```
the MWE from above produces:
```
[ Info: phase 1: control run - no main thread work
[ Info: worker 1 executing on thread 5
[ Info: sleeper 1 executing on thread 16
sleeper 1: 10.033890 seconds (109.12 M allocations: 6.504 GiB, 2.50% gc time)
worker 1: 10.213286 seconds (110.98 M allocations: 6.615 GiB, 2.53% gc time, 0.13% compilation time)
****************************************
[ Info: phase 2: main thread work without yield
[ Info: worker 2 executing on thread 5
[ Info: sleeper 2 executing on thread 15
[ Info: mainspin executing on thread 1
worker 2: 10.012252 seconds (119.87 M allocations: 7.145 GiB, 3.50% gc time, 0.13% compilation time)
sleeper 2: 10.024603 seconds (119.98 M allocations: 7.151 GiB, 3.51% gc time)
****************************************
[ Info: phase 3: main thread work with yield
[ Info: worker 3 executing on thread 2
[ Info: sleeper 3 executing on thread 12
[ Info: mainspin executing on thread 1
sleeper 3: 10.015027 seconds (113.49 M allocations: 6.764 GiB, 3.67% gc time)
worker 3: 10.025508 seconds (113.62 M allocations: 6.772 GiB, 3.67% gc time, 0.17% compilation time)
```
It seems that the behavior of the `sleep` function, when used in tasks running on different threads, is heavily impacted by work taking place on the main thread. Given the documentation of the `sleep` function, you would expect `@time sleep(1)` to report approximately 1 second. In the same way, you would expect ten consecutive `sleep(1)` calls to take about 10 seconds in total.
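A sketch of such a loop (my reconstruction; the original snippet is not reproduced here):

```julia
# ten consecutive sleeps; @time is expected to report ~10 s in total
@time for _ in 1:10
    sleep(1)
end
```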
However, the above expectation is violated (in a manner not explainable by the expected/usual variability of `sleep` and/or `@time`) when `sleep` is called from tasks running on different threads and either of the following two conditions is met:

- `Libc.systemsleep(n)` is called on the main thread (showcased in my MWE), or
- a large number of tasks call `sleep` concurrently (increasing the number of available threads results in worse results; see the sketch below).
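A minimal sketch of the second condition (hypothetical task count, not from the original issue):

```julia
using Base.Threads

# many tasks all waiting on libuv timers at once; per the report,
# wakeups degrade as the number of threads grows
tasks = [@spawn begin
             t = @elapsed sleep(1)
             t > 1.5 && @info "sleep(1) took $(round(t; digits=2)) s on thread $(threadid())"
         end for _ in 1:1000]
foreach(wait, tasks)
```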
To give a clear example, output like this would look normal:
However, the function `sleeper(1, 10, 1)`, running in tasks spawned on threads other than the main one, can output values like the ones below (when the main thread is busy):

Or, if you keep the main thread really busy for a longer period of time:
And this is not about some weird print-to-`stdout` competition between tasks (we also have a control function, `realworker`, not using `sleep`, to confirm my statement). All this might make more sense if you run the MWE provided below.
Important note: the current issue is to be interpreted in the context of Julia being started with multiple threads. For consistency, let's assume `julia -t 6,3 script.jl`.

Main MWE follows:
Output with `julia -t 6,3 script.jl`:

It would be speculative on my part to derive hard conclusions about the root cause of this issue. There is a Discourse topic that I opened where some users are proposing hypotheses - feel free to check it out.
The underlying cause of the above issue seems to extend to various packages (for example, the responsiveness of a non-blocking listen from `HTTP.jl` is highly dependent on the main thread not doing work - and that is not caused by direct `sleep` usage). It is essential to mention that `HTTP.serve!`/`HTTP.listen!` is spawned to the `:interactive` threadpool.

Libc.systemsleep MWE version
The `Libc.systemsleep` version of the above MWE
```julia
using Base.Threads, Dates

@info "started with $(nthreads()) threads"
@info "interactive threadpool: $(nthreads(:interactive))"
@info "default threadpool: $(nthreads(:default))"
println(repeat("*", 40))

function mainspin(s, c; withyield=false)
    @info "mainspin executing on thread $(threadid()) "
    r = 0.0
    counter = 0
    while counter < c
        Libc.systemsleep(s)
        counter += 1
        if withyield
            yield()
        end
    end
end

function sleeper(s, c, id)
    @info "sleeper $id executing on thread $(threadid()) "
    counter = 0
    @time "sleeper $id" while counter < c
        sleep(s)
        counter += 1
    end
end

function libcworker(s, c, id)
    @info "worker $id executing on thread $(threadid()) "
    r = 0.0
    counter = 0
    @time "worker $id" while counter < c
        Libc.systemsleep(s)
        counter += 1
        yield()
    end
end

begin
    # phase 1
    @info "phase 1: control run - no main thread work"

    # yields 10 times (total time ~10 seconds)
    task1 = @spawn libcworker(1.02, 10, 1) # 1.02 because weird @time printing overlap

    # yields 10 times (total time ~10 seconds)
    task11 = @spawn sleeper(1, 10, 1)

    wait(task1)
    wait(task11)
    # we waited for task1 and task11 to finish

    # phase 2
    println(repeat("*", 40))
    @info "phase 2: main thread work without yield"

    # yields 10 times (total time ~10 seconds)
    task2 = @spawn libcworker(1, 10, 2)

    # yields 10 times (total time ~10 seconds)
    task22 = @spawn sleeper(1, 10, 2)

    # we wait 5 seconds before starting to spin
    # the main thread
    sleep(5)

    # now we start spinning
    # while task2 and task22 are already running
    mainspin(1, 7)

    # main thread work does not impact the real worker
    # total time ~10 seconds
    wait(task2)

    # main thread work heavily impacting the sleeper
    # total time >15 seconds
    wait(task22)

    # phase 3
    println(repeat("*", 40))
    @info "phase 3: main thread work with yield"

    # yields 10 times (total time ~10 seconds)
    task3 = @spawn libcworker(1, 10, 3)

    # yields 10 times (total time ~10 seconds)
    task33 = @spawn sleeper(1, 10, 3)

    # we wait 5 seconds before starting to spin
    # the main thread
    sleep(5)

    # now we start spinning with yield
    # while task3 and task33 are running
    mainspin(1, 7, withyield=true)

    # main thread work does not impact
    # the real worker - total time ~10 seconds
    wait(task3)

    # main thread work (with yield) still impacting the sleeper
    # but we consistently get better times (total time ~13 seconds)
    wait(task33)
end
```

Output:

```
[ Info: started with 6 threads
[ Info: interactive threadpool: 3
[ Info: default threadpool: 6
****************************************
[ Info: phase 1: control run - no main thread work
[ Info: worker 1 executing on thread 4
[ Info: sleeper 1 executing on thread 5
sleeper 1: 10.021097 seconds (50 allocations: 1.406 KiB)
worker 1: 10.202331 seconds (7.16 k allocations: 527.908 KiB, 0.89% compilation time)
****************************************
[ Info: phase 2: main thread work without yield
[ Info: sleeper 2 executing on thread 8
[ Info: worker 2 executing on thread 6
[ Info: mainspin executing on thread 1
worker 2: 10.001909 seconds (12.27 k allocations: 864.121 KiB, 0.36% compilation time)
sleeper 2: 17.009272 seconds (28.83 k allocations: 1.998 MiB, 0.21% compilation time)
****************************************
[ Info: phase 3: main thread work with yield
[ Info: sleeper 3 executing on thread 5
[ Info: worker 3 executing on thread 7
[ Info: mainspin executing on thread 1
worker 3: 10.001954 seconds (2.62 k allocations: 246.360 KiB, 0.28% compilation time)
sleeper 3: 13.026204 seconds (30.55 k allocations: 2.179 MiB, 0.22% compilation time)
```