JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

weird interaction between `sleep` function and Threads #43952

Open serhii-havrylov opened 2 years ago

serhii-havrylov commented 2 years ago

This issue was originally mentioned here.

function ex1()
    task = @Threads.spawn begin
        st = time()
        sleep(1.0)
        return time() - st
    end

    st2 = time()
    while time() - st2 < 5 end

    println("task slept for $(fetch(task)) seconds")
end

ex1()
ex1()
ex1()
ex1()
ex1()
ex1()

This gives the following result:

task slept for 4.993403911590576 seconds
task slept for 5.000104904174805 seconds
task slept for 5.000086784362793 seconds
task slept for 5.000097990036011 seconds
task slept for 5.000089883804321 seconds
task slept for 5.000094175338745 seconds

It works as expected if the `sleep` call is replaced with a busy-wait `while` loop:

function ex2()
    task = Threads.@spawn begin
        st = time()
        while time() - st < 1
        end
        return time() - st
    end

    st2 = time()
    while time() - st2 < 5
    end

    return println("task slept for $(fetch(task)) seconds")
end

ex2()
ex2()

task slept for 1.0 seconds
task slept for 1.0 seconds

I don't really know the under-the-hood details of Julia multithreading, but I would assume that there is some weird interaction with thread 1 (the main Julia thread) inside the `sleep` function, since it was designed for async tasks. It "blocks" the spawned task (yields back to the event loop, or whatever it is called in Julia), and the task doesn't continue until the main thread yields, which doesn't happen until it calls `fetch` (exactly after 5 seconds).

julia> VERSION
v"1.7.1"

tkf commented 2 years ago

I brought it up for discussion with @jpsamaroo and @vchuravy. My understanding is that, in ex1, the task is not rescheduled when sleep(1.0) finishes because thread 1 is stuck in the non-yielding busy loop. It would be reasonable to expect the task to be migrated to thread 2, but Julia normally runs the I/O loop (including the sleep callback) only on thread 1, unless you are inside a Threads.@threads for loop. Indeed, the task in ex1 finishes much sooner inside Threads.@threads for:

julia> Threads.nthreads()
2

julia> ex1()
task slept for 4.99216890335083 seconds

julia> ex1()
task slept for 5.000109910964966 seconds

julia> Threads.@threads for _ in 1:1
           ex1()
       end
task slept for 1.002121925354004 seconds

julia> Threads.@threads for _ in 1:1
           ex1()
       end
task slept for 1.002121925354004 seconds

Of course, this is a rather ugly hack. The best option is usually to put a yield in the busy loop like this (or just not spin). But I think it's reasonable to provide a hint to prefer I/O in a given block of code. See: https://github.com/JuliaLang/julia/pull/43919#discussion_r791813657
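A minimal sketch of the yield-in-the-busy-loop variant, assuming the explanation above is right that thread 1 only processes I/O events (including the sleep timer) when it reaches a scheduler yield point. The name ex1_yielding and its two parameters are made up for illustration; they are not from the original report.

```julia
# Hypothetical variant of ex1: the busy loop calls yield() on every iteration,
# so the thread running it can enter the scheduler and process pending I/O
# events (such as the timer behind sleep).
function ex1_yielding(spin_seconds = 5.0, sleep_seconds = 1.0)
    task = Threads.@spawn begin
        st = time()
        sleep(sleep_seconds)
        return time() - st
    end

    st2 = time()
    while time() - st2 < spin_seconds
        yield()  # hand control back to the scheduler instead of spinning blindly
    end

    elapsed = fetch(task)
    println("task slept for $elapsed seconds")
    return elapsed
end

ex1_yielding()
```

The loop still burns CPU for the full spin duration; yield() only keeps it from starving the scheduler. If the loop is merely waiting for something, not spinning at all (for example, fetching the task directly) is usually the simpler choice.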

vtjnash commented 2 years ago

@tkf This is what ccall(:jl_enter_threaded_region, Cvoid, ()) solves, which you have been asking me about.

carstenbauer commented 2 years ago

> which you have been asking me about.

FWIW, if you're referring to the Slack question from earlier today, that was me asking :)

tkf commented 2 years ago

Well, I was also asking this before :)

But I figured out how the threaded region works, and that's why I tried @threads for. (Or rather, it became clear from the discussion with @jpsamaroo and @vchuravy that the issue here is the lack of a threaded region.)

I wonder if we want to have something like

function prefer_io(f)
    ccall(:jl_enter_threaded_region, Cvoid, ())
    try
        f()
    finally
        ccall(:jl_exit_threaded_region, Cvoid, ())
    end
end

to work around cases like the OP's? It's ugly since it changes how the scheduler works globally, but at least it works on arbitrary threads now.
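As a rough usage sketch, the proposed prefer_io could wrap the OP's scenario as shown here. This assumes the process was started with at least two threads (for example, julia -t 2) and that the internal runtime symbols jl_enter_threaded_region and jl_exit_threaded_region are available in the running Julia version; they are not public API and may change. The helper spin_while_task_sleeps is a made-up name that mirrors ex1 but returns the measured time instead of printing it.

```julia
# prefer_io as proposed in the comment above; both ccall targets are internal
# runtime entry points, not a supported public interface.
function prefer_io(f)
    ccall(:jl_enter_threaded_region, Cvoid, ())
    try
        return f()
    finally
        # Always leave the threaded region, even if f() throws.
        ccall(:jl_exit_threaded_region, Cvoid, ())
    end
end

# Hypothetical helper mirroring ex1: spawn a sleeping task, busy-wait without
# yielding, then report how long the task took from its own point of view.
function spin_while_task_sleeps(; spin_seconds = 5.0, sleep_seconds = 1.0)
    task = Threads.@spawn begin
        st = time()
        sleep(sleep_seconds)
        return time() - st
    end

    st2 = time()
    while time() - st2 < spin_seconds
        # deliberately non-yielding busy loop, as in the original report
    end

    return fetch(task)
end

elapsed = prefer_io() do
    spin_while_task_sleeps()
end
println("task slept for $elapsed seconds")
```

Because the threaded-region flag is global to the runtime, this changes scheduling for every task in the process while f runs, which is the trade-off noted above.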