JuliaSIMD / ThreadingUtilities.jl

Utilities for low overhead threading in Julia.
MIT License
17 stars 4 forks source link

Print a warning if the kernel is allowed to move threads about #18

Closed DilumAluthge closed 3 years ago

DilumAluthge commented 3 years ago

This PR will print a warning if the kernel is allowed to move threads about.

This is based on my reading of https://github.com/JuliaLang/julia/blob/master/doc/src/manual/environment-variables.md#julia_exclusive and https://github.com/JuliaLang/julia/blob/master/src/threading.c (in particular, starting at line 415: https://github.com/JuliaLang/julia/blob/efc986003356be5daea6ccd9e38008de961c18aa/src/threading.c#L415). If I understand correctly, Julia will allow the kernel to move threads around if and only if the JULIA_EXCLUSIVE environment variable is 0.

chriselrod commented 3 years ago

If set to anything besides 0, then Julia's thread policy is consistent with running on a dedicated machine: the master thread is on proc 0, and threads are affinitized. Otherwise, Julia lets the operating system handle thread policy.

Hmm, if set to anything besides 0, but what if it isn't set at all?

chriselrod commented 3 years ago

Testing from the command line:

> JULIA_EXCLUSIVE=1 jm --startup=no -e 'run(`taskset -p $(getpid())`)'
pid 73066's current affinity mask: 1
> jm --startup=no -e 'run(`taskset -p $(getpid())`)'
pid 73110's current affinity mask: ff
> JULIA_EXCLUSIVE=0 jm --startup=no -e 'run(`taskset -p $(getpid())`)'
pid 73159's current affinity mask: ff
DilumAluthge commented 3 years ago

If set to anything besides 0, then Julia's thread policy is consistent with running on a dedicated machine: the master thread is on proc 0, and threads are affinitized. Otherwise, Julia lets the operating system handle thread policy.

Hmm, if set to anything besides 0, but what if it isn't set at all?

This line (https://github.com/JuliaLang/julia/blob/efc986003356be5daea6ccd9e38008de961c18aa/src/options.h#L133) suggests that the default is 0, so if it is not set, presumably it behaves as if it were 0.

Some quick experiments locally seem to suggest that not being set is the same as being set to 0.

chriselrod commented 3 years ago

So if JULIA_EXCLUSIVE is not set, then it doesn't set affinity. You have to explicitly set it to something other than 0.

I don't think it should be warning people under default settings before we actually modify code to take advantage of JULIA_EXCLUSIVE.

DilumAluthge commented 3 years ago

Maybe change to a debug for now? So it's hidden by default.

What about the Threads.nthreads() < Sys.CPU_THREADS()? Keep that as a warning, or change it to a debug?

codecov-io commented 3 years ago

Codecov Report

Merging #18 (2256945) into master (92295da) will not change coverage. The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##            master       #18   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            4         5    +1     
  Lines          105       131   +26     
=========================================
+ Hits           105       131   +26     
Impacted Files Coverage Δ
src/ThreadingUtilities.jl 100.00% <100.00%> (ø)
src/warnings.jl 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 92295da...2256945. Read the comment docs.

chriselrod commented 3 years ago

Yeah, we could have that as a debug.

On the comparison with Sys.CPU_THREADS, I'm not sure. Someone could get warnings from both Octavian and ThreadingUtilities, but I think that's okay. We should add the option to silence the warning, perhaps with an environmental variable, e.g. in my Octavian PR, I have:

if nt < NUM_CORES && ("SUPPRESS_OCTAVIAN_WARNING" ∉ keys(ENV))

so that if someone is annoyed by the warning but doesn't want to set the number of threads, they can turn the warning off. Maybe they're using Distributed.

DilumAluthge commented 3 years ago

Yeah I have THREADINGUTILITIES_SUPPRESS_NTHREADS_WARNING here.

I'll change it so that it suppresses the warning if the user sets either THREADINGUTILITIES_SUPPRESS_NTHREADS_WARNING or SUPPRESS_OCTAVIAN_WARNING (or both).

DilumAluthge commented 3 years ago

Actually, what is the recommendation for Octavian number of threads? It's num cores + 1, right?

That conflicts with the rec here. I'll just take the rec out from here. It makes more sense for downstream libraries (e.g. Octavian) to make recommendations on thread numbers.

chriselrod commented 3 years ago

Actually, what is the recommendation for Octavian number of threads? It's num cores + 1, right?

The recommendation is (at least) num_cores threads. It will only use up to num_cores threads, so more is fine if other threaded code you're running likes more threads.

Note that it runs tasks on num_cores-1 threads, and then runs on the main thread too. That is the recommended way to use ThreadingUtilities, because ThreadingUtilities.__wait is a spin-loop.

DilumAluthge commented 3 years ago

I got rid of the nthreads warning. It makes more sense for the libraries that actually use threads (e.g. Octavian) to tell users how many threads they need.

The affinity warning is a good one, so I kept it. But I changed it to a debug. We can change it back to a warning in the future once we can take advantage of core affinity.

chriselrod commented 3 years ago

Okay, I said it's fine because the downstream libraries like Octavian should be able to choose to use less threads.

chriselrod commented 3 years ago

Looks good to me -- merge when all is green?

DilumAluthge commented 3 years ago

I don't feel too strongly either way. We can add it back later. Since your Octavian PR is going to have an nthreads warning, I think that's probably sufficient for now.

Honestly, I'm just jazzed that we have a way of preventing the kernel from migrating threads between different cores.

DilumAluthge commented 3 years ago

Looks good to me -- merge when all is green?

Sounds good!

DilumAluthge commented 3 years ago

Honestly, I'm just jazzed that we have a way of preventing the kernel from migrating threads between different cores.

This should open up some good opportunities down the road.

DilumAluthge commented 3 years ago

@chriselrod You can go through and cancel all the jobs except the jobs on the most recent commit.

Travis had an option to do this automatically... GHA does not :(

chriselrod commented 3 years ago

Looks like the log tests are failing.

DilumAluthge commented 3 years ago

@chriselrod Whew, looks like CI is finally green.

chriselrod commented 3 years ago

Great, thanks