flame / blis

BLAS-like Library Instantiation Software Framework
Other
2.3k stars 367 forks source link

OMP_NUM_THREADS discouraged? Why? #825

Open zerothi opened 2 months ago

zerothi commented 2 months ago

I am reading your wiki pages with comments on environment variables. Then I stumble upon:

https://github.com/flame/blis/blob/master/docs/Multithreading.md#environment-variables-the-automatic-way

Quote:

Note: We highly discourage use of the OMP_NUM_THREADS environment variable to specify multithreading within BLIS and may remove support for it in the future. If you wish to set parallelism globally via environment variables, please use BLIS_NUM_THREADS.

Now, while I can see that having dedicated environment variables for the project, I think this goes largely against the end-user base.

At least from my perspective I do this:

  1. The HPC admins compile and test the software and selects the appropriate libraries for performance.
  2. The software compiled is made available for end-users, and hopefully one can pass generic information that lets them test and benchmark their code/use for their needs.

However, here comes the problem. When BLIS is compiled with OpenMP support, I would very much like to be able to tell our users about OMP_NUM_THREADS, as a first entry point to play with threading/parallelism. This is the de-facto standard for end-users, and also the other related OMP flags, OMP_PLACES, OMP_PROC_BIND.

I would like the hierarchy to be simple:

BLIS_NUM_THREADS=X OMP_NUM_THREADS=Y  # use X
OMP_NUM_THREADS=Y  # use Y
BLIS_NUM_THREADS=X  # use X

In this way, users won't be troubled by odd naming conventions that means nothing to them.

PS. I know that fine-tuning the other BLIS specific env's is necessary, but most users won't bother with these details.

fgvanzee commented 2 months ago

@zerothi Thanks for your feedback.

The following scenarios you provided

BLIS_NUM_THREADS=X OMP_NUM_THREADS=Y  # use X
OMP_NUM_THREADS=Y  # use Y
BLIS_NUM_THREADS=X  # use X

do indeed reflect how BLIS response to those environment variables. First, the library checks BLIS_NUM_THREADS; if it is set, that value is used. If it is not set, the library checks for OMP_NUM_THREADS; if it set, that value is used. If it is not set, then the number of total threads (not the concept, but the single user-facing number) is left unspecified.

Now, while I can see that having dedicated environment variables for the project, I think this goes largely against the end-user base.

I understand and appreciate your perspective. You may have already thought of this, but let me explain why I personally prefer that users use a BLIS-specific environment variable: it forces the user to consciously know what they are getting. What I don't like about reading OMP_NUM_THREADS is that we can't be sure of the user's intent. Did they want it to specify BLIS parallelism only? Or application parallelism only? What about application parallelism that calls BLIS? Should BLIS still multithread in that situation, even if there is risk of over-subscription? Using BLIS_NUM_THREADS eliminates most/all of this ambiguity.

Let me know if you have any other questions.

devinamatthews commented 2 months ago

I guess I should add that none of the devs currently have any plans to deprecate OMP_NUM_THREADS. We could potentially commit to keeping it as an option if uncertainty about future versions is off-putting.

fgvanzee commented 2 months ago

I guess I should add that none of the devs currently have any plans to deprecate OMP_NUM_THREADS. We could potentially commit to keeping it as an option if uncertainty about future versions is off-putting.

Sure. I'm fine with removing the language about potential deprecation if that's what is bothering @zerothi.

zerothi commented 2 months ago

@zerothi Thanks for your feedback.

The following scenarios you provided

BLIS_NUM_THREADS=X OMP_NUM_THREADS=Y  # use X
OMP_NUM_THREADS=Y  # use Y
BLIS_NUM_THREADS=X  # use X

do indeed reflect how BLIS response to those environment variables. First, the library checks BLIS_NUM_THREADS; if it is set, that value is used. If it is not set, the library checks for OMP_NUM_THREADS; if it set, that value is used. If it is not set, then the number of total threads (not the concept, but the single user-facing number) is left unspecified.

Indeed.

Now, while I can see that having dedicated environment variables for the project, I think this goes largely against the end-user base.

I understand and appreciate your perspective. You may have already thought of this, but let me explain why I personally prefer that users use a BLIS-specific environment variable: it forces the user to consciously know what they are getting. What I don't like about reading OMP_NUM_THREADS is that we can't be sure of the user's intent. Did they want it to specify BLIS parallelism only? Or application parallelism only? What about application parallelism that calls BLIS? Should BLIS still multithread in that situation, even if there is risk of over-subscription? Using BLIS_NUM_THREADS eliminates most/all of this ambiguity.

I am not advocating removing the BLIS env vars which can be very useful in the cases you mention. So that should definitely still be there!

I guess I should add that none of the devs currently have any plans to deprecate OMP_NUM_THREADS. We could potentially commit to keeping it as an option if uncertainty about future versions is off-putting. Great, this was my main concern.

I guess I should add that none of the devs currently have any plans to deprecate OMP_NUM_THREADS. We could potentially commit to keeping it as an option if uncertainty about future versions is off-putting.

Sure. I'm fine with removing the language about potential deprecation if that's what is bothering @zerothi.

This was actually my goal here ;)