BOINC / boinc

Open-source software for volunteer computing and grid computing.
https://boinc.berkeley.edu
GNU Lesser General Public License v3.0

Add AVX-512 detection #3180

Closed adamradocz closed 10 months ago

adamradocz commented 5 years ago

Describe the problem: The BOINC client currently detects the AVX and AVX2 instruction sets, but not AVX-512. Intel says about this new instruction set:

Intel® AVX-512 is a set of new instructions that can accelerate performance for workloads and usages such as scientific simulations, financial analytics, artificial intelligence (AI)/deep learning, 3D modeling and analysis, image and audio/video processing, cryptography and data compression.

The first CPU with AVX-512 instructions was released in 2015.

sirzooro commented 4 years ago

AVX512 contains multiple subsets, at this moment there are 20 subsets defined. BOINC probably should detect and report each of them independently. It could also do some grouping, as is done on this Wiki page, or detect and report only the most basic subsets: F, CD, VL, BW, DQ. https://en.wikipedia.org/wiki/AVX-512#CPUs_with_AVX-512
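As a rough illustration of the "basic subsets only" option, the sketch below decodes the F, DQ, CD, BW and VL bits from the EBX value returned by CPUID leaf 7, subleaf 0. The bit positions follow the published CPUID layout; the names `Avx512Basic`, `decode_avx512_basic` and `detect_avx512_basic`, and the reliance on the GCC/Clang `<cpuid.h>` header, are assumptions made for illustration and are not taken from BOINC.

```cpp
#include <cstdint>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>  // GCC/Clang helper header (assumed toolchain)
#endif

// Hypothetical container for the five "basic" AVX-512 subsets.
struct Avx512Basic {
    bool f, dq, cd, bw, vl;
};

// True if bit `bit` of `reg` is set.
constexpr bool bit_set(std::uint32_t reg, unsigned bit) {
    return ((reg >> bit) & 1u) != 0;
}

// Decode CPUID.(EAX=7,ECX=0):EBX into the basic subset flags.
constexpr Avx512Basic decode_avx512_basic(std::uint32_t leaf7_ebx) {
    return Avx512Basic{
        bit_set(leaf7_ebx, 16),  // AVX512F
        bit_set(leaf7_ebx, 17),  // AVX512DQ
        bit_set(leaf7_ebx, 28),  // AVX512CD
        bit_set(leaf7_ebx, 30),  // AVX512BW
        bit_set(leaf7_ebx, 31)   // AVX512VL
    };
}

// Query the CPU on x86 GCC/Clang builds; report "nothing detected" elsewhere.
inline Avx512Basic detect_avx512_basic() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
        return decode_avx512_basic(ebx);
    }
#endif
    return decode_avx512_basic(0);
}
```

Keeping the decoding separate from the hardware query means the bit layout can be checked with synthetic register values on any machine. Note that a set CPUID bit alone does not show that the OS preserves the wider registers.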

fsbruva commented 4 years ago

If I (FNG) might make an observation & suggestion: There are currently several mechanisms used within BOINC for detecting CPU capabilities across Linux, Win and macOS, and all produce slightly different output for the same processor. Is there any value in using this as an opportunity to harmonize the mechanisms and instruction sets into a single location, so that all OSes can benefit? I'd be happy to help with the effort, but only if it's yak shaving.

SETIguy commented 4 years ago

We've wrestled with this in the past with MacOS versions. The solution then was to add the linux feature strings for elements that disagreed to the MacOS feature string. ('pni' was the one I recall but there are others). So yes, I suggest we standardize on the (lower case) linux cpuinfo names for processor features. People running projects shouldn't need to deal with multiple names for the same feature when creating their plan classes. We could optionally include platform specific names as well, as we do in the case of MacOS, but I would personally prefer the alternate names be deprecated.
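One way to picture the "standardize on Linux cpuinfo names" idea is a small alias table that maps a platform-specific spelling to the lower-case Linux name and passes unknown names through unchanged. This is only a sketch: `FeatureAlias`, `kAliases`, `same_name` and `to_linux_name` are hypothetical names, and apart from `pni` (mentioned in the comment above) the table entries are assumptions about macOS `sysctl` spellings that would need checking against real output.

```cpp
#include <cstddef>

// One alias entry: a platform-specific spelling and the Linux /proc/cpuinfo name.
struct FeatureAlias {
    const char* platform_name;
    const char* linux_name;
};

// Illustrative subset only; the macOS-style spellings are assumptions.
constexpr FeatureAlias kAliases[] = {
    {"SSE3",   "pni"},
    {"SSSE3",  "ssse3"},
    {"SSE4.1", "sse4_1"},
    {"SSE4.2", "sse4_2"},
    {"AVX1.0", "avx"},
};

// Compare two NUL-terminated strings (written recursively so it stays
// usable in constant expressions even under C++11).
constexpr bool same_name(const char* a, const char* b) {
    return *a == *b && (*a == '\0' || same_name(a + 1, b + 1));
}

// Return the Linux name for `name`, or `name` itself if no alias is known.
constexpr const char* to_linux_name(const char* name, std::size_t i = 0) {
    return i == sizeof(kAliases) / sizeof(kAliases[0]) ? name
         : same_name(kAliases[i].platform_name, name)  ? kAliases[i].linux_name
         : to_linux_name(name, i + 1);
}
```

With a table like this, a plan class would only ever need to test for `pni`, whichever platform the client runs on.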


-- Eric Korpela korpela@ssl.berkeley.edu AST:7731^29u18e3

fsbruva commented 4 years ago

As a proponent of "plan the work, work the plan," how would you suggest we begin the design work? Some operational requirements that need to be decided (and can of course be amended later):

  1. Are there any flags that shouldn't be reported? (hypervisor? acpi?)
  2. Will the linux processor feature name list be appropriate? Or will runtime detection by app on target OS be problematic?
  3. What are the parameters?
  4. (derivative from previous question) Should the function perform the cpuid access/functions for all OSs?

fsbruva commented 4 years ago

AVX512 contains multiple subsets, at this moment there are 20 subsets defined. BOINC probably should detect and report each of them independently.

I agree with this. As long as it can be detected via CPUID bit and has computational importance or influences apps/work allocation, then yes.

sirzooro commented 4 years ago

AVX512 contains multiple subsets, at this moment there are 20 subsets defined. BOINC probably should detect and report each of them independently.

I agree with this. As long as it can be detected via CPUID bit and has computational importance or influences apps/work allocation, then yes.

Yes, every subset has its own bit in CPUID: https://en.wikipedia.org/wiki/CPUID

The list of supported subsets depends on the CPU model, so all of them should be reported. The page linked earlier has a table that lists this: https://en.wikipedia.org/wiki/AVX-512#CPUs_with_AVX-512

fsbruva commented 4 years ago

Yup, got it. New questions:

  1. Does it make sense to deprecate the use of /proc/cpuinfo on Linux? As the new function would be cross platform, we could just cut to the chase and do direct machine code CPUID calls, and avoid all OS dependence.
  2. How stringent shall we be on efficiency? I.e., if we don't care about CPU manufacturer, we can easily loop over all the possible feature flags. We trade efficiency on client start-up for increased code simplicity/maintenance and completeness of features reported.
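
The "loop over all the possible feature flags" approach from question 2 could look roughly like the table-driven sketch here. Every entry names a CPUID leaf, subleaf, register and bit, and the loop simply re-queries CPUID per entry, trading start-up efficiency for simplicity. The names `CpuidRegs`, `FeatureBit`, `kFeatures`, `feature_present` and `build_feature_string` are hypothetical, the table is a small illustrative subset, and the CPUID access itself is passed in as a callable so the sketch stays OS-independent.

```cpp
#include <cstdint>
#include <string>

// Registers returned by one CPUID query.
struct CpuidRegs {
    std::uint32_t eax, ebx, ecx, edx;
};

enum class Reg { Eax, Ebx, Ecx, Edx };

// One reportable feature: where its bit lives and its Linux cpuinfo name.
struct FeatureBit {
    std::uint32_t leaf;
    std::uint32_t subleaf;
    Reg reg;
    unsigned bit;
    const char* name;
};

// Illustrative subset; a full table would carry many more rows.
constexpr FeatureBit kFeatures[] = {
    {1, 0, Reg::Ecx, 0,  "pni"},       // SSE3
    {1, 0, Reg::Ecx, 28, "avx"},
    {7, 0, Reg::Ebx, 5,  "avx2"},
    {7, 0, Reg::Ebx, 16, "avx512f"},
    {7, 0, Reg::Ebx, 17, "avx512dq"},
    {7, 0, Reg::Ebx, 28, "avx512cd"},
    {7, 0, Reg::Ebx, 30, "avx512bw"},
    {7, 0, Reg::Ebx, 31, "avx512vl"},
};

// Select one register from a CPUID result.
constexpr std::uint32_t pick(const CpuidRegs& r, Reg which) {
    return which == Reg::Eax ? r.eax
         : which == Reg::Ebx ? r.ebx
         : which == Reg::Ecx ? r.ecx
         : r.edx;
}

// True if the feature's bit is set in the given CPUID result.
constexpr bool feature_present(const FeatureBit& f, const CpuidRegs& r) {
    return ((pick(r, f.reg) >> f.bit) & 1u) != 0;
}

// Build a space-separated feature string. `query(leaf, subleaf)` must
// return a CpuidRegs; it is called once per table row for simplicity.
template <typename Query>
std::string build_feature_string(Query query) {
    std::string out;
    for (const FeatureBit& f : kFeatures) {
        if (feature_present(f, query(f.leaf, f.subleaf))) {
            if (!out.empty()) out += ' ';
            out += f.name;
        }
    }
    return out;
}
```

Adding a newly defined AVX-512 subset would then be a one-line change to the table rather than new detection code.
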
CharlieFenton commented 4 years ago

I'm not familiar with the CPUID instruction, but a quick Google search implies that the only way to invoke it on Mac OS X is via assembly code. Here is a discussion about that.

sirzooro commented 4 years ago

/proc/cpuinfo was proposed as a source of feature names, which should be used for all OSes running on x86/x86_64. BOINC itself should use the CPUID instruction directly; that is easier than parsing the /proc/cpuinfo file.

I have recalled that CPUID is not enough, as SSE/AVX/AVX512 vectors also need support on the OS side. This can be checked using the XGETBV instruction. BOINC should check whether the OS supports AVX and AVX512 vectors. I am not sure if it should also do this for SSE: I have released a few versions of optimized apps for TN-Grid, RakeSearch and AcousticsAtHome, and no one has reported any issue related to this. On the other hand, I recall a case where someone was trying to run an AVX app on an OS which did not have support for it.

On Linux you can use a header which contains handy wrappers for CPUID. However, you still need to use assembly code for XGETBV. The same goes for MinGW gcc on Windows. An example of how to do this is here.

I recall that MSVC also has some support for CPUID and XGETBV, but I have never used it. I do not know about MacOS; I have never used it either.
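
The OS-side check described above can be sketched as follows, assuming GCC or Clang on x86 (including MinGW). XGETBV with ECX=0 returns XCR0, whose bits say which register state the OS saves and restores; it may only be executed when CPUID.1:ECX bit 27 (OSXSAVE) is set. The function and constant names are hypothetical, and `<cpuid.h>` is assumed to be the wrapper header meant in the comment, which the source does not name.

```cpp
#include <cstdint>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>  // GCC/Clang CPUID wrappers (assumed header)
#endif

// XCR0 state-component bits.
constexpr std::uint64_t kXcr0Sse      = 1u << 1;  // XMM registers
constexpr std::uint64_t kXcr0Avx      = 1u << 2;  // upper halves of YMM0-15
constexpr std::uint64_t kXcr0Opmask   = 1u << 5;  // k0-k7 mask registers
constexpr std::uint64_t kXcr0ZmmHi256 = 1u << 6;  // upper halves of ZMM0-15
constexpr std::uint64_t kXcr0Hi16Zmm  = 1u << 7;  // ZMM16-31

constexpr std::uint64_t kAvxMask = kXcr0Sse | kXcr0Avx;
constexpr std::uint64_t kAvx512Mask = kXcr0Opmask | kXcr0ZmmHi256 | kXcr0Hi16Zmm;

// The OS preserves YMM state, so AVX code is safe to run.
constexpr bool os_saves_avx(std::uint64_t xcr0) {
    return (xcr0 & kAvxMask) == kAvxMask;
}

// The OS preserves opmask and ZMM state in addition to YMM state.
constexpr bool os_saves_avx512(std::uint64_t xcr0) {
    return os_saves_avx(xcr0) && (xcr0 & kAvx512Mask) == kAvx512Mask;
}

// Read XCR0, or return 0 if XGETBV is unavailable on this build or CPU.
inline std::uint64_t read_xcr0() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    // OSXSAVE (CPUID.1:ECX bit 27) must be set before XGETBV is legal.
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx) || ((ecx >> 27) & 1u) == 0) {
        return 0;
    }
    std::uint32_t lo = 0, hi = 0;
    __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
#else
    return 0;
#endif
}
```

A client would report an AVX-512 subset only if both its CPUID bit is set and `os_saves_avx512(read_xcr0())` is true. Returning 0 on unsupported builds makes both checks fail safely.
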

fsbruva commented 4 years ago

Good point about the CPUID dependencies. What you're discussing is modeled fairly well in hostinfo_win.cpp

RE: CPUID access, here's the current implementation, by platform:

New questions:

  1. Should this functionality become part of the library ("general utility" from the coding standards)? Or a separate file/header with all the various preprocessor stuff that all platforms use?