Closed Vika-F closed 3 weeks ago
Thanks for writing up this doc, it's very helpful.
A couple question after a quick look:
cpp/daal
, but isn't that meant to be deprecated in favor of cpp/oneapi/dal
?DAAL_INTEL_CPP_COMPILER
? Aren't those also supported by GCC and CLANG when compiled with -march=avx512
?@david-cortes-intel
Thanks for the prompt review!
- It suggests to put files under
cpp/daal
, but isn't that meant to be deprecated in favor ofcpp/oneapi/dal
?
No, daal only deprecated as the API. but all the computational kernels for CPUs, otherwise, are implemented in cpp/daal
.
And cpp/oneapi/dal
only provides the new APIs and doesn't contain actual implementations for CPUs.
Someday the chapter about high-level oneDAL folders structure and what is located where and how all the parts connect will also be added I hope.
- Why are AVX512 intrinsics conditional on
DAAL_INTEL_CPP_COMPILER
? Aren't those also supported by GCC and CLANG when compiled with-march=avx512
?
It is probably because most of the CPU-specific functionality like intrinsics are compiler-specific.
@keeranroth and @rakshithgb-fujitsu, can you please take a look at this chapter? It considers only x86 for now, but the similar code structure might be implemented for RISC-V and ARM to support various instruction set architectures.
more of a question rather than a suggestion, the service defines for compiler macros defined here - https://github.com/oneapi-src/oneDAL/blob/main/cpp/daal/src/services/service_defines.h specifically regarding the ones that are mentioned for GNU and others, they don't really translate to any compiler hints. Does this mean that only icx
compiler can leverage those hints in its current state?
Going forward since multiple architectures are supported, the compiler hints might be architecture specific, how would this be handled?
@rakshithgb-fujitsu Thanks for the prompt response!
Yes, the sections for GNU and VS compilers do not have definitions for SIMD-related pragmas. That's why there is no such pragmas or analogues in GNU and VS compilers. Yes, for now only intel compilers will use those guidances.
But we are trying to guide other compilers as well where possible. You can see DAAL_PREFETCH
and DAAL_FORCEINLINE
definitions later in that file, for example.
Regarding the instruction set architecture (ISA) specific definitions, there is no problems with defining those. As all the ISA-specific definitions must be put under the respective defines. For example:
#if (__CPUID__(DAAL_CPU) == __avx512__)
// AVX-512 specific code goes here
#endif
So, all the ISA-specific definitions would also go under the respective defines.
I've tried to describe that in the chapter, but it seems I need to improve that part to make it more clear.
Some like "ivdep" and "novector" do have equivalents in other compilers nowadays though - for example, there's #pragma GCC ivdep
which is also recognized by clang; and #pragma loop(ivdep)
for MSVC.
@david-cortes-intel
Some like "ivdep" and "novector" do have equivalents in other compilers nowadays though - for example, there's
#pragma GCC ivdep
which is also recognized by clang; and#pragma loop(ivdep)
for MSVC.
Good catch. It would be good to improve the definitions from GCC and MSVC in this case. I've created a task for this.
@keeranroth , @david-cortes-intel I think I've addressed all the comments. Can you please take a look one more time?
@keeranroth , @david-cortes-intel I think I've addressed all the comments. Can you please take a look one more time?
Please remember about this point: https://github.com/oneapi-src/oneDAL/pull/2945#discussion_r1812249835
@david-cortes-intel
Please remember about this point: #2945 (comment)
Sorry, I've forgot about that. Thanks for pointing it to me. I've added a note about that: https://github.com/oneapi-src/oneDAL/pull/2945/files#diff-d3dd36089bea7ea0a85941dd0e1d91c4456b5e23bcc29fd23bdce9afbd130ed1R144
Please take a look.
This PR adds the chapter that describes how CPU features dispatching is implemented in oneDAL into the documentation.
Checklist to comply with before moving PR from draft:
PR completeness and readability