Open valassi opened 3 years ago
Hi @oliviermattelaer as we briefly discussed today:
ok for the moment to stay with the present approach, define one "AVX" mode at build time, so that the right options can be switched on in the Makefiles
interesting eventually to use this "fat binary" approach, where the binaries support more than one AVX mode all linked together
This is also related to Makefile cleanup #362
One important point, relevant to the Bridge:
So in practice a multiSIMD C++ library is perfectly compatible with the Fortran MadEvent integration through the Bridge
This is a spinoff of vectorisation issue #71 and a followup to the big PR #171.
(The first part of this description also serves as documentation of what is available there now!).
The current vectorisation infrastructure supports five SIMD modes, which correspond to different -march options:
Note that the above flags are GLOBAL. They are applied to all files in src and in the PSigna directory.
In the code, #ifdef's for `__SSE4_2__`, `__AVX2__`, `__AVX512VL__` and `MGONGPU_PVW512` determine how the code is built (i.e. they determine the neppV parameter, see issue #176). Note in particular that
Note also that the code already has a basic check to fail gracefully if the desired AVX mode is not supported by the present hardware. (This was added after a few tests crashed on the GitHub CI, as the CI seems to have some AVX512 nodes but also some nodes that do not support it).
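To make the two points above concrete, here is a minimal sketch of how compiler macros can fix the build mode and vector width at compile time, and how a runtime check can refuse to run on unsupported hardware. This is not the actual code of the repository: the names `SimdBuild`, `compiledSimdBuild`, `hostSupportsCompiledBuild` and `checkSimdOrReport`, the mode labels and the neppV values (doubles per vector) are all assumptions for illustration. `__builtin_cpu_supports` is the GCC/clang builtin for x86.

```cpp
#include <cassert>
#include <iostream>

// Hypothetical description of the SIMD mode this file was compiled for.
struct SimdBuild
{
  const char* name; // illustrative label, not the project's naming
  int neppV;        // assumed number of doubles per SIMD vector
};

// Compile-time selection: the -march flags define the macros tested here.
constexpr SimdBuild compiledSimdBuild()
{
#if defined __AVX512VL__ && defined MGONGPU_PVW512
  return { "avx512, 512-bit vectors", 8 };
#elif defined __AVX512VL__
  return { "avx512, 256-bit vectors", 4 };
#elif defined __AVX2__
  return { "avx2", 4 };
#elif defined __SSE4_2__
  return { "sse4.2", 2 };
#else
  return { "none", 1 };
#endif
}

// Runtime check: can the host CPU execute what this build may contain?
inline bool hostSupportsCompiledBuild()
{
#if defined __AVX512VL__
  return __builtin_cpu_supports( "avx512vl" );
#elif defined __AVX2__
  return __builtin_cpu_supports( "avx2" );
#elif defined __SSE4_2__
  return __builtin_cpu_supports( "sse4.2" );
#else
  return true; // no SIMD extension required
#endif
}

// Report the problem and return false, rather than crashing later
// with an illegal instruction.
inline bool checkSimdOrReport()
{
  constexpr SimdBuild build = compiledSimdBuild();
  assert( build.neppV >= 1 );
  if( !hostSupportsCompiledBuild() )
  {
    std::cerr << "ERROR: build mode '" << build.name
              << "' is not supported on this host" << std::endl;
    return false;
  }
  return true;
}
```

An executable would call `checkSimdOrReport()` at the very start of `main` and exit with a non-zero code if it returns false.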
Presently, the build infrastructure for the vectorized builds is controlled by two optional external parameters, AVX and USEBUILDDIR, and it works as follows
(The second part of this description below is a possible proposed change)
An alternative to the model above is to build a single (larger) executable supporting multi-simd mode:
In practice, however, one would have to choose how much of the implementation must be duplicated
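As a rough illustration of the multi-mode idea, the sketch below duplicates only one hot loop per SIMD mode and picks a variant once at runtime. It is an assumption-laden toy, not a proposal for the actual code: the kernel `scaleBaseline`/`scaleAvx2`, the dispatcher `selectScale` and the wrapper `scaleAll` are hypothetical names, and it relies on the GCC/clang `target` function attribute instead of per-file -march flags.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Per-function targets are only used on x86-64 with GCC or clang.
#if defined( __x86_64__ ) && defined( __GNUC__ )
#define HAVE_MULTI_SIMD 1
#else
#define HAVE_MULTI_SIMD 0
#endif

// Baseline variant: built with the default flags of the translation unit.
static void scaleBaseline( const double* in, double* out, std::size_t n, double f )
{
  for( std::size_t i = 0; i < n; ++i ) out[i] = f * in[i];
}

#if HAVE_MULTI_SIMD
// Same source loop, but the compiler may use AVX2 instructions here.
__attribute__( ( target( "avx2" ) ) )
static void scaleAvx2( const double* in, double* out, std::size_t n, double f )
{
  for( std::size_t i = 0; i < n; ++i ) out[i] = f * in[i];
}
#endif

using ScaleFn = void ( * )( const double*, double*, std::size_t, double );

// Choose the best variant that the host CPU supports.
ScaleFn selectScale()
{
#if HAVE_MULTI_SIMD
  if( __builtin_cpu_supports( "avx2" ) ) return scaleAvx2;
#endif
  return scaleBaseline;
}

// Everything outside the kernel exists only once in the binary.
std::vector<double> scaleAll( const std::vector<double>& in, double f )
{
  static const ScaleFn fn = selectScale(); // resolved on first call
  assert( fn != nullptr );
  std::vector<double> out( in.size() );
  fn( in.data(), out.data(), in.size(), f );
  return out;
}
```

The design question raised above shows up directly: the more code sits inside the per-mode functions, the larger the binary, while code left outside them is never vectorised beyond the baseline flags.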
It should be noted however that these large multi-mode binaries are not typically what the LHC experiments do in their builds
One advantage of this multi-mode build could be for benchmarking studies (see issue #157).
Very low priority. This will probably not be implemented at all. I file this in any case so I do not forget (and especially because I added the documentation of how this works now).