We currently build only the scalar and AVX variants on Travis CI because otherwise we run into the 50-minute limit there. AVX2 and KNL need much more time because there is more kernel code for them (half-prec, more SoA length combinations).
Without the twisted boundary conditions we were able to compile for KNL; now there is around 16 times as much kernel code. Though we want to have all of this in production, it will probably not hurt us much to compile just one particular variant of the twisted boundary conditions to reduce compile time. We cannot run the KNL version on Travis CI anyway.
My proposal is to add an option to the main CMake file that reduces the amount of code for testing. It would be undocumented, and enabling it would emit a warning. Then, in the step where the kernels are generated, the Python/Jinja part would replace 15 of the 16 kernels with nothing, removing that code from compilation. Since all the kernels are `void` functions, this should not change anything except for “unused variable” warnings. The code would compile but of course give wrong results.
However, we could still have more testing done, which would catch issues where single-prec and double-prec work but something for half-prec got messed up.
I estimate that this will not take much time to implement (optimistically half an hour), and the benefits would be well worth it. If nobody objects to the small added complexity in the build process, I will add it tomorrow.
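To make the idea concrete, here is a rough sketch of how the generation step could stub out kernels. It is only an illustration: it uses `string.Template` from the standard library instead of Jinja so that it runs on its own, and the names `render_kernels`, `testing_only`, `keep` and `dslash_tbc_*`, the kernel signature, and the numbering of the 16 variants are all made up for this example and not taken from the actual code base.

```python
from string import Template

# Hypothetical stand-in for the real Jinja kernel template.
KERNEL_TEMPLATE = Template(
    "void ${name}(double *out, const double *in)\n"
    "{\n"
    "${body}"
    "}\n"
)


def render_kernels(variants, full_body, testing_only=False, keep=0):
    """Render one void kernel per variant.

    With testing_only=True, only the variant at index `keep` gets its real
    body. All other kernels are emitted with an empty body, so they still
    compile and link but do nothing.
    """
    sources = {}
    for index, variant in enumerate(variants):
        stub = testing_only and index != keep
        body = "" if stub else full_body
        sources[variant] = KERNEL_TEMPLATE.substitute(
            name=f"dslash_tbc_{variant}", body=body
        )
    return sources


# Assumption: the 16 twisted-boundary variants are simply numbered 00..15.
variants = [f"{i:02d}" for i in range(16)]
sources = render_kernels(variants, "    out[0] = in[0];\n", testing_only=True)

# Count how many kernels ended up without a real body.
stubbed = [v for v, src in sources.items() if "out[0]" not in src]
print(len(stubbed))
```

In the real build, the `testing_only` flag would be set from the new CMake option, and the default would stay `False` so that production builds still contain every kernel.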