Xiangyu-Hu / SPHinXsys

SPHinXsys provides C++ APIs for engineering simulation and optimization. It aims at complex systems driven by fluid, structure, multi-body dynamics and beyond. The multi-physics library is based on a unique and unified computational framework by which strong coupling has been achieved for all involved physics.
https://www.sphinxsys.org/
Apache License 2.0

SIMD on CI runners: occurrence of "illegal" instructions #182

Closed FabienPean-Virtonomy closed 1 year ago

FabienPean-Virtonomy commented 1 year ago

PR #176 unfortunately did not fix the "Illegal instruction" errors that pop up at random on the Linux CI.

See the logs below for anyone wanting to dive into this problem:

logs_473.zip logs_458.zip logs_431.zip without ccache logs_458.zip

The following tests FAILED for logs 473:
    49 - bernoulli_beam_struct_sim (ILLEGAL)
    50 - mass_spring_damper_response_struct_sim (ILLEGAL)
    52 - test_3d_beam_pulling_pressure_load (ILLEGAL)
    55 - test_3d_heart_electromechanics (Failed)
    61 - test_3d_particle_relaxation (ILLEGAL)
    62 - test_3d_particle_relaxation_single_resolution (ILLEGAL)
    65 - test_3d_pkj_lv_electrocontraction (Failed)
    67 - test_3d_self_contact (Failed)
    68 - test_3d_shell_particle_relaxation (ILLEGAL)
    69 - test_3d_taylor_bar (Failed)

The following tests FAILED for logs 458:
    49 - bernoulli_beam_struct_sim (ILLEGAL)
    50 - mass_spring_damper_response_struct_sim (ILLEGAL)
    52 - test_3d_beam_pulling_pressure_load (ILLEGAL)
    55 - test_3d_heart_electromechanics (Failed)
    61 - test_3d_particle_relaxation (ILLEGAL)
    62 - test_3d_particle_relaxation_single_resolution (ILLEGAL)
    65 - test_3d_pkj_lv_electrocontraction (Failed)
    67 - test_3d_self_contact (Failed)
    68 - test_3d_shell_particle_relaxation (ILLEGAL)
    69 - test_3d_taylor_bar (Failed)

The following tests FAILED for logs 431:
    42 - test_scalar_functions_particle_relaxation (ILLEGAL)
    49 - bernoulli_beam_struct_sim (ILLEGAL)
    50 - mass_spring_damper_response_struct_sim (ILLEGAL)
    52 - test_3d_beam_pulling_pressure_load (ILLEGAL)
    55 - test_3d_heart_electromechanics (Failed)
    61 - test_3d_particle_relaxation (ILLEGAL)
    62 - test_3d_particle_relaxation_single_resolution (ILLEGAL)
    65 - test_3d_pkj_lv_electrocontraction (Failed)
    67 - test_3d_self_contact (Failed)
    68 - test_3d_shell_particle_relaxation (ILLEGAL)
    69 - test_3d_taylor_bar (Failed)

The following tests FAILED for logs 458:
    49 - bernoulli_beam_struct_sim (ILLEGAL)
    50 - mass_spring_damper_response_struct_sim (ILLEGAL)
    52 - test_3d_beam_pulling_pressure_load (ILLEGAL)
    56 - test_3d_heart_electromechanics_particle_relaxation (ILLEGAL)
    57 - test_3d_heart_electromechanics (ILLEGAL)
    63 - test_3d_particle_relaxation (ILLEGAL)
    64 - test_3d_particle_relaxation_single_resolution (ILLEGAL)
    67 - test_3d_pkj_lv_electrocontraction_particle_relaxation (ILLEGAL)
    68 - test_3d_pkj_lv_electrocontraction (ILLEGAL)
    70 - test_3d_self_contact_particle_relaxation (ILLEGAL)
    71 - test_3d_self_contact (ILLEGAL)
    72 - test_3d_shell_particle_relaxation (ILLEGAL)
    73 - test_3d_taylor_bar_particle_relaxation (ILLEGAL)
    74 - test_3d_taylor_bar (ILLEGAL)

There seems to be a pattern in the tests that fail, but what do they have in common?

Xiangyu-Hu commented 1 year ago

Does this happen only on Linux, or on all systems?

FabienPean-Virtonomy commented 1 year ago

On Linux only

Xiangyu-Hu commented 1 year ago

If that is the case, we could first simply deactivate ccache on Linux. I saw the same complaint in a ccache GitHub issue last year. However, it seems the developers are not interested in finding a solution yet: https://github.com/ccache/ccache/issues/824

FabienPean-Virtonomy commented 1 year ago

As mentioned in a previous discussion, this problem already occurred without ccache: https://github.com/Virtonomy/SPHinXsys/actions/runs/3592043737/jobs/6047298762. Besides, the flags are laid out in the build command, which avoids the problem you mention.

Xiangyu-Hu commented 1 year ago

OK. I remember that this never happened before the CMake revamp. Or is it an issue that appeared after Eigen was introduced?

FabienPean-Virtonomy commented 1 year ago

The Eigen introduction and the CMake revamp came one after the other. There are not enough CI runs between the two to rule out Eigen as the cause. What we can infer for now is that it is a SIMD vectorization problem, and it seems to affect the same examples.

Xiangyu-Hu commented 1 year ago

OK. These examples are 3D solid dynamics cases, which are more likely to use SIMD operations.