Closed Xiangyu-Hu closed 1 year ago
What is the strange illegal instructions which failed the ci? I have run the ctest on local machine and nothing happen?
The -j parameter is not needed for ninja build system. FYI, see tools documentation:
From CMake
If
is omitted the native build tool's default number is used. Some native build tools always build in parallel. The use of
value of 1 can be used to limit to a single job.
From Ninja
Builds are always run in parallel, based by default on the number of CPUs your system has.
Please do not stuff these changes on top of an already overloaded PR. Functionally independent changes should be having separate PR.
What is the strange illegal instructions which failed the ci? I have run the ctest on local machine and nothing happen?
These might be the result of some wrong call to SIMD instructions on a hardware that does not actually support it. It appeared since the shift to Eigen and explicit use of SIMD in rare unpredictable occurrences. Since on CI, the machine changes at each job, it might be related to some coupling of -march=native
with unsupported hardware. It requires a deeper investigation of the logs to find the root cause since it is not easily reproducible.
The -j parameter is not needed for ninja build system. FYI, see tools documentation:
From CMake
If is omitted the native build tool's default number is used.
Some native build tools always build in parallel. The use of value of 1 can be used to limit to a single job.
From Ninja
Builds are always run in parallel, based by default on the number of CPUs your system has.
Please do not stuff these changes on top of an already overloaded PR. Functionally independent changes should be having separate PR.
The -j parameter is not needed for ninja build system. FYI, see tools documentation:
From CMake
If is omitted the native build tool's default number is used.
Some native build tools always build in parallel. The use of value of 1 can be used to limit to a single job.
From Ninja
Builds are always run in parallel, based by default on the number of CPUs your system has.
Please do not stuff these changes on top of an already overloaded PR. Functionally independent changes should be having separate PR.
It seems that j parameter is useful, as you see that new building time is much short than before.
It is shorter because the CI uses ccache to cache compilation artefacts to accelerate subsequent builds. See this run https://github.com/Xiangyu-Hu/SPHinXsys/actions/runs/3604100610/jobs/6076217673 where the build step took 3min, without the -j
It is quite annoying that the illegal instruction frequently shows up. It is possible that the ccache supposes nothing happen but actually the architecture has been changed? Something like this? https://stackoverflow.com/questions/36505275/how-can-i-use-ccache-with-gcc-march-native-across-multiple-architectures
It could be a component of the problem, however, this "Illegal" problem occurred even before the ccache github actions was introduced, see there: https://github.com/Virtonomy/SPHinXsys/actions/runs/3592043737/jobs/6047298762 That's why I think it comes from a -arch=native issue with underlying hardware. The solution of expanding simd flags like in the stackoverflow answer might works.
It could be a component of the problem, however, this "Illegal" problem occurred even before the ccache github actions was introduced, see there: https://github.com/Virtonomy/SPHinXsys/actions/runs/3592043737/jobs/6047298762 That's why I think it comes from a -arch=native issue with underlying hardware. The solution of expanding simd flags like in the stackoverflow answer might works.
Thanks. I have had a look at the solution in stackoverflow but do not know hoe to proceed.
In this new pull request, the modifications are: