Closed mkstoyanov closed 1 year ago
@G-Ragghianti let's see if this fixes the GPU memory problem.
If this fixes it, I don't want to merge this. Instead I want to fix the CTEST_PARALLEL_LEVEL
variable in the container.
It looks like it passed. Can we do it by calling cmake with "-DCTEST_PARALLEL_LEVEL=1"? I can add this to the CI script, but I think it would maybe make sense to implement it in the CMakeLists.txt just like you did. Otherwise, other people will have the same problem and will have to discover on their own to set this. It doesn't seem like anyone would have a problem with the ctest being serial by default. Thoughts?
CTEST_PARALLEL_LEVEL
is an environmental variable, i.e., it has to be set with export CTEST_PARALLEL_LEVEL=1
or unset CTEST_PARALLEL_LEVEL
in bash, CMake will have no effect.
All tests default to parallel and even in the case of heFFTe, it would probably be OK if we run together tests with 2 and 4 ranks, the problem will happen when we get 12 ranks together. The issue is also restricted to the GPU usage, the CPU tests don't have that problem.
I really want to know whether the variable is being set by you or by spack, the latter is a big issue and I would have to complain somewhere.
I can add a variable Heffte_SEQUENTIAL_TESTING
and have the workaround that way.
Nothing of mine is setting CTEST_PARALLEL_LEVEL and there is only two references to it in spack which don't appear to have an effect of setting the value (see spack/lib/spack/spack/build_systems/cmake.py). Since spack is running "make test" to run the tests (instead of ctest directly), maybe it is just caused by the inherent parallelism of make?
I have tested make test -j
and that doesn't result in parallel tests. You have to do either make test ARGS=-j4
or ctest -j4
or set the environment variable. Spack is purposely forcing the variable instead of respecting the general system setup.
This is bad behavior, but not the first for spack
So I added the option Heffte_SEQUENTIAL_TESTING
but I want to have it off by default, so the test will fail now.
We should simply add -DHeffte_SEQUENTIAL_TESTING=ON
to spack.
How do you want to add this to the spack package: