Closed mkstoyanov closed 1 year ago
@G-Ragghianti Do we have an actual problem, or has spack simply not been updated yet?
Since the spack PR was accepted, the CUDA memory problem should be gone.
The fix that you proposed for the cuda backend problem has not been merged yet into spack develop. I'm still working to verify that it works to resolve our issue. However, that particular check can be ignored for the purposes of this PR. Do you want to include a ONEAPI on gpu_intel check along with this PR?
Yes, I need to check Intel GPU for this PR. It may be good to do it a few times since the user reported a race condition that doesn't appear every time. I can manually tell the test to run again, but it has to run on the actual hardware.
I'm not bothered by the CUDA+spack problem; just drop me a line when you merge the spack fix so I know and don't ignore a legit problem.
PS: My CUDA+spack fix has been merged. The pending PR is for your MPI_HOME fix (I don't understand why they are taking so long).
To add the ONEAPI GPU check back in, you'd modify the "heffte_gpu" section of the main.yaml file to be the following:
heffte_gpu:
  strategy:
    matrix:
      maker: [cmake, spack]
      device: [gpu_nvidia, gpu_intel]
      exclude: # spack package doesn't support gpu_intel
        - maker: spack
          device: gpu_intel
    fail-fast: false
  runs-on: ${{matrix.device}}
  timeout-minutes: 30
  steps:
    - uses: actions/checkout@v3
    - name: Build
      run: .github/workflows/build-${{matrix.maker}}.sh build ${{matrix.device}}
    - name: Test
      run: .github/workflows/build-${{matrix.maker}}.sh test ${{matrix.device}}
    - name: SmokeTest
      run: .github/workflows/build-${{matrix.maker}}.sh smoketest ${{matrix.device}}
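For reference, here is a small shell sketch of which maker/device jobs that matrix produces once the exclude entry is applied. The `expand_matrix` function is a hypothetical illustration only; it is not part of the workflow or the build scripts.

```shell
#!/bin/sh
# Hypothetical illustration: enumerate the maker/device pairs of the
# heffte_gpu matrix, skipping the pair listed under "exclude".
expand_matrix() {
    for maker in cmake spack; do
        for device in gpu_nvidia gpu_intel; do
            # Mirrors the exclude entry: spack package doesn't support gpu_intel
            if [ "$maker" = "spack" ] && [ "$device" = "gpu_intel" ]; then
                continue
            fi
            echo "$maker $device"
        done
    done
}

expand_matrix
```

So the CI would run three jobs, and gpu_intel is covered by the cmake build only.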
Just added it and I'm getting the same conflict that we had before: the OneAPI version coming with spack is not compatible with gcc-9.5.
How do we resolve that? Can we just run it in the oneapi container? That would be the proper way, instead of relying on the spack installation.
@G-Ragghianti we can close this one for now, but I will still need access to Intel GPU, either with a manual account or as part of the CI. Both work for me.
Currently, the CI is loading whichever intel-oneapi-* module is installed on the container. You could in theory modify the section of build-cmake.sh to use the system-installed oneapi tools, but I would prefer to modify build-cmake.sh so that it will load a specific version that you need. Right now, it is using intel-oneapi-compilers@2023.0.0. What version do you need?
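A minimal sketch of what pinning the version could look like, assuming build-cmake.sh is able to call `spack load` on the runner. The `oneapi_spec` helper and the `ONEAPI_VERSION` variable are hypothetical names used for illustration; the actual script may load the compiler differently.

```shell
#!/bin/sh
# Hypothetical helper: build the spack spec for the oneAPI compilers.
# ONEAPI_VERSION is an assumed variable name; the default 2023.0.0 is
# the version the CI is currently using.
oneapi_spec() {
    echo "intel-oneapi-compilers@${ONEAPI_VERSION:-2023.0.0}"
}

# Load the pinned compiler only when spack is available on this machine.
if command -v spack >/dev/null 2>&1; then
    spack load "$(oneapi_spec)" || echo "could not load $(oneapi_spec)" >&2
else
    echo "spack not found; would load $(oneapi_spec)"
fi
```

Setting `ONEAPI_VERSION` before the script runs would then switch the compiler without editing the script again.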
The intel-oneapi-compiler is the correct version; we have to use the most recent one, as things keep changing a lot.
However, the compiler is complaining about something in the STL library provided by gcc-9.5. The conflict has nothing to do with heFFTe; it's in the system files. My guess is that oneAPI 2023 requires a newer version of gcc.
I don't know how the cpu-only oneAPI version works. We had the error at one point, then I think it went away. I don't know what changed in the settings, but we can compare to that one to see.
We can load a different version of gcc (which?). I'm surprised that intel-oneapi-* would have a problem with @.*** since that is the version of gcc that the intel-oneapi package was installed with (spack should have noted the incompatibility).
gcc-11 should work fine.
Compatibility with STL is an issue across platforms and few people care about it.