Closed tkoskela closed 2 years ago
I'm trying to test the hipSYCL-compiled version on the Cambridge Icelake nodes, but I'm getting a Segmentation fault (core dumped) error. Steps to reproduce:
I cloned the repo, ran git switch tk/hipsycl, and requested an Icelake node with:
srun -t 00:30:00 -A DIRAC-DR004-CPU --nodes=1 --exclusive -p icelake --pty bash
Then:
cd openqcd-oneapi/tests/cuda2/dpct_output
make -f Makefile.csd3 hip_omp_cpu
module purge
module load rhel8/default-amp
module load hipsycl/0.9.2/gcc-9.4.0-jg2gfgh
module load gcc/9.4.0
./main.hip_omp_cpu 16 16 16 16 ../../../data/
and I get:
List of detected devices:
hipSYCL OpenMP host device
Selected device: hipSYCL OpenMP host device
Time for AoS to SoA for pauli m +H2D (GPU) (ms): 40.71
Time for AoS to SoA for su3 u +H2D (GPU) (ms): 19.78
Time for AoS to SoA for spinor s +H2D (GPU) (ms): 2.87
Time for cudaMemcpy H2D of lookup tables (ms): 0.53
Time for kernel mul_pauli (ms): 12.15
Segmentation fault (core dumped)
Also, there is a minor error in Makefile.csd3 on lines 37 and 44: you might want to replace module load load with module load.
Thank you for the bug report! I had been testing it with the 64 64 64 64 data set, and with that it does not segfault. Rather curiously, it also seems to run fine on GitHub Actions.
Ahh, git lfs does not pull the files by default when you do a git clone. The files in ../../../data are just pointers that point nowhere, but the code only checks that the files exist and then tries to read them as binary files. So on csd3 you need to run
module load git-lfs-2.3.0-gcc-5.4.0-cbo6khp
git lfs pull
to get the actual input files, and that will fix the segfault.
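A minimal sketch of a pre-flight check for this situation, assuming the unpulled files are standard Git LFS pointer files (small text files whose first line is version https://git-lfs.github.com/spec/v1). The helper names is_lfs_pointer and check_data_dir are hypothetical and not part of the repo:

```shell
# Hypothetical helper: succeeds (exit 0) if the file looks like a Git LFS pointer.
# Assumption: pointer files start with the standard LFS spec version line.
is_lfs_pointer() {
    # Only inspect the first bytes so large binary inputs are not read in full.
    head -c 200 "$1" 2>/dev/null | head -n 1 \
        | grep -q '^version https://git-lfs.github.com/spec/v1'
}

# Hypothetical helper: reports pointer files in a directory.
# Returns 1 if any regular file in the directory is still an LFS pointer.
check_data_dir() {
    dir="$1"
    status=0
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        if is_lfs_pointer "$f"; then
            echo "LFS pointer, not real data: $f"
            status=1
        fi
    done
    return $status
}
```

With these defined, something like check_data_dir ../../../data || git lfs pull before running ./main.hip_omp_cpu would catch the missing inputs up front instead of failing with a segfault.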
I thought that you used Git LFS only for the bigger files. For some reason I assumed that the "16 16 16 16" data set had been pushed without LFS, and I didn't check further. My bad. I have now pulled everything and re-tested, and I can verify that it works as expected. I have also tested the "oneapi_intel_cpu" build and it still produces the correct results.
Approved.
You can approve explicitly by going to the changed files and clicking on Start a Review. Thanks!
Closes #4
Now that hipSYCL 0.9.2 has been installed on csd3, we can build with hipSYCL. I've renamed the targets in the Makefile to include both oneAPI and hipSYCL CPU and NVIDIA GPU targets. I've also added job scripts to run the 64-64-64-64 test case with these, although the changes are fairly trivial.
I've also added a CI workflow that installs hipSYCL from the University of Heidelberg's repository, builds main for a CPU target, and runs the 16-16-16-16 test case.