johnrollinson closed this 1 year ago
(Sorry, closed by accident!)
Thanks so much for posting this and great job digging around @johnrollinson!
The hardcoded path is indeed at https://github.com/spinsphotonics/fdtdz/blob/d7a174ea179039f83839d432cd4758adf41cea5e/cuda/CMakeLists.txt#L137C30-L137C38 and really should be fixed. I'll keep this open until that happens.
Also, I'll look into adding a link to your super-helpful post in the README.md.
Thanks again!!
First, thank you very much for creating this package. I have been looking for something like this for a long time and am very excited to experiment with fdtdz.
I just wanted to post this here in case anyone else runs into the same issue.
I'm using fdtdz on an HPC cluster with Tesla V100s, with CUDA driver version 470.57.02 and CUDA runtime version 11.4.
I initially installed fdtdz from pip with no installation issues, but when I ran the demo notebook I got the following error during the simulation step:
CUDA_ERROR_UNSUPPORTED_PTX_VERSION (the provided PTX was compiled with an unsupported toolchain.) in /tmp/pip-install-jpn5veal/fdtdz_eba4edc6d1564f1bb2469f3b1ec6c305/cuda/kernel_precompiled.h:119
I did some looking around online, and it seems like this error is usually related to the CUDA driver version (e.g. here, using an older driver with software compiled for a newer CUDA version). Since I'm on an HPC cluster, updating the CUDA driver is not really an option for me (at least not an easy one ...). I dug around some more and saw that fdtdz uses some pre-compiled PTX kernels, so I figured I'd try recompiling with my CUDA toolchain version.
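For anyone who wants to confirm the mismatch before rebuilding, here is a rough Python sketch that reads the `.version` directive from a PTX file and compares it with the CUDA version the driver supports. The names `PTX_TO_CUDA`, `ptx_version`, and `driver_can_load` are made up for illustration and are not part of fdtdz. The version table is my recollection of the PTX ISA release notes and should be checked against them. It also assumes the shipped kernels are plain-text PTX files.

```python
import re

# PTX ISA version -> CUDA toolkit release that introduced it.
# Assumption: recalled from the PTX ISA release notes; verify before relying on it.
PTX_TO_CUDA = {
    (7, 0): (11, 0), (7, 1): (11, 1), (7, 2): (11, 2), (7, 3): (11, 3),
    (7, 4): (11, 4), (7, 5): (11, 5), (7, 6): (11, 6), (7, 7): (11, 7),
    (7, 8): (11, 8), (8, 0): (12, 0),
}


def ptx_version(ptx_text):
    """Return the (major, minor) PTX ISA version declared in a PTX source."""
    match = re.search(r"^\s*\.version\s+(\d+)\.(\d+)", ptx_text, re.MULTILINE)
    if match is None:
        raise ValueError("no .version directive found")
    return int(match.group(1)), int(match.group(2))


def driver_can_load(ptx_text, driver_cuda):
    """Compare the PTX version with the driver's CUDA version.

    driver_cuda is a (major, minor) tuple, e.g. the "CUDA Version" shown in
    the nvidia-smi header. Returns None if the PTX version is not in the table.
    """
    required = PTX_TO_CUDA.get(ptx_version(ptx_text))
    if required is None:
        return None
    return driver_cuda >= required
```

To use it, read one of the packaged `.ptx` files with `pathlib.Path(...).read_text()` and pass the text in. A driver that reports CUDA 11.4 would be passed as `(11, 4)`. A `False` result would be consistent with the `CUDA_ERROR_UNSUPPORTED_PTX_VERSION` error above.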
Here are the steps I used:
git clone https://github.com/spinsphotonics/fdtdz.git
cd fdtdz && rm -r src/fdtdz_jax/ptx
cd cuda && mkdir build && cd build && cmake .. && make -j && ctest --verbose
cd .. && cp -r ptx ../src/fdtdz_jax/ptx
cd .. && pip install -e .
After this I was able to run the demo notebook without any issues. Total wall time was 47.3s running the simulation on a Tesla V100, so it seems like performance has not been affected by using 11.4 :+1:
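If you need to repeat the rebuild (for example on several cluster nodes), the steps above can be wrapped in a small Python script. This is only a sketch under the same assumptions as the commands in this post: the repository is already cloned, `cmake`, `make`, and `ctest` are on the `PATH`, and the build leaves the regenerated kernels in `cuda/ptx`. The function names `rebuild_plan` and `rebuild` are hypothetical.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def rebuild_plan(repo):
    """Return the external commands as (working_dir, argv) pairs, in order."""
    build = repo / "cuda" / "build"
    return [
        (build, ["cmake", ".."]),
        (build, ["make", "-j"]),
        (build, ["ctest", "--verbose"]),
        (repo, [sys.executable, "-m", "pip", "install", "-e", "."]),
    ]


def rebuild(repo):
    """Recompile the PTX kernels with the local toolchain and reinstall."""
    packaged = repo / "src" / "fdtdz_jax" / "ptx"
    # Drop the precompiled kernels shipped with the package.
    shutil.rmtree(packaged, ignore_errors=True)
    (repo / "cuda" / "build").mkdir(parents=True, exist_ok=True)

    *build_steps, install_step = rebuild_plan(repo)
    for cwd, argv in build_steps:
        # check=True stops at the first failing step.
        subprocess.run(argv, cwd=cwd, check=True)

    # Copy the freshly built kernels into the package, then install it.
    shutil.copytree(repo / "cuda" / "ptx", packaged)
    cwd, argv = install_step
    subprocess.run(argv, cwd=cwd, check=True)
```

Calling `rebuild(Path("fdtdz"))` from the directory that contains the clone runs the same sequence as the shell commands, except that it stops at the first failing step.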