DTolm / VkFFT

Vulkan/CUDA/HIP/OpenCL/Level Zero/Metal Fast Fourier Transform library
MIT License

Improve LevelZero detection on Ubuntu #145

Closed al42and closed 6 months ago

al42and commented 7 months ago

Intel's default installation on Ubuntu puts the Level Zero headers under /usr/include/level_zero/ (for example, /usr/include/level_zero/ze_api.h).
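As an illustration of the detection problem, here is a minimal Python sketch of a header search that also looks inside a level_zero/ subdirectory. It is not VkFFT's build logic: the function name `find_level_zero_include` and the second search root are assumptions made for this example, and only the /usr/include/level_zero/ze_api.h location comes from the report above.

```python
from pathlib import Path

# Search roots are an assumption for this sketch; only /usr/include
# is taken from the report above.
DEFAULT_ROOTS = ("/usr/include", "/usr/local/include")

# Check the bare header first, then the level_zero/ subdirectory
# that the Ubuntu packages use.
RELATIVE_CANDIDATES = ("ze_api.h", "level_zero/ze_api.h")


def find_level_zero_include(roots=DEFAULT_ROOTS):
    """Return the directory that contains ze_api.h, or None if not found."""
    for root in roots:
        for relative in RELATIVE_CANDIDATES:
            header = Path(root) / relative
            if header.is_file():
                return header.parent
    return None


if __name__ == "__main__":
    include_dir = find_level_zero_include()
    print(include_dir if include_dir else "Level Zero headers not found")
```

A search that only tests the root directory itself would miss the Ubuntu layout, which is the situation this pull request addresses.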

DTolm commented 6 months ago

Thanks!

al42and commented 6 months ago

BTW, do you mind sharing how close the ambitious Multiple GPU job splitting is? Are you considering only intra-node decomposition, or multi-node as well?

DTolm commented 6 months ago

Not close. It is not the goal of my PhD, so I haven't actually spent any time on it. The decomposition techniques should be similar to the single-GPU case, and I have a rough idea of how the data movement needs to be handled, with a level structure along the lines of shared-global-intranode-multinode. It will just take a lot of effort and time to understand how GPU-to-GPU communication works best for all vendors (unless it is done through simple MPI with a host copy, but that is surely bad).

There are also many design questions that I am unsure about, such as how the data needs to be distributed between nodes before and after the transform. It would be more productive if this were a more defined request (coming from a vendor, for example) rather than just an exploration.
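To make the data-distribution question concrete, here is a minimal pure-Python sketch of a slab-decomposed 2D transform. It is a generic textbook scheme, not VkFFT's design: the "devices" are plain lists, the naive `dft` stands in for a per-device FFT kernel, and every function name is hypothetical. It assumes both matrix dimensions are divisible by the number of devices.

```python
import cmath


def dft(x):
    """Naive O(n^2) 1D DFT, standing in for a per-device FFT kernel."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def split_rows(matrix, parts):
    """Give each hypothetical device one contiguous slab of rows."""
    step = len(matrix) // parts
    return [matrix[i * step:(i + 1) * step] for i in range(parts)]


def all_to_all_transpose(slabs):
    """Global transpose: the only step that moves data between devices."""
    full = [row for slab in slabs for row in slab]
    transposed = [list(col) for col in zip(*full)]
    return split_rows(transposed, len(slabs))


def slab_fft2(matrix, devices):
    """2D DFT over row slabs; the result is left in transposed layout."""
    slabs = split_rows(matrix, devices)
    # Step 1: each device transforms its own rows, no communication.
    slabs = [[dft(row) for row in slab] for slab in slabs]
    # Step 2: exchange data so each device owns complete columns.
    slabs = all_to_all_transpose(slabs)
    # Step 3: each device transforms the former columns locally.
    return [[dft(row) for row in slab] for slab in slabs]


def reference_fft2(matrix):
    """Single-device 2D DFT, returned in the same transposed layout."""
    rows = [dft(row) for row in matrix]
    return [dft(list(col)) for col in zip(*rows)]
```

Note that the output ends up transposed relative to the input distribution. Restoring the original layout would cost a second exchange, which is one instance of the open question about how data should be distributed before and after the transform.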

al42and commented 6 months ago

Thanks for the explanation. We (GROMACS) are interested in multi-GPU R2C FFT, but sadly we don't have the resources to actively drive the project.