[ENH]: Improve/Rewrite `cuda.parallel`'s build system

leofang commented 2 weeks ago

Building cuda.parallel is quite brittle due to requirements from the C library. Through patient trial and error, I found that the following build-time dependencies are required:

gcc 13+ (due to the usage of std::format)
CUDA 12.5+ (due to the usage of gcc 13, which is otherwise unresolvable with conda-forge CUDA packages)
make (this is a quirk of conda-forge)
libnvjitlink-dev (needed to pass CMake checks)
cuda-nvrtc-dev (needed to pass CMake checks)
lit (needed to pass CMake checks)

I also have to set the `CUDAARCHS` environment variable, e.g. `CUDAARCHS="86;89" pip install -v .`, so that CMake knows which architectures to build for; otherwise CMake complains about that too.
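To make the version requirements above easier to check before starting a build, here is a minimal preflight sketch. It is not part of the repo: the helper names (`parse_version`, `meets_minimum`, `tool_version`) are hypothetical, and it only covers the gcc and CUDA minimums listed above, assuming `gcc -dumpfullversion` and `nvcc --version` are how the toolchain is queried.

```python
import re
import shutil
import subprocess

# Minimums taken from the dependency list above; everything else is illustrative.
MIN_GCC = (13,)
MIN_CUDA = (12, 5)


def parse_version(text, pattern=r"(\d+(?:\.\d+)*)"):
    """Return the first version matched by `pattern` as a tuple of ints."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(version, minimum):
    """Compare two version tuples, padding the shorter one with zeros."""
    width = max(len(version), len(minimum))
    padded_version = version + (0,) * (width - len(version))
    padded_minimum = minimum + (0,) * (width - len(minimum))
    return padded_version >= padded_minimum


def tool_version(command, pattern=r"(\d+(?:\.\d+)*)"):
    """Run `command` and parse a version from stdout; None if the tool is not on PATH."""
    if shutil.which(command[0]) is None:
        return None
    result = subprocess.run(command, capture_output=True, text=True)
    return parse_version(result.stdout, pattern)


if __name__ == "__main__":
    checks = (
        ("gcc", tool_version(["gcc", "-dumpfullversion"]), MIN_GCC),
        # nvcc prints a line such as "Cuda compilation tools, release 12.5, ..."
        ("nvcc", tool_version(["nvcc", "--version"], r"release (\d+\.\d+)"), MIN_CUDA),
    )
    for name, found, minimum in checks:
        ok = found is not None and meets_minimum(found, minimum)
        print(f"{name}: found={found} minimum={minimum} ok={ok}")
```

Running the script in the build environment prints one line per tool; a missing tool shows up as `found=None`.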

Suggestions:

Migrate the build system from setuptools + custom build commands to the more CMake-friendly scikit-build-core
Add build-time requirements to the section in pyproject.toml for scikit-build-core to pick up
Skip unnecessary CMake checks if only building the C library
Figure out a way to also build for CUDA 11 (with which the underlying C library is supposed to work) so as to support deployments based on CUDA minor version compatibility
Set a default CUDA arch value and allow users to override it
Test conda builds in the CI like the RAPIDS projects do
Expand the installation section to add instructions for building from source
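As a rough illustration of the "default CUDA arch value with user override" suggestion, the build logic could fall back to a default only when `CUDAARCHS` is unset or empty. This is a sketch under assumptions: `DEFAULT_CUDA_ARCHS`, `resolve_cuda_archs`, and `cmake_arch_define` are hypothetical names, and the default value shown is a placeholder, not a proposal for what the project should ship.

```python
import os

# Placeholder default; the issue does not say which architectures should be used.
DEFAULT_CUDA_ARCHS = "75;80;86"


def resolve_cuda_archs(environ=None, default=DEFAULT_CUDA_ARCHS):
    """Return the user's CUDAARCHS value if set and non-empty, else the default."""
    if environ is None:
        environ = os.environ
    value = environ.get("CUDAARCHS", "").strip()
    return value or default


def cmake_arch_define(archs):
    """Format the architectures as a CMake cache definition."""
    return f"-DCMAKE_CUDA_ARCHITECTURES={archs}"


if __name__ == "__main__":
    # Prints the flag that would be forwarded to the CMake configure step.
    print(cmake_arch_define(resolve_cuda_archs()))
```

With this shape, `CUDAARCHS="86;89" pip install -v .` keeps working as it does today, while a plain `pip install -v .` would no longer fail for lack of an architecture list.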

leofang commented 2 weeks ago

I noticed that, because of the way we look up headers at run time, an editable install (`pip install -e .`) does not work.

miscco commented 2 weeks ago

I have opened a PR to address:

[ ] gcc 13+ (due to the usage of std::format)
[ ] CUDA 12.5+ (due to the usage of gcc 13, which is otherwise unresolvable with conda-forge CUDA packages)

I believe we can get rid of

[ ] make (this is a quirk of conda-forge)
[ ] lit (needed to pass CMake checks)

AFAIK we need

[ ] libnvjitlink-dev (needed to pass CMake checks)
[ ] cuda-nvrtc-dev (needed to pass CMake checks)

gevtushenko commented 2 weeks ago

Issue tracking CCCL headers installation in cuda.parallel: https://github.com/NVIDIA/cccl/issues/2281
Issue tracking CMake improvements for CCCL/c: https://github.com/NVIDIA/cccl/issues/2235

NVIDIA / cccl

[ENH]: Improve/Rewrite `cuda.parallel`'s build system #2334