Closed jakirkham closed 1 week ago
Another option is to support `$CUDA_HOME` being a list of directories. So for header searching, cuda-python would append `include` to each entry in `$CUDA_HOME`.
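A minimal sketch of that idea, assuming `$CUDA_HOME` were allowed to hold an `os.pathsep`-separated list (the `find_cuda_include_dirs` helper below is hypothetical, not part of cuda-python):

```python
import os

def find_cuda_include_dirs(cuda_home=None):
    """Hypothetical helper: treat CUDA_HOME as an os.pathsep-separated
    list of directories and return every existing <entry>/include."""
    if cuda_home is None:
        cuda_home = os.environ.get("CUDA_HOME", "")
    include_dirs = []
    for entry in cuda_home.split(os.pathsep):
        if not entry:
            continue
        candidate = os.path.join(entry, "include")
        if os.path.isdir(candidate):
            include_dirs.append(candidate)
    return include_dirs
```

This stays backwards compatible: a single-directory `$CUDA_HOME` contains no separator and yields one entry, as today.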
Thanks Rob! 🙏
Converted the ideas to an enumerated list and added that idea to the list
cc @bdice
Would it make sense to standardize a tool that uses the path to nvcc to parse information out of nvcc.profile? Something that could be used by anyone, not just cuda-python?
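To illustrate the idea, here is a hedged sketch of such a tool (the real `nvcc.profile` is makefile-like; this only handles simple `NAME =` / `NAME +=` assignment lines, and the `parse_nvcc_profile` function is hypothetical):

```python
import os
import re

def parse_nvcc_profile(nvcc_path):
    """Hypothetical sketch: read the nvcc.profile that sits next to
    the nvcc binary and collect its variable assignments."""
    profile = os.path.join(os.path.dirname(nvcc_path), "nvcc.profile")
    assignment = re.compile(r"^\s*([A-Za-z_]+)\s*(\+?=\+?)\s*(.*)$")
    settings = {}
    with open(profile) as f:
        for line in f:
            match = assignment.match(line)
            if match is None:
                continue
            name, op, value = match.groups()
            if "+" in op and name in settings:
                # += / =+ append to an existing value
                settings[name] += " " + value.strip()
            else:
                settings[name] = value.strip()
    return settings
```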
Sorry for the lack of response. Is this still an issue? It seems we have been able to build cuda-python on conda-forge?
Yep it is still an issue
We work around it in conda-forge
Through improvements on the packaging side, we have minimized the workarounds needed ( for example: https://github.com/conda-forge/cuda-python-feedstock/pull/75 ), but we are unable to eliminate them without this fix
@jakirkham I am sorry but I don't get it. In any case we need a way to specify where the CUDA headers are, and setting `CUDA_HOME` is the right solution that's been working fine. Is there anything specific to the splayed layout that we have not addressed (either in CUDA Python or in conda-forge)?
Think an ideal solution (from my perspective) would be moving to a CMake-based build system here, as CMake already understands how to work with splayed layouts correctly. For example, moving to scikit-build-core would very likely solve this issue. This may be desirable anyway to simplify the build process and make it less bespoke
I still don't understand what specific changes are needed to support splayed layout. @jakirkham this issue might have been outdated since the earlier (CTK 12.0) efforts?
Right now, CUDA Python expects all headers to be located in one place, for the purposes of both parsing and compiling. So it doesn't really matter if the headers are in `$PREFIX/include` or `$PREFIX/targets/<platform>/include` or `/usr/local/cuda/include` or any other place in the host environment, as long as `$CUDA_HOME` is set to a single directory path. Can we please identify which part of CUDA Python is supposed to find headers located in two or more places (and fails to do so)? I've eyeballed the recent cuda-python recipe and was unable to find such traces anymore; most likely they were removed starting with https://github.com/conda-forge/cuda-python-feedstock/pull/58 and further cleaned up later.
Also, there are no C++ components in this project (at least not yet; there will be once we start working on memory management, but that would not happen before this Fall at the earliest) and we need to balance against the currently limited engineering resources we have. I don't see the immediate benefit of moving away from setuptools to CMake (+ something that understands CMake, like scikit-build-core) when things are already working. That said, if I can enlist a RAPIDS build expert to help with the build system migration, I would not object to the change to CMake 😉
Nope. This issue was opened because of issues encountered adding CUDA 12 support
Yes CUDA-Python doesn't support splayed layouts. We agree 🙂
The issue is that the CUDA compiler and its tightly coupled headers and libraries live together. In conda this means `$BUILD_PREFIX` (the place where `requirements/build` packages are installed). Meanwhile the headers and libraries the package uses (IOW `requirements/host`) are installed to `$PREFIX`. So `$BUILD_PREFIX != $PREFIX`, but CUDA Python needs components from both and doesn't know how to handle this
As the wheel ecosystem continues to build out CUDA library wheels, it will likely wind up in the same situation with the same problems
Hopefully that explanation makes more sense. Please feel free to ask more questions
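To make the split concrete, here is a minimal sketch of what a splayed-layout lookup could look like under conda-build (the environment variable names follow conda-build's conventions, but the `resolve_cuda_paths` helper itself is hypothetical):

```python
import os

def resolve_cuda_paths():
    """Hypothetical sketch of a splayed-layout lookup: nvcc comes from
    $BUILD_PREFIX (requirements/build, build architecture), while
    headers and libraries come from $PREFIX (requirements/host, target
    architecture).  Falls back to a single $CUDA_HOME otherwise."""
    build_prefix = os.environ.get("BUILD_PREFIX")
    host_prefix = os.environ.get("PREFIX")
    if build_prefix and host_prefix:
        return {
            "nvcc": os.path.join(build_prefix, "bin", "nvcc"),
            "include": os.path.join(host_prefix, "include"),
            "lib": os.path.join(host_prefix, "lib"),
        }
    cuda_home = os.environ.get("CUDA_HOME", "/usr/local/cuda")
    return {
        "nvcc": os.path.join(cuda_home, "bin", "nvcc"),
        "include": os.path.join(cuda_home, "include"),
        "lib": os.path.join(cuda_home, "lib"),
    }
```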
> but CUDA Python needs components from both and doesn't know how to handle this

Could you tell me where this statement comes from? Sorry, but I just don't think we are getting the points across and we don't understand each other through this medium. Please feel free to arrange an offline meeting.
I expect the problem is this: https://github.com/NVIDIA/cuda-python/blob/81e6f866bb0c393e43f422e65d57678d40bcceb4/examples/common/common.py#L22
Right like this line
https://github.com/NVIDIA/cuda-python/blob/2f9d31cfacc818da0cd4569dd47ea43c5724d2a0/setup.py#L81
Though I am now noticing that in CUDA-Python 12.4.0 there may have been related changes for splayed layouts
https://github.com/NVIDIA/cuda-python/blob/2be0aac1a5cac84fec4137aa9c50525653425352/setup.py#L30
https://github.com/NVIDIA/cuda-python/blob/2be0aac1a5cac84fec4137aa9c50525653425352/setup.py#L77
So maybe we should give this a try to see if it helps
Ok, made some changes to the conda-forge build to take advantage of the changes in how `CUDA_HOME` is handled. This is in PR: https://github.com/conda-forge/cuda-python-feedstock/pull/83
If someone could take a look, that would be appreciated 🙂
Currently `cuda-python` relies on all binaries (like `nvcc`), all headers, and all libraries living in a single directory (specified by `$CUDA_HOME` or similar).

However there are use cases (like cross-compilation, as with conda-build) where the build tools may live in one location (and perform builds on that architecture) whereas the headers and libraries may live in a different location (and target a different architecture). In this case not everything lives in `$CUDA_HOME`.

It would be helpful to have a way of specifying where these different components come from. Here are some options:

1. `$NVCC` for the `nvcc` location
2. `$CUDA_BIN` (if specified) to get the build tool directory
3. `$CUDA_HOME`

Maybe there are other reasonable options worth considering?
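A hedged sketch of how the options above might be checked in order (the precedence and the `locate_nvcc` function are an illustration, not cuda-python's actual lookup):

```python
import os

def locate_nvcc():
    """Hypothetical resolution order for the nvcc binary, following the
    enumerated options: $NVCC (full path to the binary), then $CUDA_BIN
    (build tool directory), then $CUDA_HOME/bin as the fallback."""
    nvcc = os.environ.get("NVCC")
    if nvcc:
        return nvcc
    cuda_bin = os.environ.get("CUDA_BIN")
    if cuda_bin:
        return os.path.join(cuda_bin, "nvcc")
    cuda_home = os.environ.get("CUDA_HOME")
    if cuda_home:
        return os.path.join(cuda_home, "bin", "nvcc")
    return None
```

The most specific setting wins, so existing single-directory `$CUDA_HOME` setups keep working unchanged.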