Closed leofang closed 8 months ago
@leofang Hello, I can provide how my usage situation is affected by the behavior setting CUDA_PATH
to $CONDA_PREFIX
. I install both CuPy and PyTorch in one virtual environment. I want to build a PyTorch extension, but it will fail due to overriding the CUDA_PATH
environment variable will make PyTorch fail to find nvcc
.
https://github.com/pytorch/pytorch/blob/v2.1.0/torch/utils/cpp_extension.py#L395-L400
def _check_cuda_version(compiler_name: str, compiler_version: TorchVersion) -> None:
if not CUDA_HOME:
raise RuntimeError(CUDA_NOT_FOUND_MESSAGE)
nvcc = os.path.join(CUDA_HOME, 'bin', 'nvcc')
cuda_version_str = subprocess.check_output([nvcc, '--version']).strip().decode(*SUBPROCESS_DECODE_ARGS)
...
where the value of the CUDA_HOME
variable is determined by the following function:
https://github.com/pytorch/pytorch/blob/v2.1.0/torch/utils/cpp_extension.py#L93C1-L121
def _find_cuda_home() -> Optional[str]:
...
cuda_home = os.environ.get('CUDA_HOME') or os.environ.get('CUDA_PATH')
if cuda_home is None:
...
if cuda_home and not torch.cuda.is_available():
print(f"No CUDA runtime is found, using CUDA_HOME='{cuda_home}'",
file=sys.stderr)
return cuda_home
I think you can just set an env var CUDA_HOME
pointing to your local CUDA installation.
Possible options:
export CUDA_PATH=$CONDA_PREFIX/targets/${targetDir}
set CUDA_PATH=$CONDA_PREFIX\Library
(#232)If we drop the activation scripts, would it make sense to have some way to silence this CuPy warning about CUDA_PATH
?
Can understand that warning in the general case. However for Conda packaging with a known layout, wonder if that warning is more confusing to users (and whether users setting it would cause hard to diagnose issues with Conda packages)
We can certainly do that. (We can patch it here for v12, and upstream the patch for v13.) @kmaehashi Do I miss anything?
FWIW Numba uses this trick to check for a Conda environment. Maybe CuPy could leverage similar logic?
Oh we already have a detection mechanism via reading the json file shipped with the conda cupy package, so detecting whether inside a conda env is not a problem.
(btw why wouldn't Numba just check if the env var CONDA_PREFIX
is set?)
The logic was added in PR ( https://github.com/numba/numba/pull/4096 ). Though don't see a mention of this consideration
If I had to guess, it is so that it will work even if the environment is not activated
If we drop the activation scripts, would it make sense to have some way to silence this CuPy warning about
CUDA_PATH
?
The warning in question is to notify users about (possibly) missing CUDA shared library installation on Windows. In conda environment this should not happen as dependencies are satisfied by the package manager. I think it's fine to disable this warning in conda environment. -> I've made a PR for this: #8051
Despite that, I guess it's better to avoid modifying CUDA_PATH
in the activation scripts because CuPy uses $CUDA_PATH/bin/nvcc
as the default nvcc path (e.g. used in RawModule(backend='nvcc')
)
@leofang encouraged me to comment here.
In multi-stage build pipelines where conda
is used to manage dependencies in conjunction with other (build) tools, I have to reset CUDA_PATH
because it is corrupted by package installation.
What is the primary reason CUDA_PATH
is set by cupy
?
FWIW the warning referenced above was removed in CuPy 13 ( https://github.com/cupy/cupy/pull/8076 )
Are there any other action items before making this change?
@leofang encouraged me to comment here. In multi-stage build pipelines where
conda
is used to manage dependencies in conjunction with other (build) tools, I have to resetCUDA_PATH
because it is corrupted by package installation.What is the primary reason
CUDA_PATH
is set bycupy
?
Not trying to be noisy but wanted to capture more detail.
If in the middle of a build, cupy
is installed, and the associated environment is active, then CUDA_PATH
needs to be reset for follow-up compilation where CUDA_PATH
is used.
Likewise, if another library built with CFFI bindings (or similar) relies on the environment variable CUDA_PATH
being set, then this library will be broken alongside cupy
by default.
Thanks all, the new build has the activation scripts purged. @mtjrider would it be possible if you retry your workflow with the latest cupy
from conda-forge?
Wonder if we should mark the previous CuPy 13.0.0 builds broken so that users only get packages without the activation script when installing CuPy 13.0.0
Edit: After offline discussion with @leofang and @mtjrider , we thought this would be a reasonable change. Filed issue ( https://github.com/conda-forge/cupy-feedstock/issues/255 ) to track and document next steps
Currently we point
CUDA_PATH
to the conda env prefix https://github.com/conda-forge/cupy-feedstock/blob/1788a9c9970bbf09e783d64e9fede420cb32b18f/recipe/activate.sh#L8 because we wanted to mimic a real CTK existing somewhere that's visible at run time to CuPy. The activation (deactivation) script is called when the conda env containing a CuPy package is activated (deactivated), during whichCUDA_PATH
is set (unset). However, this trick is only applicable to the "old" layout (pre-CUDA 12). For CUDA 12, this would likely not work very well. I think we'd need something likeexport CUDA_PATH=$CONDA_PREFIX/targets/${targetDir}
at least, for the new layout.Also, in the past I've occasionally heard complaints about us overwriting
CUDA_PATH
for users. It may prompt the consideration of dropping it all together. But unfortunately I do not have time to investigate the consequence of doing so.cc: @jakirkham @kmaehashi