CUDA_PATH not set on Windows

TomographicImaging / CIL

A versatile python framework for tomographic imaging

https://tomographicimaging.github.io/CIL/

Apache License 2.0


CUDA_PATH not set on Windows #1596

Closed effepivi closed 2 months ago

effepivi commented 1 year ago

Hi there,

I got an error thrown by TIGRE when I install the environment from environment.yml: CUDA_PATH is not set.

A workaround is to locate the cudart64_*.dll for the corresponding Conda environment. You don't actually have to search for it: it should be in %CONDA_PREFIX%\Library\bin, where %CONDA_PREFIX% is an environment variable that stores the path of the active Conda environment.

In the Miniconda Windows prompt, AFTER you activate the Conda environment, just type:

    set CUDA_PATH=%CONDA_PREFIX%\Library

Then it runs like a charm.

regards,

Franck
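The same workaround can be sketched from inside Python, for scripts or notebooks where typing the `set` command first is inconvenient. This is only a minimal sketch: the helper name `ensure_cuda_path` is hypothetical, and it assumes TIGRE reads CUDA_PATH when it is imported, which is what the error above suggests.

```python
import os
from pathlib import Path
from typing import MutableMapping, Optional


def ensure_cuda_path(environ: MutableMapping[str, str] = os.environ) -> Optional[str]:
    """Set CUDA_PATH to <CONDA_PREFIX>/Library when it is missing.

    Returns the value of CUDA_PATH afterwards, or None if it could not be set.
    """
    if environ.get("CUDA_PATH"):
        return environ["CUDA_PATH"]  # respect an existing setting

    prefix = environ.get("CONDA_PREFIX")
    if not prefix:
        return None  # no active Conda environment

    library = Path(prefix) / "Library"
    # Only point CUDA_PATH here if the CUDA runtime DLL is really present.
    if not any((library / "bin").glob("cudart64_*.dll")):
        return None

    environ["CUDA_PATH"] = str(library)
    return environ["CUDA_PATH"]


if __name__ == "__main__":
    print("CUDA_PATH:", ensure_cuda_path())
```

Under that assumption, the call would have to come before `import tigre` (or before importing anything that imports it), and it only affects the current Python process.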

casperdcl commented 1 year ago

Surprised that this would be necessary.

Should we document conda env config vars set 'CUDA_PATH=%CONDA_PREFIX%\Library'?

WYVERN2742 commented 1 year ago

This is a side effect of TIGRE relying on the user's CUDA installation rather than linking to cudatoolkit from conda. https://github.com/CERN/TIGRE/blob/master/Python/tigre/__init__.py#L18

A workaround could be using an activation script for conda every time an environment is used (which is cupy's approach): https://github.com/conda-forge/cupy-feedstock/blob/main/recipe/activate.sh

gfardell commented 1 year ago

I'm surprised that conda doesn't set the path automatically when you activate it. I'm pretty sure that's the intended behaviour.

I think adding it to "$CONDA_PREFIX/etc/conda/activate.d" and "$CONDA_PREFIX/etc/conda/deactivate.d" is the best short-term solution.

I'm not sure if we can do that automatically as a post conda install script?
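To make the activate.d/deactivate.d idea concrete, here is a rough sketch of a Python helper that writes a pair of cmd.exe scripts into an environment. The file name `tigre_cuda_path.bat`, the backup variable `_OLD_CUDA_PATH` and the function name are all made up for illustration; this is not what any CIL or TIGRE package actually ships.

```python
import os
from pathlib import Path

# Hypothetical script name; conda runs every *.bat in these folders under cmd.exe.
SCRIPT_NAME = "tigre_cuda_path.bat"

# On activation: remember any previous CUDA_PATH, then point it at the env.
ACTIVATE_BAT = "\n".join([
    r'@set "_OLD_CUDA_PATH=%CUDA_PATH%"',
    r'@set "CUDA_PATH=%CONDA_PREFIX%\Library"',
    "",
])

# On deactivation: restore the previous value and drop the backup variable.
DEACTIVATE_BAT = "\n".join([
    r'@set "CUDA_PATH=%_OLD_CUDA_PATH%"',
    r'@set "_OLD_CUDA_PATH="',
    "",
])


def write_cuda_path_scripts(prefix: Path) -> None:
    """Create activate.d/deactivate.d scripts under <prefix>/etc/conda."""
    for folder, body in (("activate.d", ACTIVATE_BAT),
                         ("deactivate.d", DEACTIVATE_BAT)):
        target_dir = prefix / "etc" / "conda" / folder
        target_dir.mkdir(parents=True, exist_ok=True)
        (target_dir / SCRIPT_NAME).write_text(body)


if __name__ == "__main__":
    # Run inside the activated environment you want to patch.
    write_cuda_path_scripts(Path(os.environ["CONDA_PREFIX"]))
```

The scripts only take effect the next time the environment is activated, and this sketch covers cmd.exe only; other shells would need their own script files.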

casperdcl commented 1 year ago

wait so this is a tigre bug. the proper fix is to get conda builds of tigre to use conda cudatoolkit, right? or alternatively the tigre conda package needs (de)activate scripts.

gfardell commented 1 year ago

The conda cudatoolkit package doesn't include NVCC, so it's not the equivalent of the actual CUDA toolkit. It includes the runtime libraries we need for the binaries, and I think it acts as a kind of package management; maybe it checks that the video driver version is compatible too.

We manage the conda build of TIGRE, but as far as I know this isn't an issue on Linux, and I'd like to confirm that it is an issue for all Windows users (across OS and conda versions). I'm sure most of our users don't install the full CUDA toolkit 10.2.
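To help confirm which setups are affected, something like the following could be run inside the activated environment and pasted into a report. It is only a sketch: the function name and the report keys are hypothetical, and it does not capture the conda version, which would still need to be added by hand (for example from `conda --version`).

```python
import os
import platform
from pathlib import Path
from typing import Dict, Mapping, Optional


def cuda_env_report(environ: Mapping[str, str] = os.environ) -> Dict[str, Optional[str]]:
    """Collect the OS, Python and CUDA-related settings of this environment."""
    prefix = environ.get("CONDA_PREFIX")
    runtime = None
    if prefix:
        # Look for the CUDA runtime DLL where the conda package is expected to put it.
        matches = sorted((Path(prefix) / "Library" / "bin").glob("cudart64_*.dll"))
        runtime = matches[0].name if matches else None
    return {
        "os": platform.platform(),
        "python": platform.python_version(),
        "CONDA_PREFIX": prefix,
        "CUDA_PATH": environ.get("CUDA_PATH"),
        "cudart_dll": runtime,
    }


if __name__ == "__main__":
    for key, value in cuda_env_report().items():
        print(f"{key}: {value}")
```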

effepivi commented 1 year ago

Considering that installing the CUDA SDK from Nvidia sets the environment variable (which is how we solved the issue in the past, before I reported it), one might argue that Conda's cudatoolkit package should create and set the variable accordingly. However, it may be worth noting that this is not a new issue, as it has been picked up by other projects in the last few years. There is also cudatoolkit-dev for nvcc (which does not set the variable either, apparently). I haven't found a justification for this behaviour, but there must be one. "activate.d/deactivate.d" within CIL sounds like a good solution.

paskino commented 9 months ago

TIGRE doesn't release binaries on conda; we created a repo to build them, and we host them on the ccpi channel.

I guess it is there that the environment variable should be set.

https://github.com/TomographicImaging/TIGRE-conda

paskino commented 9 months ago

> wait so this is a tigre bug. the proper fix is to get conda builds of tigre to use conda cudatoolkit, right? or alternatively the tigre conda package needs (de)activate scripts.

https://docs.conda.io/projects/conda-build/en/stable/resources/activate-scripts.html

gfardell commented 2 months ago

fixed by https://github.com/TomographicImaging/TIGRE-conda/pull/12

conda hosted build of tigre 2.6 contains the fix.