Closed: parthe closed this issue 9 months ago
pykeops was successfully installed on my machine (using CUDA) and tested using the command
python -c "import pykeops; pykeops.test_numpy_bindings(); pykeops.test_torch_bindings()"
Hi @parthe
This error doesn't come from keops, but from falkon itself. Did you install it with python setup.py develop or with pip install . ?
I installed using the command pip install git+https://github.com/falkonml/falkon.git
as instructed here
When I install using python setup.py develop
the following log is printed
No CUDA runtime is found, using CUDA_HOME='/home/$USER/.conda/envs/Falkon_ML'
running develop
running egg_info
writing falkon.egg-info/PKG-INFO
writing dependency_links to falkon.egg-info/dependency_links.txt
writing requirements to falkon.egg-info/requires.txt
writing top-level names to falkon.egg-info/top_level.txt
reading manifest file 'falkon.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
adding license file 'LICENSE'
writing manifest file 'falkon.egg-info/SOURCES.txt'
running build_ext
copying build/lib.linux-x86_64-3.10/falkon/c_ext.so -> falkon
copying build/lib.linux-x86_64-3.10/falkon/la_helpers/cyblas.so -> falkon/la_helpers
Creating /home/$USER/.conda/envs/Falkon_ML/lib/python3.10/site-packages/falkon.egg-link (link to .)
Adding falkon 0.7.5 to easy-install.pth file
Installed /home/$USER/falkon_ml/falkon
Processing dependencies for falkon==0.7.5
Searching for pykeops@ git+https://github.com/getkeops/keops@ad044a671fdc3c2790b0321f6b9f9b5aa3d220df#subdirectory=pykeops
Reading https://pypi.org/simple/pykeops/
Downloading https://files.pythonhosted.org/packages/8c/9a/ae3931ca85e2a05707d07b0f1d34474939c85e2318335eadb92dd02be3b7/pykeops-2.1.tar.gz#sha256=770894e06b497d9640e04471752ee08e5d936809e571e12db1b4dea03c862457
Best match: pykeops 2.1
Processing pykeops-2.1.tar.gz
Writing /tmp/easy_install-2obkpcv5/pykeops-2.1/setup.cfg
Running pykeops-2.1/setup.py -q bdist_egg --dist-dir /tmp/easy_install-2obkpcv5/pykeops-2.1/egg-dist-tmp-_l8xu111
creating /home/$USER/.conda/envs/Falkon_ML/lib/python3.10/site-packages/pykeops-2.1-py3.10.egg
Extracting pykeops-2.1-py3.10.egg to /home/$USER/.conda/envs/Falkon_ML/lib/python3.10/site-packages
Adding pykeops 2.1 to easy-install.pth file
Installed /home/$USER/.conda/envs/Falkon_ML/lib/python3.10/site-packages/pykeops-2.1-py3.10.egg
Searching for keopscore@ git+https://github.com/getkeops/keops@ad044a671fdc3c2790b0321f6b9f9b5aa3d220df#subdirectory=keopscore
Reading https://pypi.org/simple/keopscore/
Downloading https://files.pythonhosted.org/packages/e0/0b/fddeee9a4b5808e8f8bd084804d6a2996096f9a959cb0e54d9b61c5762b3/keopscore-2.1.tar.gz#sha256=15db70dda353fe6b00102b6a9043462bae89f6eea8a9be72426c089096d9d5f0
Best match: keopscore 2.1
Processing keopscore-2.1.tar.gz
Writing /tmp/easy_install-zqpfnlbl/keopscore-2.1/setup.cfg
Running keopscore-2.1/setup.py -q bdist_egg --dist-dir /tmp/easy_install-zqpfnlbl/keopscore-2.1/egg-dist-tmp-dk2cmvyg
creating /home/$USER/.conda/envs/Falkon_ML/lib/python3.10/site-packages/keopscore-2.1-py3.10.egg
Extracting keopscore-2.1-py3.10.egg to /home/$USER/.conda/envs/Falkon_ML/lib/python3.10/site-packages
Adding keopscore 2.1 to easy-install.pth file
Installed /home/$USER/.conda/envs/Falkon_ML/lib/python3.10/site-packages/keopscore-2.1-py3.10.egg
Searching for psutil
Reading https://pypi.org/simple/psutil/
Downloading https://files.pythonhosted.org/packages/6d/c6/6a4e46802e8690d50ba6a56c7f79ac283e703fcfa0fdae8e41909c8cef1f/psutil-5.9.1-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=29a442e25fab1f4d05e2655bb1b8ab6887981838d22effa2396d584b740194de
Best match: psutil 5.9.1
Processing psutil-5.9.1-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Installing psutil-5.9.1-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl to /home/$USER/.conda/envs/Falkon_ML/lib/python3.10/site-packages
Adding psutil 5.9.1 to easy-install.pth file
Installed /home/$USER/.conda/envs/Falkon_ML/lib/python3.10/site-packages/psutil-5.9.1-py3.10-linux-x86_64.egg
Searching for scikit-learn
Reading https://pypi.org/simple/scikit-learn/
Downloading https://files.pythonhosted.org/packages/43/bc/7130ffd49a1cf72659c61eb94d8f037bc5502c94866f407c0219d929e758/scikit_learn-1.1.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=47464c110eaa9ed9d1fe108cb403510878c3d3a40f110618d2a19b2190a3e35c
Best match: scikit-learn 1.1.1
Processing scikit_learn-1.1.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Installing scikit_learn-1.1.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl to /home/$USER/.conda/envs/Falkon_ML/lib/python3.10/site-packages
Adding scikit-learn 1.1.1 to easy-install.pth file
Installed /home/$USER/.conda/envs/Falkon_ML/lib/python3.10/site-packages/scikit_learn-1.1.1-py3.10-linux-x86_64.egg
Searching for pybind11
Reading https://pypi.org/simple/pybind11/
Downloading https://files.pythonhosted.org/packages/9a/7f/855560aa568e50bea6012ed535e6b8c436e99394f3e5a649d44d2e557242/pybind11-2.10.0-py3-none-any.whl#sha256=6bbc7a2f79689307f0d8d240172851955fc214b33e4cbd7fdbc9cd7176a09260
Best match: pybind11 2.10.0
Processing pybind11-2.10.0-py3-none-any.whl
Installing pybind11-2.10.0-py3-none-any.whl to /home/$USER/.conda/envs/Falkon_ML/lib/python3.10/site-packages
Adding pybind11 2.10.0 to easy-install.pth file
Installing pybind11-config script to /home/$USER/.conda/envs/Falkon_ML/bin
Installed /home/$USER/.conda/envs/Falkon_ML/lib/python3.10/site-packages/pybind11-2.10.0-py3.10.egg
Searching for threadpoolctl>=2.0.0
Reading https://pypi.org/simple/threadpoolctl/
Downloading https://files.pythonhosted.org/packages/61/cf/6e354304bcb9c6413c4e02a747b600061c21d38ba51e7e544ac7bc66aecc/threadpoolctl-3.1.0-py3-none-any.whl#sha256=8b99adda265feb6773280df41eece7b2e6561b772d21ffd52e372f999024907b
Best match: threadpoolctl 3.1.0
Processing threadpoolctl-3.1.0-py3-none-any.whl
Installing threadpoolctl-3.1.0-py3-none-any.whl to /home/$USER/.conda/envs/Falkon_ML/lib/python3.10/site-packages
Adding threadpoolctl 3.1.0 to easy-install.pth file
Installed /home/$USER/.conda/envs/Falkon_ML/lib/python3.10/site-packages/threadpoolctl-3.1.0-py3.10.egg
Searching for joblib>=1.0.0
Reading https://pypi.org/simple/joblib/
Downloading https://files.pythonhosted.org/packages/3e/d5/0163eb0cfa0b673aa4fe1cd3ea9d8a81ea0f32e50807b0c295871e4aab2e/joblib-1.1.0-py2.py3-none-any.whl#sha256=f21f109b3c7ff9d95f8387f752d0d9c34a02aa2f7060c2135f465da0e5160ff6
Best match: joblib 1.1.0
Processing joblib-1.1.0-py2.py3-none-any.whl
Installing joblib-1.1.0-py2.py3-none-any.whl to /home/$USER/.conda/envs/Falkon_ML/lib/python3.10/site-packages
Adding joblib 1.1.0 to easy-install.pth file
Installed /home/$USER/.conda/envs/Falkon_ML/lib/python3.10/site-packages/joblib-1.1.0-py3.10.egg
Searching for numpy==1.22.3
Best match: numpy 1.22.3
Adding numpy 1.22.3 to easy-install.pth file
Installing f2py script to /home/$USER/.conda/envs/Falkon_ML/bin
Installing f2py3 script to /home/$USER/.conda/envs/Falkon_ML/bin
Installing f2py3.10 script to /home/$USER/.conda/envs/Falkon_ML/bin
Using /home/$USER/.conda/envs/Falkon_ML/lib/python3.10/site-packages
Searching for scipy==1.8.1
Best match: scipy 1.8.1
Processing scipy-1.8.1-py3.10-linux-x86_64.egg
Adding scipy 1.8.1 to easy-install.pth file
Using /home/$USER/falkon_ml/falkon/.eggs/scipy-1.8.1-py3.10-linux-x86_64.egg
Searching for torch==1.12.0
Best match: torch 1.12.0
Adding torch 1.12.0 to easy-install.pth file
Installing convert-caffe2-to-onnx script to /home/$USER/.conda/envs/Falkon_ML/bin
Installing convert-onnx-to-caffe2 script to /home/$USER/.conda/envs/Falkon_ML/bin
Installing torchrun script to /home/$USER/.conda/envs/Falkon_ML/bin
Using /home/$USER/.conda/envs/Falkon_ML/lib/python3.10/site-packages
Searching for typing-extensions==4.1.1
Best match: typing-extensions 4.1.1
Adding typing-extensions 4.1.1 to easy-install.pth file
Using /home/$USER/.conda/envs/Falkon_ML/lib/python3.10/site-packages
Finished processing dependencies for falkon==0.7.5
I don't know why no CUDA runtime is found in that folder. I have installed CUDA
in that folder, and there is a file named /home/$USER/.conda/envs/Falkon_ML/include/cuda.h
Usually you need to install CUDA system-wide; the default location of the runtime is /usr/local/cuda. The easiest way to verify that CUDA is installed properly is to use PyTorch's detection:
python -c 'import torch; print(torch.cuda.is_available())'
I usually install CUDA on the whole system, so I'm not sure what your last comment means. You'll need both the CUDA drivers and the CUDA toolkit (which should match the toolkit with which pytorch has been compiled).
You may also need to add /usr/local/cuda/bin to your PATH environment variable.
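To see which CUDA toolkit directory a build is likely to pick up, a small standard-library check can help. The sketch below is only an illustration: the function name find_cuda_home and its search order (environment variables, then nvcc on PATH, then /usr/local/cuda) are assumptions made for this example, not the documented behaviour of falkon or PyTorch.

```python
import os
import shutil
from pathlib import Path


def find_cuda_home(env=None, default="/usr/local/cuda"):
    """Guess the CUDA toolkit root, or return None if nothing is found.

    Search order (an assumption for this sketch, not any tool's contract):
    1. the CUDA_HOME or CUDA_PATH environment variable,
    2. the directory two levels above an nvcc found on PATH,
    3. the conventional system-wide location /usr/local/cuda.
    """
    env = os.environ if env is None else env
    for name in ("CUDA_HOME", "CUDA_PATH"):
        value = env.get(name)
        if value:
            return Path(value)
    nvcc = shutil.which("nvcc", path=env.get("PATH"))
    if nvcc:
        # .../cuda/bin/nvcc -> .../cuda
        return Path(nvcc).resolve().parent.parent
    if Path(default).is_dir():
        return Path(default)
    return None


if __name__ == "__main__":
    home = find_cuda_home()
    print("CUDA toolkit root:", home)
    if home is not None:
        # cuda.h is the main toolkit header; its absence suggests a partial install.
        print("cuda.h present:", (home / "include" / "cuda.h").is_file())
```

Run it inside the same environment (and on the same node) where the package is being built, since PATH and the environment variables can differ between shells.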
I'm working on a Slurm cluster. CUDA is installed, but not in this location: usr/local/bin/
torch.cuda.is_available() returns True.
Here is my install script for falkon:
yes | conda create -n Falkon_ML python=3.10 ipython
conda activate Falkon_ML
yes | conda install -c nvidia/label/cuda-11.3.1 cuda-toolkit
yes | conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
module load cmake
module load gpu cuda
CUDA_PATH='/home/$USER/.conda/envs/Falkon_ML/'
git clone https://github.com/FalkonML/falkon.git falkon_ml
python falkon_ml/falkon/setup.py develop
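One detail of the script above may be worth checking: in bash, single quotes keep $USER literal, and a variable assigned without export is not passed on to child processes such as python. Whether this affects the falkon build is not established in this thread. The sketch below is a hypothetical helper (check_cuda_path is a name invented for this example) that reports such problems from inside Python.

```python
import os


def check_cuda_path(value):
    """Return a list of human-readable problems with a CUDA_PATH-style value."""
    problems = []
    if value is None:
        problems.append("variable is not set in the environment (was it exported?)")
        return problems
    if "$" in value:
        problems.append("value contains an unexpanded '$' (single quotes in the shell?)")
    # Expand any remaining $VARS so the directory test uses the intended path.
    expanded = os.path.expandvars(value)
    if not os.path.isdir(expanded):
        problems.append(f"directory does not exist: {expanded}")
    return problems


if __name__ == "__main__":
    for name in ("CUDA_HOME", "CUDA_PATH"):
        print(name, "->", check_cuda_path(os.environ.get(name)) or "looks fine")
```

An empty list means the value is set, fully expanded, and points at an existing directory; it does not prove that the directory contains a complete toolkit.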
I am having the exact same problem. pykeops and pytorch work properly. However, when I run
flk = falkon.Falkon(kernel=kernel, penalty=1e-6, M=1000, options=options)
I get 'RuntimeError: Not compiled with CUDA support', the same as in @parthe's first post.
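The line "No CUDA runtime is found, using CUDA_HOME=..." in the install log earlier in this thread appears to match a message printed by torch.utils.cpp_extension when a toolkit directory is found but torch.cuda.is_available() is False, which can happen on a cluster login node without a GPU. That the extension was built without CUDA for this reason is an assumption, not something confirmed here. A minimal sketch for collecting the relevant facts at build time (cuda_build_report is a name made up for this example):

```python
def cuda_build_report():
    """Collect facts that commonly decide whether a CUDA extension gets built."""
    report = {"torch": None, "torch_cuda_build": None,
              "cuda_available": None, "cuda_home": None}
    try:
        import torch
    except ImportError:
        return report  # PyTorch is not installed in this environment
    report["torch"] = torch.__version__
    # CUDA version PyTorch itself was compiled against (None for CPU-only builds).
    report["torch_cuda_build"] = torch.version.cuda
    # False on machines without a visible GPU or driver.
    report["cuda_available"] = torch.cuda.is_available()
    try:
        # Toolkit root PyTorch's extension builder would use (None if not found).
        from torch.utils.cpp_extension import CUDA_HOME
        report["cuda_home"] = CUDA_HOME
    except ImportError:
        pass  # e.g. setuptools missing; leave cuda_home as None
    return report


if __name__ == "__main__":
    for key, value in cuda_build_report().items():
        print(f"{key}: {value}")
```

Running this in the shell session that performs the install, rather than later inside a GPU job, shows what the build step could actually see.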
conda install -c "nvidia/label/cuda-11.6.2" cuda-toolkit
conda install pytorch torchvision torchaudio cudatoolkit=11.6 -c pytorch -c conda-forge
python setup.py develop
However, I got the following error:
` /home/amirhesam/.conda/envs/flk2/bin/nvcc -DTORCH_VERSION_MAJOR=1 -DTORCH_VERSION_MINOR=12 -DTORCH_VERSION_PATCH=1 -DWITH_CUDA -I./falkon/csrc -I/home/amirhesam/.conda/envs/flk2/lib/python3.10/site-packages/torch/include -I/home/amirhesam/.conda/envs/flk2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/amirhesam/.conda/envs/flk2/lib/python3.10/site-packages/torch/include/TH -I/home/amirhesam/.conda/envs/flk2/lib/python3.10/site-packages/torch/include/THC -I/home/amirhesam/.conda/envs/flk2/include -I/home/amirhesam/.conda/envs/flk2/include/python3.10 -c ./falkon/csrc/cuda/square_norm_cuda.cu -o build/temp.linux-x86_64-cpython-310/./falkon/csrc/cuda/square_norm_cuda.o -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_BFLOAT16_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr --compiler-options '-fPIC' --expt-relaxed-constexpr --expt-extended-lambda -O2 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -DTORCH_EXTENSION_NAME=c_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 -std=c++14
In file included from /home/amirhesam/.conda/envs/flk2/lib/python3.10/site-packages/torch/include/ATen/native/cuda/Reduce.cuh:21,
from ./falkon/csrc/cuda/square_norm_cuda.cu:4:
/home/amirhesam/.conda/envs/flk2/lib/python3.10/site-packages/torch/include/ATen/native/cuda/jit_utils.h:11:10: fatal error: ATen/cuda/nvrtc_stub/ATenNVRTC.h: No such file or directory
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
error: command '/home/amirhesam/.conda/envs/flk2/bin/nvcc' failed with exit code 1 `
pip install git+https://github.com/falkonml/falkon.git
This works properly on CPU but not on GPU: with a GPU it raises 'RuntimeError: Not compiled with CUDA support'. I would appreciate it if you could help me with this.
Best,
Hi @ahabedsoltan. Yes, it's a known issue with PyTorch 1.12. It seems to have been fixed on their side in the master branch, but until a new PyTorch version is released, your best option is to downgrade to PyTorch 1.11.
Sorry for this, Giacomo
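Before rebuilding, it can be useful to confirm that the environment matches the situation described above: PyTorch 1.12.x installed and the header ATen/cuda/nvrtc_stub/ATenNVRTC.h missing from its include directories. The helpers below are a hedged sketch; has_nvrtc_stub and is_torch_1_12 are names invented for this example, and treating only 1.12.x as affected is taken from this thread rather than from PyTorch's release notes.

```python
import re
from pathlib import Path

# Header named in the nvcc error message quoted earlier in this thread.
STUB_HEADER = Path("ATen") / "cuda" / "nvrtc_stub" / "ATenNVRTC.h"


def has_nvrtc_stub(include_dirs):
    """True if any include directory contains ATen/cuda/nvrtc_stub/ATenNVRTC.h."""
    return any((Path(d) / STUB_HEADER).is_file() for d in include_dirs)


def is_torch_1_12(version):
    """True for version strings such as '1.12.0' or '1.12.1+cu113'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    return bool(match) and (int(match.group(1)), int(match.group(2))) == (1, 12)


if __name__ == "__main__":
    try:
        import torch
        from torch.utils.cpp_extension import include_paths
    except ImportError:
        print("PyTorch is not importable here; nothing to check.")
    else:
        print("torch version:", torch.__version__)
        print("is 1.12.x:", is_torch_1_12(torch.__version__))
        print("stub header found:", has_nvrtc_stub(include_paths()))
```

If the version is 1.12.x and the header is not found, the compile error reported above is the expected outcome, and the downgrade suggested in the reply is the fix offered in this thread.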
Thank you for your prompt answer. Yes, the problem was both the torch version and the KeOps library. I had to switch to the pykeops beta version.
yields this error message which I am unable to debug. Please help.