Closed ivan-marroquin closed 1 year ago
Hi,
Thanks for reporting the issue. It seems Ninja can't find Python. Can you check that your system can find cl, by performing where cl in a command line terminal? (it should return the location of cl)
Hi @elephaint
Thanks for your prompt reply. To answer your questions:
1) I have Build Tools for Visual Studio installed
2) From a DOS terminal the command "where cl" reports: C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\bin\Hostx64\x64\cl.exe
3) I also added the following environment variables:
LIB -> C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\lib\x64
Include -> C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\include
Path -> C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\bin\Hostx64\x64
4) I am not using a virtual environment
Hope this helps,
Ivan
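For anyone following along, the PATH check discussed above (where cl) can be scripted. This is a minimal sketch using only the Python standard library. The tool list (cl, ninja, nvcc) is an assumption about what the extension build needs on Windows, and locate_tools is a hypothetical helper name, not part of pgbm or PyTorch.

```python
import shutil


def locate_tools(names=("cl", "ninja", "nvcc")):
    """Return {tool name: full path or None}, similar to running `where <tool>`."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Print one line per tool so a missing executable is easy to spot.
    for tool, path in locate_tools().items():
        print(f"{tool}: {path or 'NOT FOUND on PATH'}")
```

A tool reported as not found here would also be invisible to a build started from the same terminal.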
Hi @elephaint
I tried the following:
Unfortunately, I still get the same error message when I run "from pgbm import PGBM".
Ivan
Hi @ivan-marroquin,
It remains strange, and it seems Ninja can't find your Python installation. It looks like your Python installation is located in a temporary folder, so I'd suggest using a virtual environment manager like Conda to set up Python.
Hi @ivan-marroquin, I think it's because of the version of pytorch, or maybe the version of cl. When I use the latest version of pytorch, I face the same problem. But when I experiment with the following version of pytorch, I succeed in SageMaker Studio Lab, whose OS is Linux. However, when I use the same setting on Windows, the problem still exists, so I think it may also be related to the version of cl.
Hi @elephaint and @ACommunist
Many thanks for your support and suggestions. In my case, I have to use CUDA 11, which in turn forces me to stay on the latest compatible version of pytorch.
With respect to "cl", I tried both Visual Studio 2019 and 2022. On both occasions, I got the same error message.
Ivan
@ivan-marroquin in the case of incompatibility issues I'd strongly suggest taking the virtual environment (i.e. conda) route, because then PyTorch will install its own CUDA toolkit version and you can still use your generic Windows CUDA toolkit for other projects. The steps would thus be to install Anaconda, open up an Anaconda shell and execute the following commands:
conda create -n new_env
conda activate new_env
conda install pytorch pytorch-cuda=11.7 -c pytorch -c nvidia
pip install pgbm
After that, it should work (you should install other dependencies to run the examples, e.g. matplotlib, separately).
Also, if you don't like Anaconda, use miniconda, which is much lighter and basically contains everything you need.
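After the steps above, it may help to confirm which CUDA version the installed PyTorch build targets before importing pgbm. This is a minimal sketch, assuming PyTorch may or may not be installed in the active environment. torch_cuda_report is a hypothetical helper name.

```python
import importlib.util


def torch_cuda_report():
    """Return basic PyTorch/CUDA facts, or None if torch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    return {
        "torch_version": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(torch_cuda_report())
```

If built_for_cuda does not match the pytorch-cuda version requested in the conda command, the environment is probably not the one that was just created.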
Hi @elephaint
thanks for the suggestions!
Ivan
@ivan-marroquin Did you get it to work?
Hi @elephaint ,
I have to talk with the developers first. Our development and testing of Python code is set up to use the generic CUDA install, and we do not use Anaconda (or miniconda). Thus, I have to rely on pip for the installation of packages.
Thanks for everything, Ivan
@ivan-marroquin Ok, that's unfortunate. I've created an extension to Scikit-learn's HistGradientBoostingRegressor that enables PGBM too; this would solve your issues. I'll let you know when that becomes available (it's currently filed as a merge request with scikit-learn; ideally it will be part of scikit-learn, but if that is not possible I'll publish the same method through the pgbm package).
Hi @elephaint
That is great news. Many thanks for sharing this info.
Ivan
Hi @elephaint I have the same error on linux
I installed pytorch as you suggested
conda create -n new_env
conda activate new_env
conda install pytorch pytorch-cuda=11.7 -c pytorch -c nvidia
pip install pgbm
Then, I installed pgbm as
pip install pgbm
Finally, I get this error during pgbm import
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1
Describe the bug
I have Python 3.8.10 on a Windows 10 machine. I installed CUDA 11.0. To install pytorch, I used this command:
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
Note that the pytorch installation seems to be fine since the command "torch.cuda.is_available()" returns True.
Then I proceeded with the installation of PGBM using pip.
When I run the command "from pgbm import PGBM", I get the following error messages:
Detected CUDA files, patching ldflags
Emitting ninja build file C:\Users\imarroquin\AppData\Local\torch_extensions\torch_extensions\Cache\py38_cu113\split_decision\build.ninja...
Building extension module split_decision...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/3] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin\bin\nvcc --generate-dependencies-with-compile --dependency-output splitgain_kernel.cuda.o.d -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcompiler /EHsc -Xcompiler /wd4190 -Xcompiler /wd4018 -Xcompiler /wd4275 -Xcompiler /wd4267 -Xcompiler /wd4244 -Xcompiler /wd4251 -Xcompiler /wd4819 -Xcompiler /MD -DTORCH_EXTENSION_NAME=split_decision -DTORCH_API_INCLUDE_EXTENSION_H -IC:\Temp\Python_3.8.10\lib\site-packages\torch\include -IC:\Temp\Python_3.8.10\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Temp\Python_3.8.10\lib\site-packages\torch\include\TH -IC:\Temp\Python_3.8.10\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin\include" -IC:\Temp\Python_3.8.10\Include -D_GLIBCXX_USE_CXX11_ABI=0 -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_BFLOAT16_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr -gencode=arch=compute_52,code=compute_52 -gencode=arch=compute_52,code=sm_52 -c C:\Temp\Python_3.8.10\lib\site-packages\pgbm\splitgain_kernel.cu -o splitgain_kernel.cuda.o
FAILED: splitgain_kernel.cuda.o
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin\bin\nvcc --generate-dependencies-with-compile --dependency-output splitgain_kernel.cuda.o.d -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcompiler /EHsc -Xcompiler /wd4190 -Xcompiler /wd4018 -Xcompiler /wd4275 -Xcompiler /wd4267 -Xcompiler /wd4244 -Xcompiler /wd4251 -Xcompiler /wd4819 -Xcompiler /MD -DTORCH_EXTENSION_NAME=split_decision -DTORCH_API_INCLUDE_EXTENSION_H -IC:\Temp\Python_3.8.10\lib\site-packages\torch\include -IC:\Temp\Python_3.8.10\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Temp\Python_3.8.10\lib\site-packages\torch\include\TH -IC:\Temp\Python_3.8.10\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin\include" -IC:\Temp\Python_3.8.10\Include -D_GLIBCXX_USE_CXX11_ABI=0 -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_BFLOAT16_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr -gencode=arch=compute_52,code=compute_52 -gencode=arch=compute_52,code=sm_52 -c C:\Temp\Python_3.8.10\lib\site-packages\pgbm\splitgain_kernel.cu -o splitgain_kernel.cuda.o
CreateProcess failed: The system cannot find the file specified.
ninja: fatal: ReadFile: The handle is invalid.
Traceback (most recent call last):
File "C:\Temp\Python_3.8.10\lib\site-packages\torch\utils\cpp_extension.py", line 1808, in _run_ninja_build
subprocess.run(
File "C:\Temp\Python_3.8.10\lib\subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "", line 1, in <module>
File "C:\Temp\Python_3.8.10\lib\site-packages\pgbm\__init__.py", line 1, in <module>
from .pgbm import PGBM, PGBMRegressor
File "C:\Temp\Python_3.8.10\lib\site-packages\pgbm\pgbm.py", line 41, in <module>
load(name="split_decision",
File "C:\Temp\Python_3.8.10\lib\site-packages\torch\utils\cpp_extension.py", line 1202, in load
return _jit_compile(
File "C:\Temp\Python_3.8.10\lib\site-packages\torch\utils\cpp_extension.py", line 1425, in _jit_compile
_write_ninja_file_and_build_library(
File "C:\Temp\Python_3.8.10\lib\site-packages\torch\utils\cpp_extension.py", line 1537, in _write_ninja_file_and_build_library
_run_ninja_build(
File "C:\Temp\Python_3.8.10\lib\site-packages\torch\utils\cpp_extension.py", line 1824, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'split_decision'
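One detail in the log above may be worth checking: nvcc is invoked as ...\CUDA\v11.0\bin\bin\nvcc and the CUDA include path is ...\CUDA\v11.0\bin\include, so "bin" appears where the toolkit root would be expected. As far as I know, PyTorch's extension builder derives the nvcc location from the CUDA_HOME or CUDA_PATH environment variable, so one of them may point at the bin folder instead of at ...\CUDA\v11.0. This is an inference from the log, not something confirmed in this thread. A minimal sketch to check it, where cuda_root_looks_wrong is a hypothetical helper name:

```python
import os
from pathlib import PureWindowsPath


def cuda_root_looks_wrong(root):
    """Return True if `root` ends in a `bin` folder rather than the toolkit root."""
    return PureWindowsPath(root).name.lower() == "bin"


if __name__ == "__main__":
    # Assumption: these are the variables the build consults for the CUDA root.
    for var in ("CUDA_HOME", "CUDA_PATH"):
        value = os.environ.get(var)
        if value is None:
            print(f"{var} is not set")
        elif cuda_root_looks_wrong(value):
            print(f"{var}={value} points at a bin folder; its parent is probably intended")
        else:
            print(f"{var}={value}")
```

If either variable is flagged, setting it to the parent directory and re-running the import in a fresh terminal would show whether that was the cause.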
Expected behavior
No error message(s) when importing the PGBM package.