Closed NisuSan closed 7 months ago
Thanks for using LightGBM and for the thorough write-up!
As explained in the documentation, you cannot simply run pip install ./python-package in this repo. Building the Python package from source is driven by a shell script.
To build the GPU Python package from GitHub sources, do the following:
git clone --recursive https://github.com/microsoft/LightGBM
cd ./LightGBM
sh build-python.sh install --gpu
But only do that if you need an unreleased version of lightgbm. If you are OK using a released version, install from PyPI. The Windows wheels we distribute already have the OpenCL-based GPU support (not CUDA) compiled in.
pip install lightgbm
For more details, see https://stackoverflow.com/a/77078844/3986677
I tried to run
pip install \
--force-reinstall \
--no-binary lightgbm \
--config-settings=cmake.define.USE_CUDA=ON \
lightgbm
according to https://stackoverflow.com/a/77078844/3986677 and got the error "ERROR: Failed building wheel for lightgbm. ERROR: Could not build wheels for lightgbm, which is required to install pyproject.toml-based projects".
After that I tried a simplified version of the command and just ran pip install lightgbm --config-settings=cmake.define.USE_CUDA=ON
and the package installed fine, but when I set { 'device_type': 'cuda' } in my script, I got the error: "Trial 0 failed with parameters: {'feature_fraction': 0.6} because of the following error: LightGBMError('CUDA Tree Learner was not enabled in this build.\nPlease recompile with CMake option -DUSE_CUDA=1')"
UPD
I tried to install the package from the local repo using sh build-python.sh install --gpu and it works, but only with { 'device_type': 'gpu' }, not "cuda". What exactly is the difference between these two options?
UPD 2
I also tried sh build-python.sh install --cuda and it failed with "CMake build failed ERROR Backend subprocess exited when trying to invoke build_wheel"
got the error "ERROR: Failed building wheel for lightgbm. ERROR: Could not build wheels for lightgbm, which is required to install pyproject.toml-based projects".
That error has many possible causes. I strongly suspect that there were more logs than just that printed, which might help us to help you identify the root cause.
Can you please run this again:
pip install \
--force-reinstall \
--no-binary lightgbm \
--config-settings=cmake.define.USE_CUDA=ON \
lightgbm
And share the full output that's printed?
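If it helps, here is a minimal sketch of one way to capture everything pip prints into a single file that can be attached here. The function names (build_pip_command, run_and_log) and the default log file name are only illustrative assumptions; the pip flags are the ones from the command above.

```python
import subprocess
import sys


def build_pip_command(package="lightgbm", cuda=True):
    """Return the pip command from this thread as an argument list.

    Hypothetical helper: the name and signature are illustrative only.
    """
    cmd = [
        sys.executable, "-m", "pip", "install",
        "--force-reinstall",
        "--no-binary", package,
    ]
    if cuda:
        # Forwarded to the CMake-based build backend as -DUSE_CUDA=ON.
        cmd.append("--config-settings=cmake.define.USE_CUDA=ON")
    cmd.append(package)
    return cmd


def run_and_log(cmd, log_path="lightgbm.log"):
    """Run cmd, write combined stdout and stderr to log_path, return the exit code."""
    with open(log_path, "w", encoding="utf-8", errors="replace") as log:
        proc = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return proc.returncode


# Example usage (not executed here, since it would start a real build):
#   exit_code = run_and_log(build_pip_command())
#   print(f"pip exited with code {exit_code}; see lightgbm.log")
```

Redirecting stderr into stdout keeps compiler errors in the same order as the surrounding build messages, which usually makes the log easier to read.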
What exactly is the difference between these two options?
-DUSE_GPU=ON (or --gpu) = OpenCL-based GPU-accelerated version of LightGBM. Use this for non-NVIDIA GPUs.
-DUSE_CUDA=ON (or --cuda) = CUDA-based GPU-accelerated version of LightGBM. Use this for NVIDIA GPUs.
See https://lightgbm.readthedocs.io/en/latest/Installation-Guide.html#build-cuda-version for more information.
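To make that mapping concrete, here is a small sketch that records which build option goes with which device_type value, based only on what has been said in this thread. The names BUILD_OPTIONS and required_build_flag are hypothetical and are not part of the lightgbm package.

```python
# Illustrative lookup table, assembled from the explanation in this thread.
# These names are assumptions for the example, not a LightGBM API.
BUILD_OPTIONS = {
    "gpu": {"cmake": "-DUSE_GPU=ON", "script": "--gpu", "backend": "OpenCL"},
    "cuda": {"cmake": "-DUSE_CUDA=ON", "script": "--cuda", "backend": "CUDA"},
}


def required_build_flag(device_type):
    """Return the CMake flag a build needs for device_type, or None for CPU."""
    if device_type == "cpu":
        return None  # the default CPU build needs no extra flag
    try:
        return BUILD_OPTIONS[device_type]["cmake"]
    except KeyError:
        raise ValueError(f"unknown device_type: {device_type!r}") from None


for device in ("cpu", "gpu", "cuda"):
    print(device, "->", required_build_flag(device))
```

So a build made with --gpu should only be expected to accept { 'device_type': 'gpu' }, and { 'device_type': 'cuda' } needs a build made with --cuda.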
In case you're new to GitHub... please see https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax for some tips on how to format text here in a way that makes the difference between code, output from code, and your own words clearer.
And share the full output that's printed?
Sure, lightgbm.log
Thank you.
I see compilation errors like this:
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin\nvcc.exe" --use-local-env -ccbin "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.39.33519\bin\HostX64\x64" -x cu -rdc=true -I"C:\Users\Antony\AppData\Local\Temp\pip-install-1rpnm3ee\lightgbm_1ddc53f59bc64e7d810b3ed1e35f19a2\external_libs\eigen" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include" -I"C:\Users\Antony\AppData\Local\Temp\pip-install-1rpnm3ee\lightgbm_1ddc53f59bc64e7d810b3ed1e35f19a2\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include" --keep-dir lightgbm_objs\x64\Release -maxrregcount=0 --machine 64 --compile -cudart static -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_87,code=sm_87 -gencode arch=compute_89,code=sm_89 -gencode arch=compute_90,code=sm_90 -gencode arch=compute_90,code=compute_90 -O3 -lineinfo -Xcompiler="/EHsc -openmp -fPIC -Ob2" -D_WINDOWS -DNDEBUG -DEIGEN_MPL2_ONLY -DEIGEN_DONT_PARALLELIZE -DUSE_SOCKET -DUSE_CUDA -DWIN_HAS_INET_PTON -D"CMAKE_INTDIR=\"Release\"" -D_MBCS -DWIN32 -D_WINDOWS -DNDEBUG -DEIGEN_MPL2_ONLY -DEIGEN_DONT_PARALLELIZE -DUSE_SOCKET -DUSE_CUDA -DWIN_HAS_INET_PTON -D"CMAKE_INTDIR=\"Release\"" -Xcompiler "/EHsc /Wall /nologo /O2 /FS /MD /GR" -Xcompiler "/Fdlightgbm_objs.dir\Release\lightgbm_objs.pdb" -o lightgbm_objs.dir\Release\/src/treelearner/cuda/cuda_best_split_finder.cu.obj "C:\Users\Antony\AppData\Local\Temp\pip-install-1rpnm3ee\lightgbm_1ddc53f59bc64e7d810b3ed1e35f19a2\src\treelearner\cuda\cuda_best_split_finder.cu"
cl : Command line warning D9002: ignoring unknown option '-fPIC' [C:\Users\Antony\AppData\Local\Temp\tmpnc1sf10k\build\lightgbm_objs.vcxproj]
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\crt/host_config.h(104): warning C4668: '__NV_NO_HOST_COMPILER_CHECK' is not defined as a preprocessor macro, replacing with '0' for '#if/#elif'
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.3/include\cuda.h(3180): warning C4668: '__STDC_VERSION__' is not defined as a preprocessor macro, replacing with '0' for '#if/#elif'
C:/Users/Antony/AppData/Local/Temp/pip-install-1rpnm3ee/lightgbm_1ddc53f59bc64e7d810b3ed1e35f19a2/include\LightGBM/utils/common.h(33): warning C4464: relative include path contains '..' [C:\Users\Antony\AppData\Local\Temp\tmpnc1sf10k\build\lightgbm_objs.vcxproj]
C:/Users/Antony/AppData/Local/Temp/pip-install-1rpnm3ee/lightgbm_1ddc53f59bc64e7d810b3ed1e35f19a2/include\LightGBM/utils/common.h(34): warning C4464: relative include path contains '..' [C:\Users\Antony\AppData\Local\Temp\tmpnc1sf10k\build\lightgbm_objs.vcxproj]
cl : Command line warning D9002: ignoring unknown option '-fPIC'
...
C:\Users\Antony\AppData\Local\Temp\pip-install-1rpnm3ee\lightgbm_1ddc53f59bc64e7d810b3ed1e35f19a2\src\treelearner\cuda\cuda_best_split_finder.cu(1937): error : identifier "LightGBM::kMinScore" is undefined in device code
C:\Users\Antony\AppData\Local\Temp\pip-install-1rpnm3ee\lightgbm_1ddc53f59bc64e7d810b3ed1e35f19a2\src\treelearner\cuda\cuda_best_split_finder.cu(1966): error : identifier "LightGBM::kMinScore" is undefined in device code
... dozens more like that ...
Error limit reached.
100 errors detected in the compilation of "C:/Users/Antony/AppData/Local/Temp/pip-install-1rpnm3ee/lightgbm_1ddc53f59bc64e7d810b3ed1e35f19a2/src/treelearner/cuda/cuda_best_split_finder.cu".
Compilation terminated.
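For logs this long, a rough sketch like the following can help separate the actual errors from the warnings. It assumes diagnostics follow the "file(line): error : message" shape seen in the excerpt above; the regular expression and the summarize_errors name are illustrative and may need adjusting for other compilers.

```python
import re
from collections import Counter

# Matches diagnostics shaped like the ones in the log excerpt, e.g.
#   path\file.cu(1937): error : identifier "..." is undefined in device code
# Warnings ("warning C4668: ...") are deliberately not matched.
ERROR_RE = re.compile(r"^(?P<file>.+?)\((?P<line>\d+)\):\s*error\s*:\s*(?P<msg>.*)$")


def summarize_errors(log_lines):
    """Count how often each distinct error message appears in log_lines."""
    counts = Counter()
    for raw in log_lines:
        match = ERROR_RE.match(raw.strip())
        if match:
            counts[match.group("msg")] += 1
    return counts


# Example usage with a saved log file (the file name is an assumption):
#   with open("lightgbm.log", encoding="utf-8", errors="replace") as f:
#       for message, n in summarize_errors(f).most_common():
#           print(n, message)
```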
So just to confirm... that output came from running precisely this command, with no other customizations?
pip install \
--force-reinstall \
--no-binary lightgbm \
--config-settings=cmake.define.USE_CUDA=ON \
lightgbm
the preproduction repo and describe the steps I did
The error message you're reporting there is this:
"[LightGBM] [Fatal] CUDA Tree Learner was not enabled in this build. Please recompile with CMake option -DUSE_CUDA=1"
And you did not compile the library with -DUSE_CUDA=1.
cmake -DUSE_GPU=1 -DOpenCL_LIBRARY=/usr/local/cuda/lib64/libOpenCL.so -DOpenCL_INCLUDE_DIR=/usr/local/cuda/include/ ..
If you want to use {"device": "cuda"}, you have to compile the library with -DUSE_CUDA=1, exactly as that message says.
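If you are not sure which device types your installed build supports, one option is to probe them at runtime. The sketch below is only an illustration under stated assumptions: first_working_device is a hypothetical helper, try_fit is any callable you supply that trains a tiny model with the given parameters, and it assumes an unsupported device surfaces as an exception (as the LightGBMError in this thread did).

```python
def first_working_device(try_fit, candidates=("cuda", "gpu", "cpu")):
    """Return (device, failures) for the first device_type where try_fit succeeds.

    try_fit is called with a params dict such as {"device_type": "cuda"} and
    should raise if the installed build cannot train on that device.
    """
    failures = {}
    for device in candidates:
        try:
            try_fit({"device_type": device})
        except Exception as err:  # e.g. LightGBMError for a build without CUDA
            failures[device] = str(err)
            continue
        return device, failures
    raise RuntimeError(f"no usable device among {candidates}: {failures}")


# Example usage with lightgbm (assumes lightgbm and numpy are installed):
#   import lightgbm as lgb
#   import numpy as np
#   X, y = np.random.rand(200, 5), np.random.rand(200)
#   fit = lambda p: lgb.train({**p, "verbose": -1}, lgb.Dataset(X, label=y), num_boost_round=1)
#   device, failures = first_working_device(fit)
```

The collected failure messages are worth keeping, since they say which CMake option the build was missing.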
So just to confirm... that output came from running precisely this command, with no other customizations?
Yes, no customizations.
And you did not compile the library with -DUSE_CUDA=1
Oh, I see now.
@jameslamb, I finally compiled the module for CUDA, but now I get this error:
LightGBMError: Check failed: (split_indices_block_size_data_partition) > (0) at /usr/local/src/lightgbm/LightGBM/lightgbm-python/src/treelearner/cuda/cuda_data_partition.cpp, line 280 .
I could not find any information about this error by searching online.
What specific command(s) did you run or other actions did you take to fix the compilation errors?
What specific command(s) did you run or other actions did you take to fix the compilation errors?
I did it for Docker, not Windows. I replaced -DUSE_GPU=1 with -DUSE_CUDA=1, and ./build-python.sh install --precompile with ./build-python.sh install --cuda.
Ok. Well it looks like you've opened another issue for the new error message you're reporting (#6329), and the documentation here does explicitly say that Windows support for the CUDA interface is not currently available:
Note: only Linux is supported, other operating systems are not supported yet.
ref: https://lightgbm.readthedocs.io/en/latest/Installation-Guide.html#build-cuda-version
So as it seems you're not interested in continuing to help with identifying the root cause of these issues on Windows, we'll close this.
Issue Description: I encountered difficulties while attempting to install the LightGBM GPU (master branch) Python package on Windows. Despite successfully compiling the GPU version and obtaining the necessary .dll and .exe files in the Release folder, I faced several obstacles during the installation process using the command pip install ./python-package.
Steps to Reproduce:
1. Compile the LightGBM GPU (master branch) version on Windows.
2. Check that the Release folder contains the .dll and .exe files.
3. Execute the command pip install ./python-package from the root folder (LightGBM).
Expected Behavior: The Python package installation process should proceed smoothly without any errors.
Actual Behavior: Encountered errors during the installation process:
1. Initially failed to locate the LICENSE file within the 'python-package' folder. Manually creating the LICENSE file resolved this issue.
2. Subsequently, encountered an error indicating the absence of the 'CMakeLists.txt' file.
Additional Information:
OS: Windows
Compiler: cmake
Python version: 3.11.5
LightGBM version: folder cloned from master branch using git clone --recursive https://github.com/microsoft/LightGBM
Proposed Solution: Investigate and resolve the issues preventing successful installation of the LightGBM GPU Python package on Windows. This may involve ensuring all required files are present and addressing any potential compatibility issues.
Thank you for your attention to this matter. If further information or logs are required, please let me know.