Closed · slavakx closed this 5 months ago
Thanks - I'll have a look at it this week. It appears that the CUDA kernel is built incorrectly on newer CUDA versions. I am trying to reproduce this bug.
Maybe this is also relevant:
The pgbm import works only for some Python and CUDA versions. I could import pgbm without errors only with the following configuration under Linux. In all other cases I got "error building extension 'split_decision'".
python 3.9
conda install pytorch==1.12.0 torchvision==0.13.0 cudatoolkit=11.3 -c pytorch
Then I installed the cuda_12.0.0_525.60.13_linux.run driver from the NVIDIA website and configured PATH and LD_LIBRARY_PATH:
export PATH=/usr/local/cuda-12.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda-12.0/lib64
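For reference, the `${VAR:+:${VAR}}` expansion in those exports only inserts the `:` separator when the variable is already non-empty, which avoids a dangling colon. A minimal sketch of the idiom using a stand-in variable name (the CUDA path is taken from the exports above):

```shell
#!/bin/sh
# Sketch: ${VAR:+:${VAR}} expands to ":$VAR" only when VAR is set and
# non-empty, so prepending a new entry never leaves a stray colon
# (which the shell would treat as an implicit "." entry in PATH).

v=""                                            # simulate an unset variable
v="/usr/local/cuda-12.0/lib64${v:+:${v}}"
echo "$v"                                       # just the new entry, no colon

v="/opt/existing"                               # simulate a pre-set variable
v="/usr/local/cuda-12.0/lib64${v:+:${v}}"
echo "$v"                                       # new entry, colon, old value
```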
The same error occurs in the example02_housing_gpu.ipynb notebook.
Hi,
I've been trying to reproduce this, but so far without success, unfortunately. For now, there's a faster version of PGBM available on CPU through a fork of scikit-learn's HistGradientBoostingRegressor (docs). Maybe that can already help you. In the meantime, I'm still trying to reproduce.
Hi, thanks for the information. I've managed to run it on Linux using CUDA 11.1 and 11.3. Trying to replicate the same approach on Windows failed, so currently my problem is running PGBM on Windows.
I still can't reproduce this issue, annoyingly... Could you (i) lay out the steps that led to the error on Windows (which packages you installed, and in which order), and (ii) tell me which versions generated the error?
I have Windows here too, but everything runs fine; I tried different versions of CUDA and all worked without issue. It must be some combination of Torch + CUDA that doesn't work, but I can't find it...
Closing this issue due to inactivity; I hope it was resolved on your end.
Hi,
I get the following error when I run on GPU; when running on CPU, everything works fine. Input dimensions are: train: (72435, 413), target: (72435,).
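Since the CPU path works while the GPU path fails, a pragmatic workaround is to fall back to CPU whenever the CUDA extension is unavailable. A hypothetical sketch (the `select_device` helper and the 'gpu'/'cpu' strings are assumptions; check the PGBM docs for the actual device parameter):

```python
# Hypothetical guard: run on GPU only when CUDA is present AND the
# split_decision extension compiled successfully; otherwise fall back
# to the CPU path, which works fine per the report above.
def select_device(cuda_available: bool, extension_built: bool) -> str:
    if cuda_available and extension_built:
        return "gpu"
    return "cpu"

print(select_device(True, False))  # 'cpu' - extension failed to build
```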