teddykoker / torchsort

Fast, differentiable sorting and ranking in PyTorch
https://pypi.org/project/torchsort/
Apache License 2.0

cuda TypeError: 'NoneType' object is not callable #33

Closed shuiyuejihua closed 2 years ago

shuiyuejihua commented 2 years ago
>>> import torch
>>> import torchsort
>>> x = torch.tensor([[8., 0., 5., 3., 2., 1., 6., 7., 9.]], requires_grad=True).cuda()
>>> y = torchsort.soft_sort(x)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/shuiy/anaconda3/envs/pytorch_py3/lib/python3.7/site-packages/torchsort/ops.py", line 48, in soft_sort
    return SoftSort.apply(values, regularization, regularization_strength)
  File "/home/shuiy/anaconda3/envs/pytorch_py3/lib/python3.7/site-packages/torchsort/ops.py", line 132, in forward
    sol = isotonic_l2[s.device.type](w - s)
TypeError: 'NoneType' object is not callable

In a Jupyter notebook, the traceback is:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_8938/1075883647.py in <module>
      1 x = torch.tensor([[8., 0., 5., 3., 2., 1., 6., 7., 9.]], requires_grad=True).cuda()
----> 2 y = torchsort.soft_sort(x)

~/anaconda3/envs/pytorch_py3/lib/python3.7/site-packages/torchsort/ops.py in soft_sort(values, regularization, regularization_strength)
     46     if regularization not in ["l2", "kl"]:
     47         raise ValueError(f"'regularization' should be a 'l2' or 'kl'")
---> 48     return SoftSort.apply(values, regularization, regularization_strength)
     49 
     50 

~/anaconda3/envs/pytorch_py3/lib/python3.7/site-packages/torchsort/ops.py in forward(ctx, tensor, regularization, regularization_strength)
    130         # note reverse order of args
    131         if ctx.regularization == "l2":
--> 132             sol = isotonic_l2[s.device.type](w - s)
    133         else:
    134             sol = isotonic_kl[s.device.type](w, s)

TypeError: 'NoneType' object is not callable

If x is on the CPU, the code runs fine. Environment: Python 3.7.10, PyTorch 1.9.0, cudatoolkit=11.1, Ubuntu 18.04.

teddykoker commented 2 years ago

Hi, it looks like your current version of torchsort doesn't have the cuda version built. Can you try reinstalling with:

pip install --force-reinstall --no-cache-dir torchsort

shuiyuejihua commented 2 years ago

I tried that. It reinstalled PyTorch and torchsort, and now I get a warning about the RTX 3060, so the problem may be related to the RTX 3060:

>>> import torch
>>> import torchsort
>>> x = torch.tensor([[8., 0., 5., 3., 2., 1., 6., 7., 9.]], requires_grad=True).cuda()
GeForce RTX 3060 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the GeForce RTX 3060 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

  warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))

and then:

>>> y = torchsort.soft_sort(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/shuiy/anaconda3/envs/pytorch_py3/lib/python3.7/site-packages/torchsort/ops.py", line 48, in soft_sort
    return SoftSort.apply(values, regularization, regularization_strength)
  File "/home/shuiy/anaconda3/envs/pytorch_py3/lib/python3.7/site-packages/torchsort/ops.py", line 126, in forward
    w = (_arange_like(tensor, reverse=True) + 1) / regularization_strength
  File "/home/shuiy/anaconda3/envs/pytorch_py3/lib/python3.7/site-packages/torchsort/ops.py", line 66, in _arange_like
    ar = torch.arange(x.shape[1] - 1, -1, -1, dtype=x.dtype, device=x.device)
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

teddykoker commented 2 years ago

From https://pytorch.org/get-started/locally/ it looks like you need to install pytorch with:

conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c nvidia

After that, torchsort should hopefully work once you reinstall it.
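To confirm whether a PyTorch build actually covers the GPU, the capability and architecture list from the warning above can be compared directly. The following is a minimal sketch: parse_sm, is_arch_supported, and check_local_gpu are hypothetical helper names, and the rule used (same major version, equal or lower minor version) is a simplified approximation of CUDA binary compatibility, not PyTorch's exact check. It also ignores PTX entries such as compute_80.

```python
def parse_sm(arch):
    """Turn a string such as 'sm_86' into (major, minor) = (8, 6)."""
    suffix = arch.split("_", 1)[1]
    # Keep digits only, so suffixed names such as 'sm_90a' still parse.
    number = int("".join(ch for ch in suffix if ch.isdigit()))
    return number // 10, number % 10


def is_arch_supported(capability, arch_list):
    """Simplified check: is there a compiled kernel this device can run?"""
    major, minor = capability
    for arch in arch_list:
        if not arch.startswith("sm_"):
            continue  # PTX entries (e.g. 'compute_80') are ignored in this sketch
        arch_major, arch_minor = parse_sm(arch)
        if arch_major == major and arch_minor <= minor:
            return True
    return False


def check_local_gpu():
    """Run the same check against the local PyTorch install and GPU."""
    import torch  # imported lazily so the helpers above work without PyTorch

    if not torch.cuda.is_available():
        return None
    return is_arch_supported(
        torch.cuda.get_device_capability(), torch.cuda.get_arch_list()
    )


# Values copied from the warning above: an sm_86 device, a build without sm_8x.
print(is_arch_supported((8, 6), ["sm_37", "sm_50", "sm_60", "sm_70"]))  # False
```

Calling check_local_gpu() after reinstalling PyTorch should return True on a working setup. If it returns False, torchsort's CUDA path is unlikely to work, since plain PyTorch operations such as torch.arange already fail on the device.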

shuiyuejihua commented 2 years ago

Thanks, but it still doesn't work. I will use the CPU only.

teddykoker commented 2 years ago

What error are you getting now?

shuiyuejihua commented 2 years ago

PyTorch for the RTX 3060 can only be installed with cudatoolkit >= 11 from the nvidia channel, e.g. conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c nvidia. That is how I installed PyTorch the first time. So I guess the problem is an incompatibility with the newer CUDA versions needed for RTX 30xx cards.

si-aspartame commented 2 years ago

In my case, the cause of this issue was that I didn't have NVCC. If you installed cudatoolkit using pip or conda, it doesn't include NVCC. I recommend that anyone hitting this issue additionally install CUDA from NVIDIA and then reinstall torchsort.
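A quick way to check for NVCC before reinstalling is sketched below. It assumes the CUDA extension is compiled at install time, as this thread suggests, and that nvcc is either on PATH or under the bin directory of CUDA_HOME or CUDA_PATH. find_nvcc is a hypothetical helper name, not part of torchsort.

```python
import os
import shutil


def find_nvcc():
    """Return the path to an nvcc executable, or None if none is found."""
    # First look on PATH.
    path = shutil.which("nvcc")
    if path:
        return path
    # Then look under the conventional toolkit root variables.
    for var in ("CUDA_HOME", "CUDA_PATH"):
        root = os.environ.get(var)
        if root:
            candidate = shutil.which("nvcc", path=os.path.join(root, "bin"))
            if candidate:
                return candidate
    return None


nvcc = find_nvcc()
if nvcc:
    print("nvcc found at:", nvcc)
else:
    print("nvcc not found; the CUDA extension probably cannot be compiled")
```

If nvcc is not found, installing the full CUDA toolkit from NVIDIA and then rerunning pip install --force-reinstall --no-cache-dir torchsort is the route suggested above.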

jefferythewind commented 2 years ago

Hi, yes, I have had the same problem running on my GPU.

teddykoker commented 2 years ago

Please also see #35 for conda installation issues

xfcxing commented 2 years ago

I ran into the same issue.

teddykoker commented 2 years ago

I realize now that initializing the functions to None if the function is not available (here) results in a very unhelpful error message when the kernel has not been compiled - I will add proper error handling here to inform the user that the CUDA extension needs to be compiled.
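One way such error handling could look is sketched below. This is not torchsort's actual code: _missing_extension and _cpu_isotonic_l2 are hypothetical stand-ins, and only the isotonic_l2 dictionary keyed by device type mirrors the traceback above. Instead of storing None for an unavailable kernel, the table stores a stub that raises an informative ImportError when called.

```python
def _missing_extension(name):
    """Build a stub that explains the problem instead of failing on None."""

    def _raise(*args, **kwargs):
        raise ImportError(
            f"{name} is not available: the CUDA extension was not compiled. "
            "Make sure nvcc is installed, then reinstall with "
            "'pip install --force-reinstall --no-cache-dir torchsort'."
        )

    return _raise


def _cpu_isotonic_l2(x):
    # Placeholder standing in for the compiled CPU kernel.
    return x


# Pretend the CUDA build is missing, as in the report above.
_cuda_isotonic_l2 = None

isotonic_l2 = {
    "cpu": _cpu_isotonic_l2,
    "cuda": _cuda_isotonic_l2 or _missing_extension("isotonic_l2 (cuda)"),
}

try:
    isotonic_l2["cuda"]([1.0, 2.0])
except ImportError as err:
    print("ImportError:", err)
```

With this pattern the dispatch line isotonic_l2[s.device.type](...) stays unchanged, and a missing kernel produces a message that names the cause rather than "'NoneType' object is not callable".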