princeton-vl / Coupled-Iterative-Refinement

MIT License

RuntimeError: CUDA error: no kernel image is available for execution on the device #6

Open kooshyarkosari opened 1 year ago

kooshyarkosari commented 1 year ago

Dear Author, thank you very much for your excellent and amazing work. I tried to run the demo file but got the following error:

Screenshot (78)

Configuration:

1) nvidia/cuda:11.3.0-devel-ubuntu20.04 (Docker container)
2) torch-1.8.0+cu111

Screenshot (80)

lahavlipson commented 1 year ago

Did you manage to resolve the issue? If not, have you tried reinstalling the conda environment? I've seen this message when reusing pre-built binaries after I've updated my cuda version.
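Since the suggestion above concerns pre-built binaries that no longer match the installed CUDA version, one quick check is to read the CUDA version a PyTorch wheel was built for from its version string (for example the `+cu111` suffix in the reported configuration). The sketch below is only an illustration; the helper name `cuda_version_from_wheel_tag` is hypothetical and not part of PyTorch. Inside a working Python session, `torch.version.cuda` reports the same information directly.

```python
import re
from typing import Optional


def cuda_version_from_wheel_tag(version: str) -> Optional[str]:
    """Extract the CUDA version from a version string like '1.8.0+cu111'.

    Hypothetical helper for illustration. Returns e.g. '11.1', or None when
    the string has no '+cuXYZ' suffix (CPU-only or untagged builds).
    """
    match = re.search(r"\+cu(\d{2,})$", version)
    if match is None:
        return None
    digits = match.group(1)
    # The last digit is the minor version: cu111 -> 11.1, cu102 -> 10.2.
    return f"{digits[:-1]}.{digits[-1]}"


if __name__ == "__main__":
    for tag in ("1.8.0+cu111", "1.12.0+cu116", "1.8.0+cpu"):
        print(tag, "->", cuda_version_from_wheel_tag(tag))
```

If the reported version differs from the toolkit used to compile extensions in the environment, rebuilding those extensions in a fresh environment is a reasonable first step.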

kooshyarkosari commented 1 year ago

Hi,

Not yet, unfortunately. I have tried different Ubuntu versions as well as different CUDA versions, and I have reinstalled the conda environment, but the issue remains.

lahavlipson commented 1 year ago

I was able to reproduce the problem on a Tesla K80, but I wasn't able to find a solution unfortunately.

It looks like lietorch needs pytorch>=1.7, but this pytorch version can cause the aforementioned issue on this particular graphics card.

This issue doesn't seem to happen on a GTX-1080 or any newer cards. I'll keep looking and update this thread if I find a solution.
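The "no kernel image is available" message generally means that the binary being run contains no compiled kernels for the GPU's architecture. A Tesla K80 has compute capability 3.7, while a GTX 1080 has 6.1. The following is a minimal sketch for comparing the device against the architectures a PyTorch build was compiled for. The helper `arch_supported` is hypothetical; it assumes that a compiled `sm_XY` kernel runs on a device with the same major version and an equal or higher minor version, and it ignores PTX-only (`compute_XY`) entries. It only inspects PyTorch's own build, so a separately compiled extension such as lietorch could still lack a matching kernel.

```python
import re
from typing import List, Tuple


def arch_supported(capability: Tuple[int, int], arch_list: List[str]) -> bool:
    """Return True if some 'sm_XY' entry can run on a device of `capability`.

    Hypothetical helper: assumes same major version and entry minor <= device
    minor. Entries that are not plain 'sm_XY' (e.g. 'compute_80') are skipped.
    """
    major, minor = capability
    for arch in arch_list:
        match = re.fullmatch(r"sm_(\d+)(\d)", arch)
        if match is None:
            continue
        arch_major, arch_minor = int(match.group(1)), int(match.group(2))
        if arch_major == major and arch_minor <= minor:
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)
        archs = torch.cuda.get_arch_list()
        print("device capability:", cap)
        print("architectures in this PyTorch build:", archs)
        print("matching kernel found:", arch_supported(cap, archs))
    else:
        print("PyTorch with CUDA is not available in this environment.")
```

If no matching architecture is listed, the usual options are a build that includes the device's architecture or different hardware; neither is confirmed in this thread as a fix for the K80.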

kooshyarkosari commented 1 year ago

OK, thank you very much.

dudulry commented 1 year ago

Do you have a solution?

kooshyarkosari commented 1 year ago

Not yet, unfortunately.

dudulry commented 1 year ago

> Not yet, unfortunately.

Have you used APEX? I encountered this error after using APEX. I fixed it by creating a new environment with an RTX 3090, CUDA 11.6, Python 3.7, and Torch 1.12. Now it works.

kooshyarkosari commented 1 year ago

No, since I only have access to a Tesla K80 GPU.

lin-fangzhou commented 1 year ago

Excuse me, has anyone encountered this problem?

(cir) bimlab@bimlab-server:~/pporzz/Coupled-Iterative-Refinement$ python demo.py --obj_models lmo --scene_dir /home/bimlab/pporzz/Coupled-Iterative-Refinement --load_weights model_weights/refiner/ycbv_rgbd.pth

/home/bimlab/miniconda3/envs/cir/lib/python3.8/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: '/home/bimlab/miniconda3/envs/cir/lib/python3.8/site-packages/torchvision/image.so: undefined symbol: _ZN3c104warnERKNS_7WarningE'If you don't plan on using image functionality from torchvision.io, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have libjpeg or libpng installed before building torchvision from source?
  warn(
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)
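
An `undefined symbol` warning from `torchvision/image.so` like the one in this output is commonly seen when the installed torchvision was built against a different PyTorch release than the one in the environment. Whether that is related to the `std::bad_alloc` abort here is not established in this thread. As a hedged sketch, the installed versions can be compared against the release pairings for PyTorch 1.8 through 1.12; the helper names and the `EXPECTED_TORCHVISION` table are introduced here for illustration only.

```python
from importlib import metadata
from typing import Optional, Tuple

# Release pairings (torch major.minor -> torchvision major.minor) for 1.8 to 1.12.
EXPECTED_TORCHVISION = {
    (1, 8): (0, 9),
    (1, 9): (0, 10),
    (1, 10): (0, 11),
    (1, 11): (0, 12),
    (1, 12): (0, 13),
}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_minor(version: str) -> Tuple[int, int]:
    """Parse '1.12.0+cu116' into (1, 12), dropping any local '+...' suffix."""
    base = version.split("+", 1)[0]
    major, minor = base.split(".")[:2]
    return int(major), int(minor)


def versions_match(torch_version: str, torchvision_version: str) -> Optional[bool]:
    """True/False for releases in the table, None when torch is outside it."""
    expected = EXPECTED_TORCHVISION.get(major_minor(torch_version))
    if expected is None:
        return None
    return major_minor(torchvision_version) == expected


if __name__ == "__main__":
    torch_v = installed_version("torch")
    vision_v = installed_version("torchvision")
    print("torch:", torch_v, "torchvision:", vision_v)
    if torch_v and vision_v:
        print("pairing matches table:", versions_match(torch_v, vision_v))
```

Reading versions through `importlib.metadata` avoids importing torchvision itself, so the check still runs in an environment where the import triggers the warning.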