uber-research / UPSNet

UPSNet: A Unified Panoptic Segmentation Network

RuntimeError: CUDA error: no kernel image is available for execution on the device #150

Closed VODKA312 closed 1 year ago

VODKA312 commented 1 year ago

My software and hardware configuration: Ubuntu 22.04, CUDA 11.8, a 4060 Ti graphics card, and Docker with a GPU-enabled container. The Docker image is 'upsnet'; you can find it on Docker Hub.

I tried to run it, but found that my graphics card and OS cannot support PyTorch 1.0.0 or PyTorch 0.4.1, so I used a Docker image to work around this. In the Docker image, the CUDA version is 10, and both torch.cuda.is_available() and nvidia-smi work fine. The 'init.sh' script in UPSNet also ran successfully. But when I try to test the model, many errors are thrown. The detailed information is listed here:

/usr/local/lib/python3.6/dist-packages/torch/nn/_reduction.py:43: UserWarning: size_average and reduce args will be deprecated, please use reduction='none' instead.
2023-09-01 07:40:56,396 | json_dataset.py | line 63 : Creating: cityscapes_val
loading annotations into memory...
Done (t=0.34s)
creating index...
index created!
upsnet/../upsnet/models/resnet.py:285: UserWarning: unexpected key "resnet_backbone.res4.layers.6.bn1.weight" in state_dict
  warnings.warn('unexpected key "{}" in state_dict'.format(name))
....same as before
upsnet/../upsnet/models/resnet.py:285: UserWarning: unexpected key "resnet_backbone.res4.layers.7.bn1.running_var" in state_dict
  warnings.warn('unexpected key "{}" in state_dict'.format(name))
....same as before
upsnet/../upsnet/models/resnet.py:285: UserWarning: unexpected key "resnet_backbone.res4.layers.8.bn3.num_batches_tracked" in state_dict
  warnings.warn('unexpected key "{}" in state_dict'.format(name))
....same as before
upsnet/../upsnet/models/resnet.py:285: UserWarning: unexpected key "resnet_backbone.res4.layers.9.conv1.weight" in state_dict
  warnings.warn('unexpected key "{}" in state_dict'.format(name))
....same as before
upsnet/../upsnet/models/resnet.py:285: UserWarning: unexpected key "resnet_backbone.res4.layers.10.bn1.running_var" in state_dict
  warnings.warn('unexpected key "{}" in state_dict'.format(name))
....same as before
upsnet/../upsnet/models/resnet.py:285: UserWarning: unexpected key "resnet_backbone.res4.layers.11.bn2.bias" in state_dict
  warnings.warn('unexpected key "{}" in state_dict'.format(name))
....same as before
upsnet/../upsnet/models/resnet.py:285: UserWarning: unexpected key "resnet_backbone.res4.layers.12.conv1.weight" in state_dict
  warnings.warn('unexpected key "{}" in state_dict'.format(name))
....same as before
upsnet/../upsnet/models/resnet.py:285: UserWarning: unexpected key "resnet_backbone.res4.layers.13.conv2.weight" in state_dict
  warnings.warn('unexpected key "{}" in state_dict'.format(name))
....same as before
upsnet/../upsnet/models/resnet.py:285: UserWarning: unexpected key "resnet_backbone.res4.layers.14.conv1.weight" in state_dict
  warnings.warn('unexpected key "{}" in state_dict'.format(name))
....same as before
upsnet/../upsnet/models/resnet.py:285: UserWarning: unexpected key "resnet_backbone.res4.layers.15.bn1.running_mean" in state_dict
  warnings.warn('unexpected key "{}" in state_dict'.format(name))
....same as before
upsnet/../upsnet/models/resnet.py:285: UserWarning: unexpected key "resnet_backbone.res4.layers.16.bn1.running_mean" in state_dict
  warnings.warn('unexpected key "{}" in state_dict'.format(name))
....same as before
upsnet/../upsnet/models/resnet.py:285: UserWarning: unexpected key "resnet_backbone.res4.layers.17.conv1.weight" in state_dict
  warnings.warn('unexpected key "{}" in state_dict'.format(name))
....same as before
upsnet/../upsnet/models/resnet.py:285: UserWarning: unexpected key "resnet_backbone.res4.layers.18.bn1.running_mean" in state_dict
  warnings.warn('unexpected key "{}" in state_dict'.format(name))
....same as before
upsnet/../upsnet/models/resnet.py:285: UserWarning: unexpected key "resnet_backbone.res4.layers.19.conv1.weight" in state_dict
  warnings.warn('unexpected key "{}" in state_dict'.format(name))
....same as before
upsnet/../upsnet/models/resnet.py:285: UserWarning: unexpected key "resnet_backbone.res4.layers.20.conv1.weight" in state_dict
  warnings.warn('unexpected key "{}" in state_dict'.format(name))
....same as before
upsnet/../upsnet/models/resnet.py:285: UserWarning: unexpected key "resnet_backbone.res4.layers.21.conv1.weight" in state_dict
  warnings.warn('unexpected key "{}" in state_dict'.format(name))
....same as before
upsnet/../upsnet/models/resnet.py:285: UserWarning: unexpected key "resnet_backbone.res4.layers.22.bn1.weight" in state_dict
  warnings.warn('unexpected key "{}" in state_dict'.format(name))
....same as before
Traceback (most recent call last):
  File "upsnet/upsnet_end2end_test.py", line 316, in <module>
  File "upsnet/upsnet_end2end_test.py", line 247, in upsnet_test
    output = test_model(*batch)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "upsnet/../lib/utils/data_parallel.py", line 110, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "upsnet/../upsnet/models/resnet_upsnet.py", line 90, in forward
    res2, res3, res4, res5 = self.resnet_backbone(data['data'])
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "upsnet/../upsnet/models/resnet.py", line 349, in forward
    conv1 = self.conv1(x).detach() if self.freeze_at == 1 else self.conv1(x)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "upsnet/../upsnet/models/resnet.py", line 173, in forward
    x = self.relu(x)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/activation.py", line 94, in forward
    return F.relu(input, inplace=self.inplace)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 911, in relu
    result = torch.relu_(input)
RuntimeError: CUDA error: no kernel image is available for execution on the device

I just want to know how to solve this problem; I am exhausted by it. I think something might be wrong with my graphics card, the GPU setup, or the torch version.
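To narrow down which of these is at fault, it can help to print what the installed PyTorch build was compiled for next to what the GPU reports, and then launch one tiny kernel. The following is only a minimal sketch: the helper name `collect_gpu_info` is made up for illustration, and it assumes nothing about the 'upsnet' image beyond a standard PyTorch install. A passing `torch.cuda.is_available()` only shows that a driver and device are visible; it does not prove that the build contains kernels for that device, which is why the sketch also runs `relu_` once.

```python
def collect_gpu_info():
    """Return a dict describing the PyTorch build and the first visible GPU.

    Hypothetical helper for illustration; not part of UPSNet or PyTorch.
    """
    info = {"torch_installed": False}
    try:
        import torch
    except ImportError:
        return info

    info["torch_installed"] = True
    info["torch_version"] = torch.__version__
    # CUDA toolkit version the binary was built against (None on CPU-only builds).
    info["built_for_cuda"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    if not info["cuda_available"]:
        return info

    info["device_name"] = torch.cuda.get_device_name(0)
    # (major, minor) tuple reported by the device itself.
    info["compute_capability"] = torch.cuda.get_device_capability(0)

    # Launch a trivial kernel; this is the step that fails with
    # "no kernel image is available" when the build lacks code for the GPU.
    try:
        x = torch.zeros(4, device="cuda")
        torch.relu_(x)
        torch.cuda.synchronize()  # surface asynchronous CUDA errors here
        info["kernel_launch"] = "ok"
    except RuntimeError as err:
        info["kernel_launch"] = "failed: {}".format(err)
    return info


if __name__ == "__main__":
    for key, value in collect_gpu_info().items():
        print("{}: {}".format(key, value))
```

If `built_for_cuda` is older than the first toolkit release that supports the reported `compute_capability`, a failed kernel launch is the expected result rather than a sign of a broken card.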

VODKA312 commented 1 year ago

I am trying to reproduce this model, but I ran into many errors and I just don't know how to figure them out.

VODKA312 commented 1 year ago

Sorry for my stupid question. The cause is that the CUDA/cuDNN version in the image does not match my graphics card. I switched to another graphics card, a 2080 Ti, with CUDA version 10, and it works. So for anyone else who wants to reproduce this code, my suggestion is to use a 20-series or older graphics card.
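The mismatch can be illustrated with the usual CUDA compatibility rule: a binary kernel built for `sm_XY` runs only on devices with the same major version and an equal or higher minor version, while embedded PTX (`compute_XY`) can be JIT-compiled for any device at or above that capability. The sketch below is an assumption-laden illustration, not a readout of the 'upsnet' image: `is_supported` and `parse_arch` are hypothetical helpers, and the architecture list is a made-up example of what a CUDA 10 era build might contain. The compute capabilities themselves are known values: 7.5 for the 2080 Ti and 8.9 for the 4060 Ti.

```python
import re


def parse_arch(arch):
    """Split a string like 'sm_75' or 'compute_75' into (kind, major, minor)."""
    match = re.match(r"(sm|compute)_(\d{2,})", arch)
    if match is None:
        raise ValueError("unrecognised architecture string: {!r}".format(arch))
    kind, digits = match.group(1), match.group(2)
    # The last digit is the minor version, everything before it the major.
    return kind, int(digits[:-1]), int(digits[-1])


def is_supported(capability, arch_list):
    """Return True if a device with this (major, minor) can run the build."""
    dev_major, dev_minor = capability
    for arch in arch_list:
        kind, major, minor = parse_arch(arch)
        # Binary code: same major version, device minor not below build minor.
        if kind == "sm" and major == dev_major and minor <= dev_minor:
            return True
        # PTX: forward compatible with any equal or newer device.
        if kind == "compute" and (major, minor) <= (dev_major, dev_minor):
            return True
    return False


# Hypothetical architecture list for an old CUDA 10 build (assumption).
OLD_BUILD = ["sm_35", "sm_50", "sm_60", "sm_70", "sm_75"]

print(is_supported((7, 5), OLD_BUILD))  # 2080 Ti -> True
print(is_supported((8, 9), OLD_BUILD))  # 4060 Ti -> False
```

Newer PyTorch releases expose the real list through `torch.cuda.get_arch_list()`; an old build such as the one in this thread may not have that function, in which case the list has to come from how the binary was compiled.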