custom training for yolact to onnx conversion

saisubramani commented 4 years ago

@Ma-Dan Thanks for the script. I am trying to convert the YOLACT model to an ONNX model. I followed your script and set up all the dependencies as you mentioned, but when I started training on my custom dataset it threw an error:

/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py:134: UserWarning:
Found GPU0 GRID K520 which is of cuda capability 3.0.
PyTorch no longer supports this GPU because it is too old.
The minimum cuda capability that we support is 3.5.

warnings.warn(old_gpu_warn % (d, name, major, capability[1]))
/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py:134: UserWarning:
Found GPU1 GRID K520 which is of cuda capability 3.0.
PyTorch no longer supports this GPU because it is too old.
The minimum cuda capability that we support is 3.5.

warnings.warn(old_gpu_warn % (d, name, major, capability[1]))
/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py:134: UserWarning:
Found GPU2 GRID K520 which is of cuda capability 3.0.
PyTorch no longer supports this GPU because it is too old.
The minimum cuda capability that we support is 3.5.

warnings.warn(old_gpu_warn % (d, name, major, capability[1]))
/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py:134: UserWarning:
Found GPU3 GRID K520 which is of cuda capability 3.0.
PyTorch no longer supports this GPU because it is too old.
The minimum cuda capability that we support is 3.5.

warnings.warn(old_gpu_warn % (d, name, major, capability[1]))
Traceback (most recent call last):
  File "train.py", line 382, in <module>
    train()
  File "train.py", line 143, in train
    yolact_net = Yolact()
  File "/home/ubuntu/efs_model/models/YOLACT/Modified_Yolact/yolact.py", line 395, in __init__
    self.backbone = construct_backbone(cfg.backbone)
  File "/home/ubuntu/efs_model/models/YOLACT/Modified_Yolact/backbone.py", line 437, in construct_backbone
    backbone = cfg.type(*cfg.args)
  File "/home/ubuntu/efs_model/models/YOLACT/Modified_Yolact/backbone.py", line 64, in __init__
    self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 332, in __init__
    False, _pair(0), groups, bias, padding_mode)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 46, in __init__
    self.reset_parameters()
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 49, in reset_parameters
    init.kaiming_uniform_(self.weight, a=math.sqrt(5))
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/nn/init.py", line 315, in kaiming_uniform_
    return tensor.uniform_(-bound, bound)
RuntimeError: CUDA error: no kernel image is available for execution on the device

Has anyone faced this issue? Any suggestions for this error would be appreciated. @dbolya @abhigoku10
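
A note on the log above: the warning reports CUDA capability 3.0 for the GRID K520 cards while the installed PyTorch build states a minimum of 3.5, which is consistent with the "no kernel image is available" error raised when the first layer's weights are initialized. A minimal sketch for checking this before starting training follows. The helper names (`unsupported_gpus`, `pick_device`, `detect_capabilities`) are hypothetical and not part of YOLACT or PyTorch; the only PyTorch calls used are `torch.cuda.device_count` and `torch.cuda.get_device_capability`.

```python
MIN_CAPABILITY = (3, 5)  # minimum stated in the UserWarning above


def unsupported_gpus(capabilities, minimum=MIN_CAPABILITY):
    """Return indices of GPUs whose (major, minor) capability is below the minimum."""
    return [i for i, cap in enumerate(capabilities) if tuple(cap) < tuple(minimum)]


def pick_device(capabilities, minimum=MIN_CAPABILITY):
    """Return the first usable 'cuda:N' device string, or 'cpu' if none qualifies."""
    for i, cap in enumerate(capabilities):
        if tuple(cap) >= tuple(minimum):
            return f"cuda:{i}"
    return "cpu"


def detect_capabilities():
    """Query the local GPUs through PyTorch (assumes torch is installed)."""
    import torch  # imported lazily so the helpers above work without torch

    return [
        torch.cuda.get_device_capability(i)
        for i in range(torch.cuda.device_count())
    ]


# The four GRID K520 cards from the log, entered by hand for illustration.
k520_cards = [(3, 0)] * 4
print(unsupported_gpus(k520_cards))
print(pick_device(k520_cards))
```

With the four cards from the log, no GPU qualifies and the sketch falls back to "cpu". Whether train.py can actually run on that device depends on its own CUDA options, which are not covered here.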

abhigoku10 commented 4 years ago

@saisubramani Sorry for the delay in responding. No, I did not face this issue! I shall re-check.

saisubramani commented 4 years ago

> @saisubramani sorry for delay in response , no i did not face this issue!!i shall re-check

@abhigoku10 Hi, @Ma-Dan replied and said that we can use the source code to start training. After getting the model file in .pth format, I want to convert it into an ONNX model. I ran custom training and got the .pth file, and now I am trying to convert it to ONNX using the code @Ma-Dan suggested. I am using the eval.py script, and it shows an error when I run it. Can you help me sort out this error?

/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py:134: UserWarning:
Found GPU0 GRID K520 which is of cuda capability 3.0.
PyTorch no longer supports this GPU because it is too old.
The minimum cuda capability that we support is 3.5.

warnings.warn(old_gpu_warn % (d, name, major, capability[1]))
/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py:134: UserWarning:
Found GPU1 GRID K520 which is of cuda capability 3.0.
PyTorch no longer supports this GPU because it is too old.
The minimum cuda capability that we support is 3.5.

warnings.warn(old_gpu_warn % (d, name, major, capability[1]))
/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py:134: UserWarning:
Found GPU2 GRID K520 which is of cuda capability 3.0.
PyTorch no longer supports this GPU because it is too old.
The minimum cuda capability that we support is 3.5.

warnings.warn(old_gpu_warn % (d, name, major, capability[1]))
/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py:134: UserWarning:
Found GPU3 GRID K520 which is of cuda capability 3.0.
PyTorch no longer supports this GPU because it is too old.
The minimum cuda capability that we support is 3.5.

warnings.warn(old_gpu_warn % (d, name, major, capability[1]))
Traceback (most recent call last):
  File "train.py", line 382, in <module>
    train()
  File "train.py", line 143, in train
    yolact_net = Yolact()
  File "/home/ubuntu/efs_model/models/YOLACT/Modified_Yolact/yolact.py", line 395, in __init__
    self.backbone = construct_backbone(cfg.backbone)
  File "/home/ubuntu/efs_model/models/YOLACT/Modified_Yolact/backbone.py", line 437, in construct_backbone
    backbone = cfg.type(*cfg.args)
  File "/home/ubuntu/efs_model/models/YOLACT/Modified_Yolact/backbone.py", line 64, in __init__
    self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 332, in __init__
    False, _pair(0), groups, bias, padding_mode)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 46, in __init__
    self.reset_parameters()
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 49, in reset_parameters
    init.kaiming_uniform_(self.weight, a=math.sqrt(5))
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/nn/init.py", line 315, in kaiming_uniform_
    return tensor.uniform_(-bound, bound)
RuntimeError: CUDA error: no kernel image is available for execution on the device

Modifications made by me:

Hi, I found that you are using the eval.py script to convert the YOLACT model to an ONNX model. I have a doubt about this line:

pred_outs = net(batch)

This gives a list of size 1, so how are you using these indices in the following call?

preds = detect({'loc': pred_outs[0], 'conf': pred_outs[1], 'mask': pred_outs[2], 'priors': pred_outs[3], 'proto': pred_outs[4]})

It shows:

IndexError: list index out of range
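
The IndexError is what Python raises when a one-element list is indexed at position 1 or higher. Before indexing, it can help to print the nesting of whatever `net(batch)` returned. The following is a minimal sketch: `summarize` is a hypothetical helper, and the `pred_outs` value is a hand-written placeholder shaped like the description in this post (a list of size 1 holding a dictionary with a 'detection' entry), not real YOLACT output.

```python
def summarize(obj):
    """Describe nested lists/tuples/dicts by key and leaf type, without printing values."""
    if isinstance(obj, dict):
        return {key: summarize(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [summarize(item) for item in obj]
    return type(obj).__name__


# Placeholder standing in for the result of net(batch); the values are dummies.
pred_outs = [{"detection": {"mask": None, "score": None}}]

print(len(pred_outs))        # number of top-level entries
print(summarize(pred_outs))  # nested keys and leaf types

try:
    pred_outs[1]  # the list has only index 0
except IndexError as exc:
    error_text = str(exc)
    print(error_text)
```

Running the same helper on the real `pred_outs` would show which keys are actually available before any of them is passed on to `detect`.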

So I added a few lines:

pred_outs = dict(pred_outs[0])
pred_outs = pred_outs['detection']

Now it is a dictionary, so I can use its keys to get the detection values. But when I checked the detection dictionary, it has these keys:

('mask','class','score','proto','net')

Which values should I assign to

pred_outs[0], pred_outs[1], pred_outs[2], pred_outs[3], pred_outs[4]

In my understanding, 'conf' means score, 'mask' means mask, and 'proto' means proto. What about 'loc' and 'priors'?

preds = detect({'loc': pred_outs[0], 'conf': pred_outs[1], 'mask':pred_outs[2], 'priors': pred_outs[3], 'proto': pred_outs[4]})

I tried this:

preds = detect({'loc': pred_outs['box'], 'conf': pred_outs['score'], 'mask': pred_outs['mask'], 'priors': pred_outs['class'], 'proto': pred_outs['proto']})

It shows this error:

TypeError: __call__() missing 1 required positional argument: 'net'

Can you help me sort out this issue? If I am wrong, please tell me. Thanks for the reply @abhigoku10
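
The TypeError says the callable received one argument while its `__call__` also requires a positional argument named `net`. The sketch below reproduces the same kind of message with a stand-in class. `FakeDetect` and its body are hypothetical: they only mirror the two-argument signature implied by the error message and are not the real YOLACT `Detect` implementation.

```python
class FakeDetect:
    """Stand-in whose __call__ takes two positional arguments, as the error implies."""

    def __call__(self, predictions, net):
        # Hypothetical body: just report which keys were passed in.
        return sorted(predictions)


detect = FakeDetect()
preds_in = {"loc": None, "conf": None, "mask": None, "priors": None, "proto": None}

try:
    detect(preds_in)  # one argument only, as in the call from the post
except TypeError as exc:
    message = str(exc)
    print(message)

net = object()  # placeholder for the network instance
result = detect(preds_in, net)  # supplying the second argument avoids the TypeError
print(result)
```

What the real `detect` does with `net`, and which tensors belong under 'loc' and 'priors', would have to be checked against the detection code in the repository being used.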

dzyjjpy commented 4 years ago

@saisubramani Have you solved the issue? I have a similar issue when converting ONNX to Core ML for YOLACT (Ma-Dan's coreml branch).

saisubramani commented 4 years ago

> @saisubramani have you solved the issue? I have similar issue when converting onnx to coreml for yolact(madan's coreml branch)

Hi @dzyjjpy, can I know what the error is? I didn't solve that one, but I want to work on it. Where are you facing the error? Give the details so that I can check it!

dzyjjpy commented 4 years ago

https://github.com/dbolya/yolact/issues/374 @saisubramani

dbolya / yolact

custom training for yolact to onnx conversion #362