WisconsinAIVision / yolact_edge

The first competitive instance segmentation approach that runs on small edge devices at real-time speeds.
MIT License

Apply float16 quantization on the model #98

Closed mathmax12 closed 3 years ago

mathmax12 commented 3 years ago

Hi, thanks for the amazing work. I am working on Windows 10 with an RTX 2060. TensorRT is installed correctly, but I have some problems installing torch2trt, so I can't run with `--use_fp16_tensorrt`. I found that PyTorch supports converting the model to half precision by simply adding `net.half()`, and I also changed the input to the net to FP16. But I got the issue below from the ReLU part of the `FPN_phase_2` class:

```
File "C:\Users\bigtree\Documents\prototype\segmentation\yolact_edge_simple\yolact.py", line 1703, in forward
  outs = self.fpn_phase_2(outs_phase_1)
File "C:\Users\bigtree\.conda\envs\yolact_edgev01\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
  result = self.forward(*input, **kwargs)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
  File "C:\Users\bigtree\Documents\prototype\segmentation\yolact_edge_simple\yolact.py", line 969, in forward
    for pred_layer in self.pred_layers:
        j -= 1
        out[j] = F.relu(pred_layer(out[j]))
                        ~~~~~~~~~~ <--- HERE

        # In the original paper, this takes care of P6
  File "C:\Users\bigtree\Documents\prototype\segmentation\yolact_edge_simple\yolact.py", line 969, in relu
    for pred_layer in self.pred_layers:
        j -= 1
        out[j] = F.relu(pred_layer(out[j]))
                        ~~~~~~~~~~ <--- HERE

        # In the original paper, this takes care of P6
  File "C:\Users\bigtree\.conda\envs\yolact_edgev01\lib\site-packages\torch\nn\modules\conv.py", line 419, in forward
    def forward(self, input: Tensor) -> Tensor:
        return self._conv_forward(input, self.weight)
               ~~~~~~~~~~~~~~~~~~ <--- HERE
  File "C:\Users\bigtree\.conda\envs\yolact_edgev01\lib\site-packages\torch\nn\modules\conv.py", line 415, in _conv_forward
                            weight, self.bias, self.stride,
                            _pair(0), self.dilation, self.groups)
        return F.conv2d(input, weight, self.bias, self.stride,
               ~~~~~~~~ <--- HERE
                        self.padding, self.dilation, self.groups)
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same

[ WARN:0] global C:\Users\appveyor\AppData\Local\Temp\1\pip-req-build-oduouqig\opencv\modules\videoio\src\cap_msmf.cpp (434) `anonymous-namespace'::SourceReaderCB::~SourceReaderCB terminating async callback
```
Any suggestions will be appreciated.
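For readers hitting the same error: a minimal sketch of the dtype check behind this traceback (illustrative layer and shapes, not yolact_edge's code). `net.half()` converts the parameters to float16, but `Conv2d` requires its input to have the same dtype as its weight, so any float32 tensor that still reaches a converted layer triggers exactly this `RuntimeError`.

```python
import torch
import torch.nn as nn

# Toy stand-in for a converted prediction layer (hypothetical sizes).
conv = nn.Conv2d(4, 4, 3, padding=1)
x = torch.randn(1, 4, 8, 8)      # float32 activation

conv.half()                      # parameters are now torch.float16
mismatched = False
try:
    conv(x)                      # float32 input vs. float16 weight
except RuntimeError:
    mismatched = True            # same check that fails in the traceback

x = x.half()                     # align the input dtype with the weights
```

After the cast, `x.dtype` matches `conv.weight.dtype` (`torch.float16`), so the convolution is accepted; the same alignment has to hold for every tensor entering the half-precision network.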
haotian-liu commented 3 years ago

You need to change this to a half tensor.

mathmax12 commented 3 years ago

@haotian-liu Thank you for the suggestion. After that change, I can run the FP16 model for YOLACT. But when I try to run the FP16 model for yolact_edge (yolact_edge_vid_resnet50_847_50000), inference on the first frame is correct, and it stops when trying to run `net(imgs, extras=extras)`.

Then I got the following issues:

```
Traceback (most recent call last):
  File "eval.py", line 1010, in <module>
    evaluate(net, dataset)
  File "eval.py", line 883, in evaluate
    evalvideo(net, args.video)
  File "eval.py", line 788, in evalvideo
    pred = eval_network(preprocessed)
  File "eval.py", line 725, in eval_network
    net_outs = net(imgs, extras=extras)
  File "C:\Users\bigtree\.conda\envs\yolact_edgev01\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\Users\bigtree\Documents\prototype\segmentation\yolact_edge_simple\yolact.py", line 1676, in forward
    pred_feat = deform_op(feat, flow)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
  File "C:\Users\bigtree\Documents\prototype\segmentation\yolact_edge_simple\layers\warp_utils.py", line 65, in deform_op
    grid = grid.permute(0, 2, 3, 1)
    output = F.grid_sample(x, grid, mode=mode, padding_mode=padding_mode, align_corners=True)
             ~~~~~~~~~~~~~ <--- HERE
    return output
  File "C:\Users\bigtree\.conda\envs\yolact_edgev01\lib\site-packages\torch\nn\functional.py", line 3390, in grid_sample
        align_corners = False

    return torch.grid_sampler(input, grid, mode_enum, padding_mode_enum, align_corners)
           ~~~~~~~~~~~~~~~~~~ <--- HERE
RuntimeError: grid_sampler(): expected input and grid to have same dtype, but input has struct c10::Half and grid has float

[ WARN:0] global C:\Users\appveyor\AppData\Local\Temp\1\pip-req-build-oduouqig\opencv\modules\videoio\src\cap_msmf.cpp (434) `anonymous-namespace'::SourceReaderCB::~SourceReaderCB terminating async callback
```

I guess this may be related to inference on the 2nd frame using the output from the first frame: `moving_statistics["feats"] = net_outs["feats"]` and `moving_statistics["lateral"] = net_outs["lateral"]`. But I am not sure how to fix this. Any suggestions are appreciated.
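That reading of the traceback is consistent with the error message: the cached FP16 feature map from frame 1 meets a float32 sampling grid built from the flow, and `torch.grid_sampler` rejects the pair before sampling. A minimal sketch of the mismatch and the cast that resolves it (illustrative shapes, not the repo's `deform_op`):

```python
import torch
import torch.nn.functional as F

feat = torch.randn(1, 3, 8, 8).half()   # float16, like the cached "feats"
grid = torch.zeros(1, 8, 8, 2)          # float32 grid derived from the flow

mismatched = False
try:
    F.grid_sample(feat, grid, align_corners=True)
except RuntimeError:
    mismatched = True   # "expected input and grid to have same dtype"

grid = grid.to(feat.dtype)              # align dtypes before sampling
```

Casting with `grid.to(feat.dtype)` (rather than an unconditional `.half()`) keeps the warp working when the model runs in FP32 as well; in `warp_utils.py` the cast would go just before the `F.grid_sample` call shown in the traceback.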