open-mmlab / mmaction2

OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
https://mmaction2.readthedocs.io
Apache License 2.0

Error in converting SlowFast model to onnx using tools/pytorch2onnx.py #756

Open arvindchandel opened 3 years ago

arvindchandel commented 3 years ago

Getting an error like the one below while executing the conversion script:

```
python3 tools/pytorch2onnx.py /home/dev3/Documents/new_mmaction/mmaction2/work_dirs/ava/slowfast_kinetics_pretrained_r50_4x16x1_20e_ava_rgb_new-data_e20-2class/custom_slowfast.py /home/dev3/Documents/new_mmaction/mmaction2/work_dirs/ava/slowfast_kinetics_pretrained_r50_4x16x1_20e_ava_rgb_new-data_e20-2class/best_mAP@0.5IOU_epoch_1.pth --is-localizer
```

```
/home/dev3/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/functional.py:3103: UserWarning: The default behavior for interpolate/upsample with float scale_factor changed in 1.6.0 to align with other frameworks/libraries, and now uses scale_factor directly, instead of relying on the computed output size. If you wish to restore the old behavior, please set recompute_scale_factor=True. See the documentation of nn.Upsample for details.
  warnings.warn("The default behavior for interpolate/upsample with float scale_factor changed "
/home/dev3/mmdetection/mmdet/core/bbox/transforms.py:70: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if bboxes.size(0) > 0:
Traceback (most recent call last):
  File "tools/pytorch2onnx.py", line 163, in <module>
    verify=args.verify)
  File "tools/pytorch2onnx.py", line 74, in pytorch2onnx
    opset_version=opset_version)
  File "/home/dev3/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/onnx/__init__.py", line 230, in export
    custom_opsets, enable_onnx_checker, use_external_data_format)
  File "/home/dev3/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/onnx/utils.py", line 91, in export
    use_external_data_format=use_external_data_format)
  File "/home/dev3/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/onnx/utils.py", line 639, in _export
    dynamic_axes=dynamic_axes)
  File "/home/dev3/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/onnx/utils.py", line 411, in _model_to_graph
    use_new_jit_passes)
  File "/home/dev3/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/onnx/utils.py", line 379, in _create_jit_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args)
  File "/home/dev3/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/onnx/utils.py", line 342, in _trace_and_get_graph_from_model
    torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
  File "/home/dev3/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/jit/_trace.py", line 1148, in _get_trace_graph
    outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
  File "/home/dev3/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/dev3/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/jit/_trace.py", line 130, in forward
    self._force_outplace,
  File "/home/dev3/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/jit/_trace.py", line 116, in wrapper
    outs.append(self.inner(*trace_inputs))
  File "/home/dev3/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 725, in _call_impl
    result = self._slow_forward(*input, **kwargs)
  File "/home/dev3/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 709, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/home/dev3/mmdetection/mmdet/models/detectors/two_stage.py", line 101, in forward_dummy
    roi_outs = self.roi_head.forward_dummy(x, proposals)
  File "/home/dev3/mmdetection/mmdet/models/roi_heads/standard_roi_head.py", line 60, in forward_dummy
    bbox_results = self._bbox_forward(x, rois)
TypeError: _bbox_forward() missing 1 required positional argument: 'img_metas'
```

irvingzhang0512 commented 3 years ago

I don't think pytorch2onnx.py supports spatio-temporal action detection models.

arvindchandel commented 3 years ago

@irvingzhang0512 thanks for the quick response. If pytorch2onnx.py doesn't support it, is there any other alternative to convert the model to ONNX or TensorRT to optimize it?

irvingzhang0512 commented 3 years ago

> @irvingzhang0512 thanks for the quick response. If pytorch2onnx.py doesn't support it, is there any other alternative to convert the model to ONNX or TensorRT to optimize it?

Since spatio-temporal action detection models use RoIAlign or RoIPool, it's not easy to convert these PyTorch models to ONNX/TensorRT. You need to build specific plugins to support RoIAlign/RoIPool (and other unsupported ops).
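For the ONNX Runtime side, the "plugin" part looks roughly like the minimal sketch below. It assumes an mmcv build that was compiled with its ONNX Runtime custom ops (RoIAlign is among them); `model.onnx` is only a placeholder for a model exported with the matching custom symbolics, not something this issue has produced.

```python
# Minimal sketch: register mmcv's ONNX Runtime custom ops into an inference session.
# Assumes mmcv was built with ONNX Runtime op support; 'model.onnx' is a placeholder.
import onnxruntime as ort
from mmcv.ops import get_onnxruntime_op_path

ort_custom_op_path = get_onnxruntime_op_path()  # path to the compiled mmcv op library

session_options = ort.SessionOptions()
if ort_custom_op_path:
    # makes mmcv's custom ops (e.g. its RoIAlign kernel) resolvable at runtime
    session_options.register_custom_ops_library(ort_custom_op_path)

sess = ort.InferenceSession('model.onnx', session_options)
```

For TensorRT the idea is the same, but the unsupported ops have to be provided as TensorRT plugins instead.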

ZJU-lishuang commented 3 years ago

https://github.com/open-mmlab/mmcv/blob/48d990258549ca626fcf8c34488c00ed6fce108a/tests/test_ops/test_onnx.py#L150

arvindchandel commented 3 years ago

@ZJU-lishuang Can you explain a little bit about the link you gave above? How is it helpful in converting the SlowFast spatio-temporal model to ONNX?

ZJU-lishuang commented 3 years ago

In mmcv there is RoIAlign code for CUDA, ONNX and PyTorch. As @irvingzhang0512 said, you can build specific plugins to support RoIAlign, and the link points to the test code for RoIAlign's ONNX export. You can refer to it to convert the model to ONNX without writing the code again.
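Roughly, the linked test wraps the op in a small module and traces it with torch.onnx.export. A minimal sketch along those lines is shown below; the shapes, spatial scale and output size are illustrative values, not the real SlowFast feature maps, and it needs an mmcv version whose roi_align has an ONNX symbolic registered (which is what the linked test exercises).

```python
# Sketch only: export mmcv's roi_align as a standalone ONNX graph,
# loosely following the linked mmcv test. All values are examples.
import torch
from mmcv.ops import roi_align


class RoIAlignWrapper(torch.nn.Module):
    def forward(self, feat, rois):
        # output_size=(7, 7), spatial_scale=1/16, sampling_ratio=0 are illustrative
        return roi_align(feat, rois, (7, 7), 1.0 / 16, 0, 'avg', True)


feat = torch.randn(1, 256, 32, 32)             # dummy feature map (N, C, H, W)
rois = torch.tensor([[0., 4., 4., 20., 20.]])  # [batch_idx, x1, y1, x2, y2]

torch.onnx.export(
    RoIAlignWrapper(), (feat, rois), 'roi_align.onnx',
    input_names=['feat', 'rois'], output_names=['out'],
    opset_version=11)
```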

arvindchandel commented 3 years ago

@ZJU-lishuang sorry to bring you back to this issue, but I am confused about how to use the RoIAlign code provided in the link together with the pytorch2onnx.py file to get it working for the SlowFast spatio-temporal model.

arvindchandel commented 3 years ago

@innerlee @ZJU-lishuang any input on how to use the RoIAlign code with pytorch2onnx.py for the conversion?

innerlee commented 3 years ago

For RoIAlign, I suggest consulting the mmdet team. They have some onnx experts.

Crush-yq commented 3 years ago

@arvindchandel Hello. Have you solved this problem? Did you successfully convert SlowFast to ONNX?

fengyaoluo commented 3 years ago

I am trying to export SlowFast to ONNX too but am seeing the same error. Would you please let me know if you figure it out? Thank you!

arvindchandel commented 3 years ago

Hi @fengyaoluo @Crush-yq, no, I could not find a solution for it. Please post here if you find a workaround.

fengyaoluo commented 3 years ago

@arvindchandel Thank you for responding to me! As an alternative solution, our team ended up using a ResNet18 trained on COCO to get the skeleton and then an LSTM trained on the NTU dataset for action recognition. In that sense, we are not using mmaction anymore, but we definitely hope they add ONNX export for these models in the future.

arvindchandel commented 3 years ago

@fengyaoluo OK, sure. I tried the skeleton-based approach earlier, but since it is a two-step process it causes performance issues.

Learningm commented 3 years ago

Hi, I met the same problem and cannot export SlowFast + R-CNN to ONNX format. Did somebody figure it out? I wonder whether it is OK to convert SlowFast and the R-CNN part separately to ONNX format; a rough sketch of what I mean is below. Thank you!
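Something along these lines is what I have in mind for the "separate" conversion: trace only the SlowFast backbone to ONNX and keep the detector / RoIAlign step outside the graph. This is only a sketch; the config path, clip shape and output names are placeholders, checkpoint loading is omitted, and whether the backbone traces cleanly depends on the mmaction2/PyTorch versions.

```python
# Rough sketch: export only the SlowFast backbone (no RoI head) to ONNX.
# 'custom_slowfast.py' and the clip shape are illustrative placeholders.
import torch
import mmcv
from mmaction.models import build_backbone

cfg = mmcv.Config.fromfile('custom_slowfast.py')
backbone = build_backbone(cfg.model.backbone).eval()   # e.g. ResNet3dSlowFast

dummy_clip = torch.randn(1, 3, 32, 256, 256)  # (N, C, T, H, W), example shape

torch.onnx.export(
    backbone, dummy_clip, 'slowfast_backbone.onnx',
    input_names=['clip'], output_names=['slow_feat', 'fast_feat'],
    opset_version=11)
```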

junaid340 commented 2 years ago

I'm getting an error that is different from this one; the error logs are posted in #1416. Can anyone help?

sakh251 commented 2 years ago

Have you found the solution? Is it possible to convert them separately? SlowFast is mentioned as a supported model at https://github.com/open-mmlab/mmaction2/blob/master/docs/tutorials/6_export_model.md. How about using YOLO together with SlowFast? YOLO models are already available in ONNX.

namKolorfuL commented 2 years ago

@sakh251 But before SlowFast outputs detection results, it has to pass through the roi_align and roi_pool layers to get detection results for multiple objects. And since these two layers aren't supported by mmaction for conversion, I think we have hit a dead end here.