Closed aixiaodewugege closed 1 year ago
A full deploy config with dynamic shapes should look like this:
onnx_config = dict(
    type='onnx',
    export_params=True,
    keep_initializers_as_inputs=False,
    opset_version=11,
    save_file='end2end.onnx',
    input_names=['input'],
    output_names=['output'],
    input_shape=[256, 256],
    optimize=True,
    dynamic_axes=dict(
        input=dict({
            0: 'batch',
            1: 'num_crops * num_segs',
            3: 'time',
            4: 'height',
            5: 'width'
        }),
        output=dict({0: 'batch'})))
codebase_config = dict(type='mmaction', task='VideoRecognition')
backend_config = dict(
    type='tensorrt',
    common_config=dict(fp16_mode=False, max_workspace_size=1073741824),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 1, 3, 32, 256, 256],
                    opt_shape=[1, 64, 3, 32, 256, 256],
                    max_shape=[1, 128, 3, 32, 256, 256])))
    ])
There are two places you should focus on:
- dynamic_axes
- input_shapes
Dynamic shapes in mmaction2 are usually caused by SampleFrames and xxxCrop (like ThreeCrop, CenterCrop). These two transforms affect dim1 and dim3, counting from 0 as in the dynamic_axes keys (dim1 = num_clips * number of crops, dim3 = clip_len). Therefore, if you want to extract a different number of frames from each video, you have to mark these two dims as dynamic in onnx_config and set appropriate min_shape/opt_shape/max_shape in the TensorRT backend_config (in this example, only dim1 is dynamic, ranging from 1 to 128).
If all your inputs have the same width and height, there is no need to make dim4 and dim5 dynamic, so you can remove 4: 'height' and 5: 'width'.
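To make the relationship between the pipeline settings and the TensorRT shape range concrete, here is a minimal sketch. The helper names `expected_input_shape` and `fits_profile` are hypothetical and not part of mmdeploy or mmaction2; the layout `[batch, num_clips * num_crops, channels, clip_len, height, width]` is read off the min_shape/opt_shape/max_shape values in the config above, with 0-based indices as in the dynamic_axes keys.

```python
from typing import List, Sequence, Tuple


def expected_input_shape(num_clips: int, num_crops: int, clip_len: int,
                         crop_size: Tuple[int, int], batch: int = 1,
                         channels: int = 3) -> List[int]:
    """Shape the exported model would receive (hypothetical helper).

    Index 1 is num_clips * num_crops and index 3 is clip_len.
    """
    height, width = crop_size
    return [batch, num_clips * num_crops, channels, clip_len, height, width]


def fits_profile(shape: Sequence[int], min_shape: Sequence[int],
                 max_shape: Sequence[int]) -> bool:
    """True if every dimension lies within [min_shape, max_shape]."""
    if not (len(shape) == len(min_shape) == len(max_shape)):
        return False
    return all(lo <= dim <= hi
               for lo, dim, hi in zip(min_shape, shape, max_shape))


# Shape range copied from the backend_config in this thread.
min_shape = [1, 1, 3, 32, 256, 256]
max_shape = [1, 128, 3, 32, 256, 256]

# Example: 10 clips with ThreeCrop (3 crops per clip), 32 frames per clip.
shape = expected_input_shape(num_clips=10, num_crops=3, clip_len=32,
                             crop_size=(256, 256))
print(shape, fits_profile(shape, min_shape, max_shape))
# [1, 30, 3, 32, 256, 256] True

# A 16-frame clip falls outside this range, because index 3 is fixed at 32.
short = expected_input_shape(num_clips=10, num_crops=3, clip_len=16,
                             crop_size=(256, 256))
print(fits_profile(short, min_shape, max_shape))
# False
```

A check like this before conversion may help spot cases where the test pipeline produces a shape outside the range the engine was built for.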
Thanks for your reply!
To clarify: ThreeCrop is already part of your SDK, so the model input shape is defined by the ThreeCrop output size. So if I want height and width to be dynamic, I should change the ThreeCrop size in the model config file before converting. Is that right?
You may be missing the point.
First, you should ask yourself what you want to be dynamic. Is it the width and height of a video frame? Or is it the number of frames you want to extract from each video?
For width and height: if your transform pipeline has a crop (whether CenterCrop or ThreeCrop), the image size after this transform is fixed (it equals crop_size). Therefore, there is no need to set a dynamic height or width.
For a dynamic number of frames extracted from each video, dim1 = num_clips * number of crops and dim3 = clip_len (counting from 0, as in the dynamic_axes keys).
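For the fixed crop size case, the dynamic parts of the config might be trimmed as sketched below. This is an untested illustration derived from the full config earlier in the thread: the height and width keys are dropped, and the clip length range of 16 to 64 is an assumed value chosen only to show a dynamic index 3, not a recommendation.

```python
# Only index 1 (num_clips * number of crops) and index 3 (clip_len) stay
# dynamic; height and width are fixed by the crop transform.
dynamic_axes = dict(
    input={
        0: 'batch',
        1: 'num_crops * num_segs',
        3: 'time',
    },
    output={0: 'batch'},
)

# Illustrative shape range (assumed values): index 1 spans 1..128 as in the
# original example, index 3 spans 16..64, spatial size stays at 256 x 256.
input_shapes = dict(
    input=dict(
        min_shape=[1, 1, 3, 16, 256, 256],
        opt_shape=[1, 64, 3, 32, 256, 256],
        max_shape=[1, 128, 3, 64, 256, 256],
    ))
```

These two dicts would take the place of the `dynamic_axes` entry in onnx_config and the `input_shapes` entry in backend_config, respectively.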
Got it. Thanks!
I closed the issue. If you have any other questions, feel free to reopen it.
Checklist
Describe the bug
I tried to convert the CSN model to TensorRT format, but it failed when I used the dynamic input shape option.
Reproduction
python tools/deploy.py configs/mmaction/video-recognition/video-recognition_3d_dynamic.py ../mmaction2/configs/recognition/csn/ircsn_ig65m-pretrained-r152-bnfrozen_8xb12-32x2x1-58e_kinetics400-rgb.py ~/.cache/torch/hub/checkpoints/ircsn_ig65m_pretrained_bnfrozen_r152_32x2x1_58e_kinetics400_rgb_20200812-9037a758.pth ../mmaction2/demo/demo.mp4 --work-dir ../csn --dump-info --device cuda:0
I also tried changing opt_shape in the static config, but that also failed.
Environment
Error traceback