NVIDIA-AI-IOT / torch2trt

An easy to use PyTorch to TensorRT converter
MIT License

Update `torch2trt` to use dynamic batch sizes up to the size given during conversion #752

Open chaoz-dev opened 2 years ago

chaoz-dev commented 2 years ago

Creating an issue to track the TODOs for this task:

We need to update torch2trt to support dynamic batch sizes, up to the size given during conversion. Currently, when a model is compiled with explicit static (batch size) tensor shapes, the resulting TRT model accepts only that exact input shape during inference, so tensors with smaller batch sizes are rejected. This PR removes the restriction by using optimization profiles to support dynamic tensor shapes.

Note that this functionality was previously supported inherently due to the use of implicit batch sizes; with the move to explicit batch sizes (which is necessary for forward compatibility, as implicit batch sizes are now considered deprecated in TRT), we now need to explicitly support this functionality.
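
For context, here is a minimal sketch of the mechanism described above, using the plain TensorRT Python API rather than torch2trt internals (the input name, shapes, and identity layer are illustrative assumptions):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)

# Explicit-batch network: the batch dimension is part of the tensor shape,
# and -1 marks it as dynamic.
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
inp = network.add_input("input", trt.float32, (-1, 3, 128, 128))
identity = network.add_identity(inp)
network.mark_output(identity.get_output(0))

# An optimization profile bounds the dynamic dimension: any batch size from
# 1 up to the maximum given at conversion time (16 here) is accepted.
config = builder.create_builder_config()
profile = builder.create_optimization_profile()
profile.set_shape("input", (1, 3, 128, 128), (16, 3, 128, 128), (16, 3, 128, 128))
config.add_optimization_profile(profile)

engine_bytes = builder.build_serialized_network(network, config)
```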

Also note that for this functionality to work, the batch size provided during inference must be less than or equal to the batch size provided during compilation (matching the previous behavior).
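
At inference time the runtime shape is then set per execution and must fall within the profile's range; a sketch continuing from the snippet above (so `logger`, `engine_bytes`, and binding 0 as the input are assumed):

```python
runtime = trt.Runtime(logger)
engine = runtime.deserialize_cuda_engine(engine_bytes)
context = engine.create_execution_context()

# Batch 8 is accepted because 8 <= the profile maximum of 16; anything
# larger than 16 would be rejected.
context.set_binding_shape(0, (8, 3, 128, 128))
assert context.all_binding_shapes_specified
```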

This requires the following changes:

1. ~Use dynamic shapes when creating the model~ [Completed]
2. Update plugins to use said dynamic shapes
chaoz-dev commented 2 years ago

#743 is a WIP PR addressing this issue.

sleepLion99 commented 2 years ago

I used this patch, but it doesn't work when converting EfficientNet:

```python
import os

DEVICE = "1"
os.environ["CUDA_VISIBLE_DEVICES"] = DEVICE

import torch
import tensorrt
from torch2trt import torch2trt
from efficientnet_pytorch import EfficientNet

model = EfficientNet.from_pretrained('efficientnet-b4').eval().cuda()
x = torch.ones((16, 3, 128, 128)).cuda()
x2 = torch.ones((8, 3, 128, 128)).cuda()
model_trt = torch2trt(model, [x], max_batch_size=16, log_level=tensorrt.Logger.INFO)

y = model_trt(x)
print(y.shape)
y2 = model_trt(x2)
print(y2.shape)
```

The error is:

```
Traceback (most recent call last):
  File "/home/lichaowei/deeplearning/classification/test.py", line 23, in <module>
    y2 = model_trt(x2)
  File "/home/lichaowei/anaconda3/envs/d2l/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/lichaowei/anaconda3/envs/d2l/lib/python3.8/site-packages/torch2trt-0.3.0-py3.8.egg/torch2trt/torch2trt.py", line 558, in forward
    shape = tuple(self.context.get_binding_shape(idx))
ValueError: __len__() should return >= 0
```
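
This `ValueError` is consistent with querying a binding shape that is still undefined: with dynamic shapes, `get_binding_shape` returns a `Dims` with negative length until the input shape has been set for that execution, and `tuple()` on it fails exactly like this. A hypothetical illustration, continuing the earlier sketch (binding 0 as input and binding 1 as output are assumptions, and this is not a confirmed diagnosis of the patch):

```python
# Set the input binding shape before querying any output shapes;
# otherwise get_binding_shape returns a Dims with negative length.
context.set_binding_shape(0, (8, 3, 128, 128))
shape = tuple(context.get_binding_shape(1))  # well-defined afterwards
```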


chaoz-dev commented 2 years ago

@sleepLion99 I tried running your above script but encountered the following error (on both master and on my PR):

```
Loaded pretrained weights for efficientnet-b4
[06/28/2022-00:25:52] [TRT] [I] [MemUsageChange] Init CUDA: CPU +189, GPU +0, now: CPU 999, GPU 1527 (MiB)
[06/28/2022-00:25:53] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +6, GPU +0, now: CPU 1024, GPU 1527 (MiB)
Traceback (most recent call last):
  File "/home/chaoz/workspace/ml/scratch/torch2trt-i752.py", line 14, in <module>
    model_trt = torch2trt(model, [x], max_batch_size=16, log_level=tensorrt.Logger.INFO)
  File "/home/chaoz/.anaconda3/envs/pytorch-1.11/lib/python3.10/site-packages/torch2trt-0.3.0-py3.10.egg/torch2trt/torch2trt.py", line 644, in torch2trt
    outputs = module(*inputs)
  File "/home/chaoz/.anaconda3/envs/pytorch-1.11/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1148, in _call_impl
    result = forward_call(*input, **kwargs)
  File "/home/chaoz/.anaconda3/envs/pytorch-1.11/lib/python3.10/site-packages/efficientnet_pytorch/model.py", line 314, in forward
    x = self.extract_features(inputs)
  File "/home/chaoz/.anaconda3/envs/pytorch-1.11/lib/python3.10/site-packages/efficientnet_pytorch/model.py", line 296, in extract_features
    x = block(x, drop_connect_rate=drop_connect_rate)
  File "/home/chaoz/.anaconda3/envs/pytorch-1.11/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1148, in _call_impl
    result = forward_call(*input, **kwargs)
  File "/home/chaoz/.anaconda3/envs/pytorch-1.11/lib/python3.10/site-packages/efficientnet_pytorch/model.py", line 109, in forward
    x = self._depthwise_conv(x)
  File "/home/chaoz/.anaconda3/envs/pytorch-1.11/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1148, in _call_impl
    result = forward_call(*input, **kwargs)
  File "/home/chaoz/.anaconda3/envs/pytorch-1.11/lib/python3.10/site-packages/efficientnet_pytorch/utils.py", line 275, in forward
    x = F.conv2d(x, self.weight, self.bias, self.stride, self.padding, self.dilation, self.groups)
  File "/home/chaoz/.anaconda3/envs/pytorch-1.11/lib/python3.10/site-packages/torch2trt-0.3.0-py3.10.egg/torch2trt/torch2trt.py", line 300, in wrapper
    converter["converter"](ctx)
  File "/home/chaoz/.anaconda3/envs/pytorch-1.11/lib/python3.10/site-packages/torch2trt-0.3.0-py3.10.egg/torch2trt/converters/conv_functional.py", line 46, in convert_Conv_trt7_functional
    layer.stride_nd = stride
TypeError: (): incompatible function arguments. The following argument types are supported:
    1. (arg0: tensorrt.tensorrt.IConvolutionLayer, arg1: tensorrt.tensorrt.Dims) -> None

Invoked with: <tensorrt.tensorrt.IConvolutionLayer object at 0x7f45b771a570>, ([1, 1], [1, 1])
```

This is running with the following configurations:

- CUDA: 11.6
- TRT: 8.4.1.5
- PyTorch: 1.11
- Python: 3.10.4

What are you running with? This looks like it's starting to convert, albeit with some issues for the convolution layer.
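
For reference, the `TypeError` looks like a nested sequence being assigned where `IConvolutionLayer.stride_nd` expects a flat `Dims`. A minimal repro sketch of that assumption (the layer setup is illustrative, not the torch2trt converter code):

```python
import numpy as np
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
inp = network.add_input("input", trt.float32, (1, 3, 128, 128))
kernel = np.ones((8, 3, 3, 3), dtype=np.float32)
conv = network.add_convolution_nd(inp, 8, (3, 3), kernel)

conv.stride_nd = (1, 1)            # OK: a flat tuple coerces to trt.Dims
conv.stride_nd = ((1, 1), (1, 1))  # raises the TypeError from the log above
```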