pytorch / TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
https://pytorch.org/TensorRT
BSD 3-Clause "New" or "Revised" License

🐛 [Bug] Failed to compile with torch-trt when using torch.split() #952

Closed · Njuapp closed 2 years ago

Njuapp commented 2 years ago

Bug Description

When I use torch.split() in the model code, compilation fails when converting with Torch-TRT. Complete error message:

torch.Size([1000, 2048])
Successfully loaded torch model:
def forward(self,
    x: Tensor) -> Tensor:
  fc = self.fc
  basenet = self.basenet
  _0 = (fc).forward((basenet).forward(x, ), )
  return _0

TRTorch torch.float16 compile begin >>>
WARNING: [Torch-TensorRT] - Dilation not used in Max pooling converter
Traceback (most recent call last):
  File "t.py", line 114, in <module>
    model = torch_tensorrt.compile(model, **compile_settings)
  File "/opt/conda/lib/python3.7/site-packages/torch_tensorrt/_compile.py", line 115, in compile
    return torch_tensorrt.ts.compile(ts_mod, inputs=inputs, enabled_precisions=enabled_precisions, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch_tensorrt/ts/_compiler.py", line 116, in compile
    compiled_cpp_mod = _C.compile_graph(module._c, _parse_compile_spec(spec))
RuntimeError: [Error thrown at core/conversion/var/Var.cpp:132] Expected isITensor() to be true but got false
Requested ITensor from Var, however Var type is c10::IValue
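
A likely cause (my reading of the error message, not confirmed by the maintainers): torch.split returns a tuple of tensors, which shows up in the TorchScript graph as a Tensor[] list value, so the converter receives a c10::IValue (the list) where it expects a single ITensor. A minimal sketch that makes the pattern visible in isolation:

import torch

# torch.split appears in the traced graph as an op returning Tensor[]
# (a list IValue), not a single tensor.
def split_cat(x):
    parts = torch.split(x, 1024, dim=1)
    return torch.cat(parts, dim=1)

traced = torch.jit.trace(split_cat, torch.zeros(1, 2048))
print(traced.graph)  # look for an op whose output type is Tensor[]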

To Reproduce

import torch
import numpy as np
from torchvision import models

class SplitModel(torch.nn.Module):
    def __init__(self, feat_dim=2048, num_head=2):
        super(SplitModel, self).__init__()
        self.num_head = num_head
        self.head_dim = feat_dim // num_head

    def forward(self, input):
        assert len(input.shape) == 2
        split_list = torch.split(input, self.head_dim, dim=1)
        assert len(split_list) == self.num_head
        return torch.cat(split_list, dim=1)

class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        """ Base network """
        self.basenet = models.resnet50(pretrained=True)
        self.basenet.fc = torch.nn.Identity()
        self.fc = SplitModel()

    def forward(self, x, test=True):
        """ Base network """
        x = self.basenet(x)
        return self.fc(x)
model = Net()
model = model.eval()
model = model.cuda()

input = np.zeros((1, 3, 224, 224)).astype(np.float32)
input = torch.from_numpy(input).cuda()
output = model(input)

torch_script_module = torch.jit.trace(model, (input, ))
torch.jit.save(torch_script_module, "torch_script_module.ts")
import torch_tensorrt
torch_script_module = torch.jit.load("torch_script_module.ts", map_location='cuda').eval()

print('Successfully loaded torch model:')
trt_dtype = torch.float16
# trt_dtype = torch.float32

compile_settings = {
    "inputs": [torch_tensorrt.Input(
        shape=[1, 3, 224, 224],  # TODO: depends on the model size
        dtype=torch.float32,  # Datatype of input tensor. Allowed options torch.(float|half|int8|int32|bool)
    )],
    "require_full_compilation": False,
    "enabled_precisions": {torch.float32},  # Run with FP16
    "truncate_long_and_double": True,
}

print(torch_script_module.code)
print("TRTorch {self.trt_dtype} compile begin >>>")
trt_ts_module = torch_tensorrt.compile(torch_script_module, **compile_settings)

torch_script_output = torch_script_module(input)
trt_ts_output = trt_ts_module(input)
diff = abs(torch_script_output - trt_ts_output).mean()
print('Diff between torchscript output and trt_ts_module output is {:.6f}'.format(diff))
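
A possible workaround, sketched under the assumption that the converter handles plain tensor slicing (I have not verified this against this exact version): replace torch.split with explicit torch.narrow calls so each head is produced as a single tensor rather than as an element of a Tensor[] list.

import torch

class SliceModel(torch.nn.Module):
    # Hypothetical drop-in replacement for SplitModel that avoids torch.split.
    def __init__(self, feat_dim=2048, num_head=2):
        super(SliceModel, self).__init__()
        self.num_head = num_head
        self.head_dim = feat_dim // num_head

    def forward(self, input):
        # torch.narrow returns a view of a single tensor per head; tracing
        # unrolls the Python loop because num_head is a plain int.
        heads = [input.narrow(1, i * self.head_dim, self.head_dim)
                 for i in range(self.num_head)]
        return torch.cat(heads, dim=1)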

Expected behavior

Torch-TRT should compile the model successfully, and the numerical difference between the TRT-compiled model and the original model should be negligible.

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

edric1261234 commented 2 years ago

mark