pytorch / TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
https://pytorch.org/TensorRT
BSD 3-Clause "New" or "Revised" License

🐛 [Bug] Encountered bug when using Torch-TensorRT with a pytorch segmentation model #3254

Open deo-abhijit opened 1 month ago

deo-abhijit commented 1 month ago

Hi! I have been using torch_tensorrt to speed up PyTorch models and have been loving it, but sometimes I run into problems during conversion.

In this case, I was using the segmentation-models-pytorch (smp) library:

```python
import segmentation_models_pytorch as smp
import torch_tensorrt as trt
import torch

model = smp.create_model(
    arch="fpn",                  # architecture name, e.g. 'Unet' / 'FPN' / etc. Case-insensitive!
    encoder_name="mit_b0",
    encoder_weights="imagenet",
    in_channels=3,
    classes=3,
).eval().to("cuda")

input_data = torch.randn(1, 3, 224, 224, requires_grad=False).to("cuda")
scripted_model = torch.jit.trace(model, input_data)
trt_model = trt.compile(
    scripted_model,
    inputs=[trt.Input((1, 3, 736, 1280), precision=torch.float32)],
    enabled_precisions={torch.float32},
    truncate_long_and_double=True,
)
```

I got the following error.

```text
WARNING:root:Given dtype that does not have direct mapping to torch (dtype.unknown), defaulting to torch.float
WARNING:torch_tensorrt._compile:Input is a torchscript module but the ir was not specified (default=dynamo), please set ir=torchscript to suppress the warning.
WARNING:root:Given dtype that does not have direct mapping to torch (dtype.unknown), defaulting to torch.float
ERROR: [Torch-TensorRT TorchScript Conversion Context] - [graphShapeAnalyzer.cpp::checkCalculationStatusSanity::1660] Error Code 2: Internal Error (Assertion !isPartialWork(p.second.symbolicRep) failed. )
ERROR: [Torch-TensorRT TorchScript Conversion Context] - [graphShapeAnalyzer.cpp::checkCalculationStatusSanity::1660] Error Code 2: Internal Error (Assertion !isPartialWork(p.second.symbolicRep) failed. )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mzcar/miniconda3/envs/tensorrt/lib/python3.10/site-packages/torch_tensorrt/_compile.py", line 208, in compile
    compiled_ts_module: torch.jit.ScriptModule = torchscript_compile(
  File "/home/mzcar/miniconda3/envs/tensorrt/lib/python3.10/site-packages/torch_tensorrt/ts/_compiler.py", line 156, in compile
    compiled_cpp_mod = _C.compile_graph(module._c, _parse_compile_spec(spec))
RuntimeError: [Error thrown at core/conversion/converters/converter_util.cpp:270] Expected const_layer to be true but got false
```
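As an aside, the ir warning above can be silenced by naming the frontend explicitly, as the message suggests. This is the same call as before with only ir added (the traceback shows the TorchScript frontend was used either way), so I wouldn't expect it to change the error itself:

```python
# Same compile call as above, with the frontend made explicit
# as the warning suggests; continues from the snippet above.
trt_model = trt.compile(
    scripted_model,
    ir="torchscript",
    inputs=[trt.Input((1, 3, 736, 1280), precision=torch.float32)],
    enabled_precisions={torch.float32},
    truncate_long_and_double=True,
)
```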

This happens specifically when I use mit_b0 as the backbone; with ResNet-based backbones I get good speedups.

I am not heavily dependent on this backbone, but I would love to know why the conversion is failing. Any help would be appreciated.

For anyone interested, the output of pip freeze is:

```text
autocommand==2.2.2
backports.tarfile==1.2.0
certifi==2024.8.30
charset-normalizer==3.4.0
coloredlogs==15.0.1
contourpy==1.3.0
cycler==0.12.1
efficientnet_pytorch==0.7.1
filelock==3.13.1
flatbuffers==24.3.25
fonttools==4.54.1
fsspec==2024.2.0
huggingface-hub==0.25.2
humanfriendly==10.0
idna==3.10
importlib_metadata==8.0.0
importlib_resources==6.4.0
inflect==7.3.1
jaraco.collections==5.1.0
jaraco.context==5.3.0
jaraco.functools==4.0.1
jaraco.text==3.12.1
Jinja2==3.1.3
kiwisolver==1.4.7
Mako==1.3.5
MarkupSafe==2.1.5
matplotlib==3.9.2
more-itertools==10.3.0
mpmath==1.3.0
munch==4.0.0
networkx==3.2.1
numpy==1.26.3
nvidia-cublas-cu11==11.11.3.6
nvidia-cuda-cupti-cu11==11.8.87
nvidia-cuda-nvrtc-cu11==11.8.89
nvidia-cuda-runtime-cu11==11.8.89
nvidia-cuda-runtime-cu12==12.6.77
nvidia-cudnn-cu11==9.1.0.70
nvidia-cufft-cu11==10.9.0.58
nvidia-curand-cu11==10.3.0.86
nvidia-cusolver-cu11==11.4.1.48
nvidia-cusparse-cu11==11.7.5.86
nvidia-nccl-cu11==2.20.5
nvidia-nvtx-cu11==11.8.86
onnx==1.17.0
onnx_tensorrt==10.5.0
onnxruntime-gpu==1.19.2
openvino==2023.1.0.dev20230811
openvino-telemetry==2024.1.0
packaging==24.1
pandas==2.2.3
pillow==10.2.0
platformdirs==4.3.6
pretrainedmodels==0.7.4
protobuf==5.28.2
pycuda==2024.1.2
pyparsing==3.2.0
python-dateutil==2.9.0.post0
pytools==2024.1.14
pytz==2024.2
PyYAML==6.0.2
regex==2024.9.11
requests==2.32.3
safetensors==0.4.5
scipy==1.14.1
seaborn==0.13.2
segmentation-models-pytorch==0.3.4
six==1.16.0
sympy==1.13.1
tensorrt==10.1.0
tensorrt-cu12==10.5.0
tensorrt-cu12-bindings==10.1.0
tensorrt-cu12-libs==10.1.0
timm==0.9.7
tokenizers==0.20.1
tomli==2.0.1
torch==2.4.1+cu118
torch_tensorrt==2.4.0+cu118
torchvision==0.19.1+cu118
tqdm==4.66.5
transformers==4.45.2
triton==3.0.0
typeguard==4.3.0
typing_extensions==4.9.0
tzdata==2024.2
urllib3==2.2.3
zipp==3.19.2
```
narendasan commented 1 month ago

@deo-abhijit have you tried using the dynamo frontend instead of torchscript? It might resolve this issue. You can still use TorchScript for deployment afterwards by tracing the compiled program with torch.jit.trace.
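A minimal sketch of that workflow, untested against this model (it assumes the smp model from the snippet above and that the installed torch_tensorrt 2.4 accepts these arguments):

```python
import torch
import torch_tensorrt as trt

# Compile via the dynamo frontend: pass the eager nn.Module directly,
# with no torch.jit.trace step beforehand.
trt_model = trt.compile(
    model,                      # the eager smp model, not scripted_model
    ir="dynamo",
    inputs=[trt.Input((1, 3, 736, 1280), dtype=torch.float32)],
    enabled_precisions={torch.float32},
)

# Optionally trace the compiled program back to TorchScript for deployment.
example = torch.randn(1, 3, 736, 1280).to("cuda")
deployable = torch.jit.trace(trt_model, example)
torch.jit.save(deployable, "trt_model.ts")  # output filename is just an example
```

One reason the dynamo frontend may succeed where the TorchScript converter did not is that it can partition the graph and fall back to eager PyTorch for unsupported subgraphs (e.g. in the mit_b0 encoder) rather than failing outright.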