🐛 [Bug] Cannot compile ViT to TensorRT

SohamTamba commented 1 year ago

Bug Description

I tried to compile HuggingFace's transformers.models.vit.modeling_vit.ViTForImageClassification following the official instructions, but compilation fails with a segmentation fault.

To Reproduce

import transformers
import torch
import torch_tensorrt

import faulthandler
faulthandler.enable()

def cli_main():
    model = transformers.ViTForImageClassification.from_pretrained(
        'google/vit-base-patch16-224-in21k', torchscript=True
    )
    model = model.eval().cuda()

    inputs_trt = [
        torch_tensorrt.Input(
            max_shape=[32, 3, 224, 224],
            opt_shape=[32, 3, 224, 224],
            min_shape=[1, 3, 224, 224],
            dtype=torch.float32,
        ),
    ]
    inputs_dummy = torch.rand((32, 3, 224, 224), dtype=torch.float32, device='cuda')

    enabled_precisions = {torch.float,}

    traced_model = torch.jit.trace(model, inputs_dummy)

    torch_tensorrt.compile(
        traced_model, inputs=inputs_trt, enabled_precisions=enabled_precisions
    )

if __name__ == '__main__':
    cli_main()

Error Logs

Fatal Python error: Segmentation fault

Current thread 0x00007f7ef3695700 (most recent call first):
  File "/root/anaconda3/envs/tensor_rt/lib/python3.9/site-packages/torch_tensorrt/ts/_compiler.py", line 139 in compile
  File "/root/anaconda3/envs/tensor_rt/lib/python3.9/site-packages/torch_tensorrt/_compile.py", line 133 in compile
  File "/root/soham/mint_matcher/tensor_rt/compile/mwe.py", line 34 in cli_main
  File "/root/soham/mint_matcher/tensor_rt/compile/mwe.py", line 39 in <module>
Segmentation fault (core dumped)

Expected behavior

The model compiles successfully, so that inference runs much faster.

Environment

Torch-TensorRT Version: 1.4.0
PyTorch Version: 2.0.1
CPU Architecture: x86_64
OS (e.g., Linux): Ubuntu 16.04.7 LTS
How you installed PyTorch: conda

Build command you used:

pip install nvidia-pyindex
pip install nvidia-tensorrt
pip install torch-tensorrt

Are you using local sources or building from archives: No
Python version: 3.9.11
CUDA version: 11.7
GPU models and configuration: V100
Any other relevant information:

Additional context

gs-olive commented 1 year ago

Hi @SohamTamba - thank you for the details. I am unable to reproduce this error on the latest main, but I do see an error displaying when compiling with dynamic shapes, which I've reported here: #2075. When using static shapes, inference is functional on main when I compile the model with the following arguments:

    inputs_trt = [
        torch_tensorrt.Input(
            shape=[32, 3, 224, 224],
            dtype=torch.float32,
        ),
    ]

...

    torch_tensorrt.compile(
        traced_model, inputs=inputs_trt, enabled_precisions=enabled_precisions, truncate_long_and_double=True,
    )

Additionally, another method to try is the new torch_tensorrt.dynamo.compile API, which can be invoked as follows:

    optimized_model = torch_tensorrt.dynamo.compile(
        model,
        inputs=inputs_trt,
        enabled_precisions=enabled_precisions,
        min_block_size=10,
    )

I will try to reproduce the issue under the specific build conditions you described as well, and I will follow up.

gs-olive commented 1 year ago

I tested the code with the versions you described (PyTorch 2.0.1, Torch-TRT 1.4.0), on Ubuntu 20.04 with the CUDA 11 builds of the linked PyTorch packages. I was unable to reproduce the error as described.

@SohamTamba - could you provide the debug logs from your run as well, to further diagnose the issue? These can be obtained by wrapping the compilation in the context:

with torch_tensorrt.logging.debug():
    ...
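
Because the crash is a segmentation fault, the Python process dies before it can clean up, so it can help to capture the debug output from outside that process. The following is a minimal sketch using only the standard library, not an official Torch-TensorRT workflow: the helper name `run_and_capture` and the log file name are hypothetical, and `mwe.py` is the script name taken from the traceback above. Redirecting at the process level should also catch messages written by native code, though that is an assumption about how the logs are emitted.

```python
import signal
import subprocess
import sys


def run_and_capture(cmd, log_path):
    """Run cmd in a child process, writing stdout and stderr to log_path.

    Returns the child's exit code. On POSIX, a negative value -N means the
    child was terminated by signal N.
    """
    with open(log_path, "w") as log_file:
        completed = subprocess.run(cmd, stdout=log_file, stderr=subprocess.STDOUT)
    return completed.returncode


def was_segfault(exit_code):
    """True if the exit code indicates termination by SIGSEGV (POSIX only)."""
    return exit_code == -signal.SIGSEGV


# Example usage (hypothetical file names):
#   code = run_and_capture([sys.executable, "mwe.py"], "trt_debug.txt")
#   print("segfault" if was_segfault(code) else f"exit code {code}")
```

The resulting text file can then be attached to the issue as-is.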

SohamTamba commented 1 year ago

Hi @gs-olive, here are the logs:

trt.txt

SohamTamba commented 1 year ago

The TensorRT fix you recommended still gave the same error.

torch_tensorrt.dynamo.compile worked: it compiled, but I have yet to check whether the output is correct.
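
One way to check the output is to run the original and the compiled model on the same input and compare the results within a tolerance. Below is a dependency-free sketch of such a check, assuming both outputs have already been flattened to lists of floats (for example with `tensor.flatten().tolist()`). The helper names and the `1e-3` tolerances are illustrative assumptions, not values recommended by Torch-TensorRT. The test has the same form as the one `torch.allclose` documents.

```python
import math


def max_abs_diff(reference, candidate):
    """Largest elementwise absolute difference between two equal-length sequences."""
    if len(reference) != len(candidate):
        raise ValueError("outputs have different numbers of elements")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def outputs_match(reference, candidate, atol=1e-3, rtol=1e-3):
    """True if every element satisfies |c - r| <= atol + rtol * |r|.

    Non-finite candidate values (NaN, inf) always count as a mismatch.
    """
    if len(reference) != len(candidate):
        return False
    return all(
        math.isfinite(c) and abs(c - r) <= atol + rtol * abs(r)
        for r, c in zip(reference, candidate)
    )
```

Reporting `max_abs_diff` alongside the pass/fail result makes it easier to judge whether a mismatch is a small numerical difference or a real error.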

gs-olive commented 1 year ago

Hi @SohamTamba - thank you for the logs, this is very helpful. I am working on reproducing more aspects of your environment exactly, including the CUDA, Python, and GPU versions. Could you try the run once more on your machine, with the following changes:

1. Add import faulthandler; faulthandler.enable() to the top of the file, to gather more details about the segfault
2. Add torch_tensorrt.dump_build_info() to the top of the file
3. Switch the context with torch_tensorrt.logging.debug(): to with torch_tensorrt.logging.graphs(): (the logs will be much longer, and will contain all graph data)
4. Use static-shape inputs, as so:

    inputs_trt = [
        torch_tensorrt.Input(
            shape=[32, 3, 224, 224],
            dtype=torch.float32,
        ),
    ]

...

    torch_tensorrt.compile(
        traced_model, inputs=inputs_trt, enabled_precisions=enabled_precisions, truncate_long_and_double=True,
    )
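
With graph-level logging enabled, the faulthandler traceback from step 1 can be hard to find, since by default it goes to stderr along with everything else. As a small optional variation (a sketch, not something required for the report), faulthandler can write to its own file instead. The file name here is hypothetical.

```python
import faulthandler

# Hypothetical file name for the crash traceback.
CRASH_LOG_PATH = "segfault_traceback.txt"

# The file object must stay open for as long as faulthandler is enabled,
# because the handler writes to its file descriptor when the signal arrives.
crash_log = open(CRASH_LOG_PATH, "w")
faulthandler.enable(file=crash_log, all_threads=True)

# ... run the compilation from the steps above here ...
```

If the process segfaults, the Python-level traceback for all threads ends up in that file, separate from the graph logs.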

gs-olive commented 1 year ago

To reproduce your environment, I used the following Dockerfile, which is a shortened version of the Torch-TensorRT Dockerfile:

Dockerfile

```dockerfile
# Base image starts with CUDA
ARG BASE_IMG=nvidia/cuda:11.7.1-devel-ubuntu18.04
FROM ${BASE_IMG} as base
ENV BASE_IMG=nvidia/cuda:11.7.1-devel-ubuntu18.04

ARG PYTHON_VERSION=3.9.11
ENV PYTHON_VERSION=${PYTHON_VERSION}

ENV DEBIAN_FRONTEND=noninteractive

# Install basic dependencies
RUN apt-get update
RUN apt install -y build-essential manpages-dev wget zlib1g software-properties-common git libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget ca-certificates curl llvm libncurses5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev mecab-ipadic-utf8

# Install PyEnv and desired Python version
ENV HOME="/root"
ENV PYENV_DIR="$HOME/.pyenv"
ENV PATH="$PYENV_DIR/shims:$PYENV_DIR/bin:$PATH"
RUN wget -L https://github.com/pyenv/pyenv-installer/raw/master/bin/pyenv-installer &&\
    chmod 755 pyenv-installer &&\
    bash pyenv-installer &&\
    eval "$(pyenv init -)"

RUN pyenv install -v ${PYTHON_VERSION}
RUN pyenv global ${PYTHON_VERSION}

CMD /bin/bash
```

This gives me an environment with the following versions:

Python: 3.9.11
GPU: V100
Ubuntu: 18.04
Torch: 2.0.1
Torch-TensorRT: 1.4.0
CUDA: 11.7.1

The compilation is functional for static shapes on this configuration, on my machine. I think the expanded debug logs will help to deduce the error in this case.

github-actions[bot] commented 12 months ago

This issue has not seen activity for 90 days. Remove the stale label or comment, or this will be closed in 10 days.
