NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

How to load and enqueue multiple engines? #1230

Closed sinuku closed 3 years ago

sinuku commented 3 years ago

Description

Hi! I tried converting an image segmentation model (Yolact) to TensorRT. I tried two methods, 1) pth -> ONNX -> TensorRT and 2) torch2trt, but with neither could I convert the whole model into a single TensorRT engine.

So I split it into backbone, FPN, ProtoNet, PredHead, etc. (like YolactEdge).

In this case, how do I load these parts and enqueue them from C++?

Is there any example or reference?

Yolact(
  (backbone): TRTModule()
  (proto_net): TRTModule()
  (fpn_phase_1): TRTModule()
  (fpn_phase_2): TRTModule()
  (prediction_layers): ModuleList(
    (0): PredictionModuleTRTWrapper(
      (pred_layer): TRTModule()
      (pred_layer_torch): PredictionModuleTRT(
        (upfeature): Sequential(
          (0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): ReLU(inplace=True)
        )
        (bbox_layer): Conv2d(256, 12, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conf_layer): Conv2d(256, 21, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (mask_layer): Conv2d(256, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (activation_func): Tanh()
      )
    )
    (1): PredictionModuleTRTWrapper(
      (pred_layer): TRTModule()
      (pred_layer_torch): PredictionModuleTRT(
        (upfeature): Sequential(
          (0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): ReLU(inplace=True)
        )
        (bbox_layer): Conv2d(256, 12, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conf_layer): Conv2d(256, 21, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (mask_layer): Conv2d(256, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (activation_func): Tanh()
      )
    )
    (2): PredictionModuleTRTWrapper(
      (pred_layer): TRTModule()
      (pred_layer_torch): PredictionModuleTRT(
        (upfeature): Sequential(
          (0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): ReLU(inplace=True)
        )
        (bbox_layer): Conv2d(256, 12, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conf_layer): Conv2d(256, 21, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (mask_layer): Conv2d(256, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (activation_func): Tanh()
      )
    )
    (3): PredictionModuleTRTWrapper(
      (pred_layer): TRTModule()
      (pred_layer_torch): PredictionModuleTRT(
        (upfeature): Sequential(
          (0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): ReLU(inplace=True)
        )
        (bbox_layer): Conv2d(256, 12, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conf_layer): Conv2d(256, 21, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (mask_layer): Conv2d(256, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (activation_func): Tanh()
      )
    )
    (4): PredictionModuleTRTWrapper(
      (pred_layer): TRTModule()
      (pred_layer_torch): PredictionModuleTRT(
        (upfeature): Sequential(
          (0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): ReLU(inplace=True)
        )
        (bbox_layer): Conv2d(256, 12, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conf_layer): Conv2d(256, 21, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (mask_layer): Conv2d(256, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (activation_func): Tanh()
      )
    )
  )
  (semantic_seg_conv): Conv2d(256, 6, kernel_size=(1, 1), stride=(1, 1))
  (lat_layer): TRTModule()
)

Environment

TensorRT Version: 7.2.3.4
NVIDIA GPU: gtx1060
NVIDIA Driver Version: 440.33.01
CUDA Version: 10.2
CUDNN Version: 8.1.1
Operating System: Ubuntu 18.04
Python Version (if applicable): 3.6.9
Tensorflow Version (if applicable):
PyTorch Version (if applicable): 1.7.0
Baremetal or Container (if so, version):

Relevant Files

Steps To Reproduce

ttyio commented 3 years ago

Hello @sinuku, here is an example in which we load multiple parts of the same network and enqueue them in sequential order:

https://github.com/NVIDIA/TensorRT/blob/release/7.1/demo/Tacotron2/trt/inference_trt.py
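
For C++, the same sequential pattern could be sketched roughly as below. This is only an illustration of the chaining logic, not code from the linked demo: `Stage`, `TensorMap`, and `runPipeline` are hypothetical names, and the `run` callback is a host-side stand-in so the sketch compiles without a GPU. In a real application each stage would presumably hold an engine loaded with `IRuntime::deserializeCudaEngine` and a context from `ICudaEngine::createExecutionContext`, and `run` would call `IExecutionContext::enqueueV2` with device buffers, with every stage enqueued on the same CUDA stream so that work executes in submission order.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Named tensors shared between stages. With TensorRT these would be device
// pointers allocated once up front; std::vector<float> is an assumption made
// here so the sketch runs on the host.
using TensorMap = std::map<std::string, std::vector<float>>;

// One sub-network (backbone, FPN, ...). Hypothetical type, not a TensorRT class.
struct Stage {
    std::string name;
    std::vector<std::string> inputs;   // tensor names this stage reads
    std::vector<std::string> outputs;  // tensor names this stage writes
    // Stand-in for enqueueing this stage's execution context.
    std::function<void(const TensorMap&, TensorMap&)> run;
};

// Runs the stages in order. Each stage reads tensors produced by earlier
// stages and publishes its declared outputs under their names, so the output
// of one stage becomes the input of the next.
inline void runPipeline(const std::vector<Stage>& stages, TensorMap& tensors) {
    for (const Stage& s : stages) {
        assert(s.run && "stage has no run callback");
        for (const std::string& in : s.inputs) {
            if (tensors.find(in) == tensors.end())
                throw std::runtime_error(s.name + ": missing input " + in);
        }
        TensorMap produced;
        s.run(tensors, produced);
        for (const std::string& out : s.outputs) {
            auto it = produced.find(out);
            if (it == produced.end())
                throw std::runtime_error(s.name + ": did not produce " + out);
            tensors[out] = std::move(it->second);
        }
    }
}
```

With this layout, the stage list would follow the data flow of the split model (for example backbone first, then the FPN parts, then ProtoNet and the prediction layers), and any layers left in PyTorch would need their own equivalent step between the engine calls.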

ttyio commented 3 years ago

Closing since there has been no activity for more than 3 weeks. Please reopen if you still have questions. Thanks!