triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

Loading TorchScript model fails for Triton in DeepStream #2317

Closed: rbrigden closed this issue 3 years ago

rbrigden commented 3 years ago

Description

I am trying to load a successfully exported TorchScript model in the Triton Inference Server that is packaged with DeepStream 5.0. Unfortunately, I receive this error:

Internal: load failed for libtorch model -> 'mymodel': version_ <= kMaxSupportedFileFormatVersion INTERNAL ASSERT FAILED at ../caffe2/serialize/inline_container.cc:132, please report a bug to PyTorch. Attempted to read a PyTorch file with version 4, but the maximum supported version for reading is 2. Your PyTorch installation may be too old.

Issues filed against PyTorch with this error seem to stem from mismatched PyTorch versions between the export and runtime environments: example
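
As a side note, the format version the loader is rejecting can be read straight out of the file: a .pt archive written by torch.jit.save is a zip containing a 'version' entry. A minimal sketch to inspect it (the file name model.pt is a placeholder):

    import zipfile

    # A TorchScript .pt file is a zip archive; its 'version' entry records
    # the serialization format version that libtorch must support to load it.
    with zipfile.ZipFile("model.pt") as zf:
        for name in zf.namelist():
            if name.split("/")[-1] == "version":
                print(name, zf.read(name).decode().strip())

A file reporting version 4 here will be refused by a libtorch that only reads up to version 2, which matches the assert above.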

The PyTorch/Torchvision versions in my training/export environment are:

torch==1.7.0
torchvision==0.8.1

I am using the NGC container:

nvcr.io/nvidia/deepstream:5.0.1-20.09-triton

Based on the framework support matrix, it appears that 20.09 supports PyTorch 1.7.0.

Triton Information

What version of Triton are you using?

nvcr.io/nvidia/deepstream:5.0.1-20.09-triton

Are you using the Triton container or did you build it yourself?

I am using the NGC container

To Reproduce

Steps to reproduce the behavior:

Export a TorchScript model using PyTorch 1.7.0 and adapt the Triton sample in the container nvcr.io/nvidia/deepstream:5.0.1-20.09-triton at /opt/nvidia/deepstream/deepstream-5.0/sources/python/apps/deepstream-ssd-parser.
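
For reference, the export itself follows the standard torch.jit flow; a minimal sketch with a stand-in model (the real export uses an EfficientDet network, not shown here):

    import torch

    # Stand-in module; the actual model in this issue is an EfficientDet.
    model = torch.nn.Conv2d(3, 8, kernel_size=3).eval()

    # Trace with a dummy input matching the config's [-1, 3, 512, 512] input.
    example = torch.randn(1, 3, 512, 512)
    scripted = torch.jit.trace(model, example)

    # The file format version is fixed by the torch version doing the saving,
    # so a newer exporter can emit a file an older libtorch cannot read.
    torch.jit.save(scripted, "model.pt")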

Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well).


  platform: "pytorch_libtorch"
  name: "efficientdet"
  max_batch_size: 4
  input [
    {
      name: "INPUT_0"
      data_type: TYPE_FP16
      dims: [ -1, 3, 512, 512 ]
    }
  ]
  output [
    {
      name: "OUTPUT_0"
      data_type: TYPE_FP16
      dims: [ 4, 49104, 4 ]
    },
    {
      name: "OUTPUT_1"
      data_type: TYPE_FP16
      dims: [ 4, 49104, 1 ]
    },
    {
      name: "OUTPUT_2"
      data_type: TYPE_INT64
      dims: [ 4, 49104, 1 ]
    }
  ]
CoderHam commented 3 years ago

Please share your model repository structure. It looks like, instead of using a numeric version directory for the model, you placed 'mymodel' directly in the model directory. Please refer to the instructions here and reopen if needed.

rbrigden commented 3 years ago

@CoderHam I mount my model repo to /models in the container

My model repo looks like this:

+-- efficientdet
|   +-- 1
|   |   +-- model.pt
|   +-- config.pbtxt

If it helps, my DeepStream config is

infer_config {
  unique_id: 5
  gpu_ids: [0]
  max_batch_size: 4
  backend {
    trt_is {
      model_name: "efficientdet"
      version: 1
      model_repo {
        root: "/models"
        log_level: 2
        strict_model_config: true
      }
    }
  }

  preprocess {
    network_format: IMAGE_FORMAT_RGB
    tensor_order: TENSOR_ORDER_NONE
    maintain_aspect_ratio: 0
    normalize {
      scale_factor: 1.0
      channel_offsets: [0, 0, 0]
    }
  }

  postprocess {
    labelfile_path: "labels.txt"
    other {}
  }

  extra {
    copy_input_to_host_buffers: false
  }

  custom_lib {
    path: "/opt/nvidia/deepstream/deepstream-5.0/lib/libnvds_infercustomparser.so"
  }
}
input_control {
  process_mode: PROCESS_MODE_FULL_FRAME
  interval: 0
}
output_control {
  output_tensor_meta: true
}

(Also, I am unable to re-open this issue as I am not a collaborator on this repo, so I do hope you see this)

CoderHam commented 3 years ago

Could you try running the model directly inside nvcr.io/nvidia/tritonserver:20.09-py3? Triton does not manage the DeepStream container, and this would help narrow down the issue. It looks to me like the model fails to load because of an older PyTorch version in the container.
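
As a quick sanity check before standing up the full server, deserialization can be tested directly with the Python API in a container whose PyTorch matches the server's libtorch (for example the 20.09 PyTorch NGC image); the model path below assumes the repository is mounted at /models:

    import torch

    # If the container's libtorch is too old for the file format, this load
    # fails with the same kMaxSupportedFileFormatVersion assert seen in Triton.
    print("torch", torch.__version__)
    model = torch.jit.load("/models/efficientdet/1/model.pt")
    print("loaded:", type(model).__name__)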

msalehiNV commented 3 years ago

@rbrigden Since this appears to be a DeepStream-related issue, I've spoken to their team and they recommended that you post your issue on the DeepStream SDK Board: https://forums.developer.nvidia.com/c/accelerated-computing/intelligent-video-analytics/deepstream-sdk/15

rbrigden commented 3 years ago

Thank you @CoderHam and @msalehiNV. I haven't yet had a chance to test on the standalone Triton server, but will do that soon. I'll post an update here as well as make a post on the DeepStream SDK Board.