triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html

fasterrcnn_resnet50_fpn TorchScript model cannot be loaded #1313

Closed: adamm123 closed this issue 4 years ago

adamm123 commented 4 years ago

**Description**

Cannot load FasterRCNN model exported to TorchScript using Triton r20.03.

**Triton Information**
What version of Triton are you using?

nvcr.io/nvidia/tritonserver:20.03-py3

Are you using the Triton container or did you build it yourself?

container

**To Reproduce**

ENVIRONMENT:

- torch==1.4.0
- torchvision==0.5.0
- driver version: 440.64
- CUDA version: 10.2   
- os: Ubuntu 18.04.4 LTS
1. Export the FasterRCNN model:

```python
import torch
import torchvision

# Build the model, put it in eval mode, script it, and save the TorchScript file.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn()
model = model.eval()
script = torch.jit.script(model)
script.save('model.pt')
```

2. Prepare the model repository and copy the model into it:

```sh
mkdir -p models/object_detection/1
cp model.pt models/object_detection/1
```


3. Run Triton (replace `<path_to_model_repository>` with the correct path):

```sh
nvidia-docker run --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
  -p8000:8000 -p8001:8001 -p8002:8002 \
  -v/<path_to_model_repository>/models:/models \
  nvcr.io/nvidia/tritonserver:20.03-py3 \
  trtserver --model-repository=/models --log-verbose=1 --strict-model-config=false
```

Running the server results in the following error:

```
model_repository_manager.cc:840] failed to load 'object_detection' version 1: Internal: load failed for libtorch model -> 'object_detection': Unknown builtin op: torchvision::_new_empty_tensor_op. Could not find any similar ops to torchvision::_new_empty_tensor_op. This op may not exist or may not be currently supported in TorchScript.
```
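Loading the same file outside Triton helps isolate the problem. Below is a minimal sanity check, a sketch assuming the same torch==1.4.0 / torchvision==0.5.0 environment used for the export (the 3x300x400 dummy image is an arbitrary choice):

```python
import torch
import torchvision  # importing torchvision registers its custom TorchScript ops

# If this load succeeds locally but fails inside Triton, the server's libtorch
# build is missing (or ships a different version of) the torchvision op the
# scripted graph calls, e.g. torchvision::_new_empty_tensor_op.
model = torch.jit.load('model.pt')
model.eval()

# Smoke test with one dummy image, following the torchvision detection
# convention of passing a list of 3xHxW tensors.
with torch.no_grad():
    output = model([torch.rand(3, 300, 400)])
print(output)
```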



**Expected behavior**

Triton can run inference on a fasterrcnn_resnet50_fpn TorchScript model.

***
Which version of Pytorch is used in `nvcr.io/nvidia/tritonserver:20.03-py3`?
According to the [release notes](https://docs.nvidia.com/deeplearning/sdk/inference-release-notes/rel_20-03.html#rel_20-03) it is PyTorch 1.3.0.
However, the Dockerfile seems to use the PyTorch image [nvcr.io/nvidia/pytorch:20.03-py3](https://github.com/NVIDIA/triton-inference-server/blob/v1.12.0/Dockerfile#L32), which in turn, according to its [release notes](https://docs.nvidia.com/deeplearning/frameworks/pdf/PyTorch-Release-Notes.pdf), is based on PyTorch 1.5.0.
If it is PyTorch 1.3.0 after all, when can a Triton release with an upgraded PyTorch version be expected?
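As a side note, the bundled version can be checked empirically instead of inferred from the docs. A minimal probe, assuming a Python shell inside the nvcr.io/nvidia/pytorch:20.03-py3 image that the Triton Dockerfile builds from:

```python
# Run inside nvcr.io/nvidia/pytorch:20.03-py3 to see which PyTorch build
# the 20.03 images are based on.
import torch

print(torch.__version__)   # per the PyTorch container release notes, a 1.5.0 build
print(torch.version.cuda)  # CUDA toolkit the build was compiled against
```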
CoderHam commented 4 years ago

Yes, Tritonserver 20.03 does indeed use PyTorch 1.5.0 from nvcr.io/nvidia/pytorch:20.03-py3. The docs will be fixed to reflect this. Thank you for pointing that out.

I would recommend using the Tritonserver 20.01 (or 19.12) container, since those are based on PyTorch 1.4.0 and libtorch is known to have brittle backward compatibility between some versions. It is likely that the op in question was modified or removed in 1.5.0.
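Following up on that recommendation, a small guard in the export script can catch such mismatches before the model ever reaches the server. This is only a sketch; the expected version strings below assume the PyTorch 1.4.0 / torchvision 0.5.0 pairing of the 20.01 container and should be adjusted per the release notes:

```python
import torch
import torchvision

# Versions the serving container is assumed to ship (here, the pairing
# suggested above for tritonserver:20.01-py3); adjust per release notes.
EXPECTED_TORCH = "1.4.0"
EXPECTED_TORCHVISION = "0.5.0"

assert torch.__version__.startswith(EXPECTED_TORCH), \
    f"exporting with torch {torch.__version__}, server expects {EXPECTED_TORCH}"
assert torchvision.__version__.startswith(EXPECTED_TORCHVISION), \
    f"exporting with torchvision {torchvision.__version__}, server expects {EXPECTED_TORCHVISION}"

# Same export as in the report, now pinned to the serving versions.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn().eval()
torch.jit.script(model).save('model.pt')
```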