Monalsingh opened this issue 2 years ago
I am not an NVIDIA Triton expert. Maybe @austinmw can answer this question. Hi @austinmw, would you mind taking a look?
Hi, I'm far from an expert, but judging from this documentation, MMDeploy supports TensorRT export for three MMPose models (HRNet, MSPN, LiteHRNet). So I would pick one of them and export it to TensorRT. Then create a model repository directory containing the TensorRT engine file and a config.pbtxt
that specifies the input and output shapes. Finally, you can run `tritonserver --model-repository=/models`.
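To make that more concrete, here's a rough sketch of what the model repository and config.pbtxt could look like for an exported TensorRT engine. The model name, tensor names, and shapes below are placeholders (they depend on how you exported the engine), so check them against your actual model:

```
models/
└── hrnet_trt/
    ├── 1/                 # numeric version directory required by Triton
    │   └── model.plan     # the exported TensorRT engine, renamed to model.plan
    └── config.pbtxt
```

```
name: "hrnet_trt"
platform: "tensorrt_plan"
max_batch_size: 0            # 0 = dims below include the batch dimension
input [
  {
    name: "input"            # placeholder -- use the engine's real input name
    data_type: TYPE_FP32
    dims: [ 1, 3, 256, 192 ]
  }
]
output [
  {
    name: "output"           # placeholder -- use the engine's real output name
    data_type: TYPE_FP32
    dims: [ 1, 17, 64, 48 ]
  }
]
```

Triton expects the engine file to be named `model.plan` inside a numeric version directory; for an ONNX model the file would be `model.onnx` and the platform `onnxruntime_onnx` instead.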
@austinmw I'm interested in deploying mmdetection models to Triton. Is TensorRT the only backend that you found to be supported? Am I not able to use torchscript / libtorch?
Hi,
I am trying to use MMPose with the NVIDIA Triton Inference Server, but Triton does not serve raw PyTorch models; it supports TorchScript, ONNX, and a few other formats. So I have converted the MMPose MobileNetV2 model to ONNX using MMDeploy.

My questions are:

1) How do I use the converted (ONNX) model within the MMPose framework?
2) Triton uses its own way to run inference, for example:
   `triton_client.infer(model_name, model_version=model_version, inputs=input, outputs=output)`
   while MMDeploy uses its own API, for example:
   `from mmdeploy_python import PoseDetector`
   `detector = PoseDetector(model_path=args.model_path, device_name=args.device_name, device_id=0)`
   How am I supposed to load and run the model the Triton way instead of through MMDeploy's PoseDetector?

I have been stuck on this for a long time.
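Regarding the second question: when the ONNX model is served by Triton, PoseDetector is not used at all. Triton only runs the network's forward pass, so the preprocessing (resize, normalize, NCHW) and postprocessing (heatmap decoding) that PoseDetector handles internally have to be done on the client side. Below is a rough, untested sketch using the `tritonclient` HTTP API; the model name, tensor names, input size, and normalization constants are assumptions that you would need to match to your exported model and its MMPose config:

```python
import numpy as np
import cv2
import tritonclient.http as httpclient

# Assumed names/shapes -- replace with whatever your exported ONNX model actually uses.
MODEL_NAME = "mobilenetv2_pose"
INPUT_NAME = "input"          # input tensor name of the exported ONNX model
OUTPUT_NAME = "output"        # heatmap output tensor name
INPUT_H, INPUT_W = 256, 192   # input resolution used at export time

client = httpclient.InferenceServerClient(url="localhost:8000")

# Preprocessing that PoseDetector normally does for you: resize, BGR->RGB, normalize, NCHW.
img = cv2.imread("person_crop.jpg")
img = cv2.resize(img, (INPUT_W, INPUT_H)).astype(np.float32)
mean = np.array([123.675, 116.28, 103.53], dtype=np.float32)   # typical MMPose values; check your config
std = np.array([58.395, 57.12, 57.375], dtype=np.float32)
img = (img[..., ::-1] - mean) / std
blob = np.ascontiguousarray(img.transpose(2, 0, 1)[None])      # HWC -> 1xCxHxW

# Build the request the Triton way instead of through PoseDetector.
infer_input = httpclient.InferInput(INPUT_NAME, list(blob.shape), "FP32")
infer_input.set_data_from_numpy(blob)
infer_output = httpclient.InferRequestedOutput(OUTPUT_NAME)

result = client.infer(MODEL_NAME, inputs=[infer_input], outputs=[infer_output])
heatmaps = result.as_numpy(OUTPUT_NAME)        # e.g. (1, num_keypoints, H/4, W/4)

# Minimal postprocessing (argmax over each heatmap) that PoseDetector also does internally.
num_kpts = heatmaps.shape[1]
flat_idx = heatmaps.reshape(num_kpts, -1).argmax(axis=1)
coords = np.stack(np.unravel_index(flat_idx, heatmaps.shape[2:]), axis=1)
print(coords)  # (num_keypoints, 2) as (row, col) in heatmap space; scale back to the image yourself
```

This bypasses MMPose/MMDeploy entirely at inference time; the trade-off is that you are responsible for keeping the client-side pre- and postprocessing in sync with the config the model was trained and exported with.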