dusty-nv / jetson-inference

Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.
https://developer.nvidia.com/embedded/twodaystoademo
MIT License

How to deploy your own trained model in jetson-inference #1785

Closed jzymessi closed 9 months ago

jzymessi commented 9 months ago

I have trained a YOLOX model and converted it into a .trt FP16 model, which I can already run for inference with TensorRT directly. Now I want to run inference with this model in jetson-inference. Are there any reference examples?

dusty-nv commented 9 months ago

Hi @jzymessi, this repo doesn't explicitly support YOLOX; you would need to adapt the pre/post-processing routines in c/detectNet.cpp for it.

It looks like upstream YOLOX already supports TensorRT though: https://yolox.readthedocs.io/en/latest/demo/trt_py_readme.html
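
For reference, the post-processing that would need to be added is essentially the decode step that YOLOX's own demos run on the raw network output. Below is a rough NumPy sketch of that logic (grid offsets plus per-stride scaling, then objectness times class score); the 640x640 input size, stride set, and threshold are illustrative assumptions, and the actual adaptation would mean reimplementing this (plus NMS) inside detectNet's C++ post-processing path:

```python
# Hypothetical standalone sketch of YOLOX output decoding -- the logic that would
# need to be ported into detectNet.cpp's post-processing. Assumes the exported
# model outputs a flat [num_preds, 5 + num_classes] tensor of raw predictions
# (box regression, objectness, class scores) over strides 8/16/32.
import numpy as np

def decode_yolox(outputs, input_size=(640, 640), strides=(8, 16, 32), conf_thresh=0.3):
    """Convert raw YOLOX predictions into (x1, y1, x2, y2) boxes, scores, and class IDs."""
    grids, expanded_strides = [], []
    for stride in strides:
        hsize, wsize = input_size[0] // stride, input_size[1] // stride
        xv, yv = np.meshgrid(np.arange(wsize), np.arange(hsize))
        grids.append(np.stack((xv, yv), 2).reshape(-1, 2))
        expanded_strides.append(np.full((hsize * wsize, 1), stride))
    grids = np.concatenate(grids, 0)
    expanded_strides = np.concatenate(expanded_strides, 0)

    # box decode: center = (pred + grid) * stride, size = exp(pred) * stride
    cxcy = (outputs[:, :2] + grids) * expanded_strides
    wh = np.exp(outputs[:, 2:4]) * expanded_strides

    scores = outputs[:, 4:5] * outputs[:, 5:]          # objectness * class probability
    cls_id = scores.argmax(1)
    cls_score = scores.max(1)
    keep = cls_score > conf_thresh

    boxes = np.concatenate([cxcy - wh / 2, cxcy + wh / 2], 1)
    return boxes[keep], cls_score[keep], cls_id[keep]  # NMS still needed afterwards
```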

jzymessi commented 9 months ago

Hi @dusty-nv, thank you for your reply. I roughly understand that I need to add YOLOX pre/post-processing in detectNet.cpp. However, I am a bit confused: as far as I know, jetson-inference supports Caffe, UFF, and ONNX model formats. Is UFF the format that TensorRT itself uses, i.e. the .trt or .engine files?

dusty-nv commented 9 months ago

ONNX is mostly used these days. UFF was a holdover from TensorFlow, but I wouldn't use that going forward. Regardless, this isn't about which file format the model weights are in, but rather about what input/output tensors the model expects.
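
To make the tensor point concrete: when jetson-inference loads a custom ONNX detector, you tell it which input/output blobs to bind. A sketch with the Python API, assuming a recent build and using the blob names from this repo's SSD-Mobilenet retraining tutorial (a YOLOX export would have different tensor names and a different output layout, which is exactly why the post-processing needs adapting):

```python
# Sketch: loading a custom ONNX detection model with jetson-inference's Python API.
# The blob names below follow the repo's SSD-Mobilenet retraining example; model,
# label, and image paths are placeholders.
from jetson_inference import detectNet
from jetson_utils import loadImage

net = detectNet(model="model/ssd-mobilenet.onnx", labels="model/labels.txt",
                input_blob="input_0", output_cvg="scores", output_bbox="boxes",
                threshold=0.5)

img = loadImage("test.jpg")
detections = net.Detect(img)
for d in detections:
    print(d.ClassID, d.Confidence, d.Left, d.Top, d.Right, d.Bottom)
```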

jzymessi commented 9 months ago

If I use an ONNX model for inference, will the inference run on the CPU or the GPU?

dusty-nv commented 9 months ago

This repo uses TensorRT for inference (TensorRT imports the ONNX), which runs on the GPU (or optionally on the DLA on Xavier/Orin).
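
A minimal sketch of that path, assuming a recent build: on first load TensorRT parses the ONNX and builds an optimized (typically FP16) engine for the GPU, which can take a few minutes and, in my understanding, gets cached next to the model file so later startups are fast. Paths here are placeholders:

```python
# Sketch (assumed recent jetson-inference build): exercising the TensorRT/GPU path.
from jetson_inference import detectNet
from jetson_utils import loadImage

# First load: TensorRT parses the ONNX and builds an optimized engine for the GPU.
net = detectNet(model="model/ssd-mobilenet.onnx", labels="model/labels.txt",
                input_blob="input_0", output_cvg="scores", output_bbox="boxes")

img = loadImage("test.jpg")          # placeholder image path
for _ in range(10):                  # warm up and run a few inference passes
    net.Detect(img)

net.PrintProfilerTimes()             # per-stage timings reported by the network
```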

jzymessi commented 9 months ago

Thanks, I now know how to deploy this model.