triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

Same model but different results between triton and native tensorrt engine #3452

Closed HoangTienDuc closed 1 year ago

HoangTienDuc commented 2 years ago

Description

I have one classification model. I tried to deploy it using both Triton and a native TensorRT engine. To my surprise, these two deployment methods give me different results. Same preprocessing, same postprocessing; the only difference is the serving framework.

Triton Information
TensorRT 7.2.1 - Triton NGC container 20.11-py3 - dGPU: RTX 2080 Ti

Are you using the Triton container or did you build it yourself? Triton container

To Reproduce
TensorRT engine code: to avoid a very long issue, I pushed my code to a gist. Please help me take a look.

Expected behavior
This is just my test comparing TensorRT and Triton. I expect the same output results from these two methods.
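When comparing two deployment paths of the same model, it helps to quantify the difference rather than eyeball it: small floating-point deviations between frameworks are normal, while a changed predicted class is not. A minimal comparison sketch (the function name, tolerances, and the sample logit arrays below are illustrative, not from the issue):

```python
import numpy as np

def compare_outputs(trt_out, triton_out, rtol=1e-3, atol=1e-5):
    """Summarize elementwise differences between two inference outputs.

    trt_out / triton_out are hypothetical arrays holding raw logits from
    each deployment path; tolerances are placeholders to tune per model.
    """
    trt_out = np.asarray(trt_out, dtype=np.float32)
    triton_out = np.asarray(triton_out, dtype=np.float32)
    abs_diff = np.abs(trt_out - triton_out)
    return {
        "max_abs_diff": float(abs_diff.max()),
        "allclose": bool(np.allclose(trt_out, triton_out, rtol=rtol, atol=atol)),
        "argmax_match": bool(np.argmax(trt_out) == np.argmax(triton_out)),
    }

# Tiny logit differences with matching argmax usually indicate ordinary
# floating-point variance between backends rather than a serving bug.
print(compare_outputs([0.1, 0.9], [0.1000001, 0.8999999]))
```

If `argmax_match` is False or `max_abs_diff` is large, the first things to rule out are mismatched preprocessing (layout, normalization, dtype) and differing precision modes (FP32 vs FP16/INT8) between the two builds.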

stellaywu commented 2 years ago

Hi @HoangTienDuc Wondering if you've figured out the issue and could share some thoughts. I'm seeing a similar problem where the Triton PyTorch backend produces different results than native PyTorch inference on a Detectron2 model. Thanks!

CoderHam commented 2 years ago

@HoangTienDuc @stellaywu did you run these for multiple or single instances? Were the framework versions and the models the same?

HoangTienDuc commented 2 years ago

@HoangTienDuc @stellaywu did you run these for multiple or single instances? Were the framework versions and the models the same?

Yes, the framework versions and the model are the same. Since this was only a test, I ran Triton with a single model instance.
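CoderHam's question about instance counts refers to Triton's `instance_group` setting in the model's `config.pbtxt`. A single-instance GPU configuration like the one described here would look roughly as follows (the model name and platform are placeholders; field names follow Triton's model configuration schema):

```protobuf
name: "classifier"
platform: "tensorrt_plan"
instance_group [
  {
    count: 1
    kind: KIND_GPU
  }
]
```

With `count: 1` there is only one execution instance, so result differences cannot be explained by concurrent instances racing on the same GPU.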

HeeebsInc commented 1 year ago

@HoangTienDuc did you find a solution to this? I also have a similar issue when comparing inference between deepstream, triton, and pure pytorch (and within each version). Deepstream 6.1 and deepstream 6.2 give different results, and triton 23.05 and triton 22.05 give different results

dyastremsky commented 1 year ago

Closing this issue due to its age and inactivity. Triton has changed a lot in the last two years. We have also run validation on Triton results for TensorRT.

For any issues in Triton's recent releases, please open a new issue following the bug template. We need clear steps to reproduce your issue, including the models and files needed. If you cannot share your model, feel free to use a sample model that reproduces the same issue.