triton-inference-server / pytriton

PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.
https://triton-inference-server.github.io/pytriton/
Apache License 2.0

onnx and tensorrt model supported? #57

Closed oreo-lp closed 5 months ago

oreo-lp commented 5 months ago

Does PyTriton support ONNX and TensorRT?

piotrm-nvidia commented 5 months ago

Thank you for your interest in PyTriton. PyTriton does not directly use any machine learning framework, but it lets you run any Python code with any framework of your choice, such as PyTorch, TensorFlow, or JAX. ONNX and TensorRT provide Python bindings, so you can run such models directly from Python code. PyTriton does not handle the conversion of models to ONNX or TensorRT formats and does not provide dedicated bindings for these formats.
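For example, here is a minimal sketch (not an official example) of serving an ONNX model through PyTriton by calling ONNX Runtime from the inference callable. The model file name ("model.onnx"), the tensor names ("input", "output"), and the shapes/dtypes are assumptions for illustration:

```python
import numpy as np
import onnxruntime as ort

from pytriton.decorators import batch
from pytriton.model_config import ModelConfig, Tensor
from pytriton.triton import Triton

# Load the ONNX model once; the session is reused across requests.
# "model.onnx" and the tensor names below are hypothetical.
session = ort.InferenceSession("model.onnx")


@batch
def infer_fn(input):
    # Run the ONNX Runtime session on the batched NumPy input.
    (output,) = session.run(None, {"input": input})
    return {"output": output}


with Triton() as triton:
    triton.bind(
        model_name="onnx_model",
        infer_func=infer_fn,
        inputs=[Tensor(name="input", dtype=np.float32, shape=(-1,))],
        outputs=[Tensor(name="output", dtype=np.float32, shape=(-1,))],
        config=ModelConfig(max_batch_size=32),
    )
    triton.serve()
```

The same pattern applies to TensorRT: load the engine through the TensorRT Python bindings and invoke it inside the inference callable.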

One tool you can use is Triton Model Navigator, which automates model optimization and deployment on the Triton Inference Server. It handles model export, conversion, correctness testing, and profiling, selects the optimal model format, and saves the generated artifacts for inference deployment. Model Navigator integrates with PyTriton and targets formats such as ONNX and TensorRT, so you can use its APIs to drive these components.
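As a rough, hedged sketch of that flow (exact function names vary between Model Navigator releases, so treat the calls below as assumptions and check the Model Navigator documentation; the toy model and file name are hypothetical):

```python
import torch
import model_navigator as nav

# A toy model and a list of sample inputs used for conversion and profiling.
model = torch.nn.Linear(10, 10).eval()
dataloader = [torch.randn(1, 10) for _ in range(10)]

# Export, convert (e.g. to ONNX/TensorRT), verify, and profile the model;
# the returned package holds the optimized artifacts.
package = nav.torch.optimize(model=model, dataloader=dataloader)

# Persist the artifacts for later inference deployment.
nav.package.save(package, "linear.nav")
```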

To learn more about how to use PyTriton with ONNX and TensorRT models, you can check the following resources:

I hope this helps. Please let me know if you have any other questions.

github-actions[bot] commented 5 months ago

This issue is stale because it has been open 21 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] commented 5 months ago

This issue was closed because it has been stalled for 7 days with no activity.