triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

Inquiry Regarding Triton Inference Server and PyTorch Integration #6998

Open luvpine opened 4 months ago

luvpine commented 4 months ago

Hello.

I am writing to inquire about the PyTorch version used in the Triton Inference Server 24.01 release.

Upon reviewing the documentation, I noticed that Triton 24.01 includes PyTorch version 2.2.0a0+81ea7a4. However, I observed that the official PyTorch 2.2.0 release's latest commit is 8ac9b20.

I was curious about the reasoning behind not using the latest commit of the official PyTorch release in Triton, and would appreciate any insight into this decision. Alternatively, I wanted to confirm that using the Triton Inference Server 24.01 release without the full sources of the official PyTorch codebase would not cause any issues or limitations.
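For reference, this is how I checked the bundled build; a minimal sketch, assuming the matching NGC PyTorch container (nvcr.io/nvidia/pytorch:24.01-py3), which I believe ships the same 2.2.0a0 build used by the Triton 24.01 PyTorch backend:

```python
# Run inside the nvcr.io/nvidia/pytorch:24.01-py3 NGC container (assumption:
# it ships the same PyTorch build as the Triton 24.01 PyTorch backend).
import torch

# The PEP 440 version string, e.g. "2.2.0a0+81ea7a4"
print(torch.__version__)

# The full git commit hash the build was produced from
print(torch.version.git_version)
```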

Furthermore, I wanted to inquire about the possibility of building the PyTorch backend with the official PyTorch codebase. After reviewing the Triton backend documentation, I am unsure if this is feasible. I would be grateful if you could clarify whether this is an option and, if so, provide any guidance on how to proceed.

I appreciate any information or assistance you can offer.

indrajit96 commented 4 months ago

@luvpine Thanks a lot for reaching out. CC @nv-kmcgill53 @rmccorm4 @mc-nv

mc-nv commented 4 months ago

> I am writing to inquire about the PyTorch version used in the Triton Inference Server 24.01 release.
>
> Upon reviewing the documentation, I noticed that Triton 24.01 includes PyTorch version 2.2.0a0+81ea7a4. However, I observed that the official PyTorch 2.2.0 release's latest commit is 8ac9b20.
>
> I was curious about the reasoning behind not using the latest commit of the official PyTorch release in Triton, and would appreciate any insight into this decision. Alternatively, I wanted to confirm that using the Triton Inference Server 24.01 release without the full sources of the official PyTorch codebase would not cause any issues or limitations.

If you are observing a functional limitation, please use a newer version. Triton's internal release preparation takes place before the public release, and the code freeze may happen earlier; the same is true for PyTorch. For additional details on how pre-release builds are versioned, please review PEP 440: https://peps.python.org/pep-0440/
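Concretely, under PEP 440 the string 2.2.0a0+81ea7a4 is an alpha pre-release cut from PyTorch's development branch before the 2.2.0 tag, with the local version segment +81ea7a4 recording the source commit. A minimal sketch using the packaging library (not Triton-specific, just illustrating the notation):

```python
# pip install packaging
from packaging.version import Version

v = Version("2.2.0a0+81ea7a4")
print(v.is_prerelease)  # True -> "a0" marks an alpha pre-release of 2.2.0
print(v.pre)            # ('a', 0)
print(v.local)          # '81ea7a4' -> local segment, here the build's git commit
```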

> Furthermore, I wanted to inquire about the possibility of building the PyTorch backend with the official PyTorch codebase. After reviewing the Triton backend documentation, I am unsure if this is feasible. I would be grateful if you could clarify whether this is an option and, if so, provide any guidance on how to proceed.

Triton is an open-source product; please feel free to make modifications as your needs require, but we can't guarantee success or provide support for such modifications.

Triton is part of the NVIDIA Optimized Frameworks, and development is restricted to the libraries defined in the support matrix: https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html
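If you do want to experiment, the triton-inference-server/pytorch_backend repository describes an out-of-container CMake build; below is a rough, unsupported sketch driving it from Python. The r24.01 branch name and the TRITON_PYTORCH_DOCKER_IMAGE and repo-tag flags are taken from that repository's README, but verify them against the branch matching your release:

```python
# Unsupported sketch: clone the PyTorch backend at the release branch and
# build it against a chosen PyTorch container's libtorch. Requires git,
# cmake, make, and Docker on the host.
import subprocess

subprocess.run(
    ["git", "clone", "-b", "r24.01",
     "https://github.com/triton-inference-server/pytorch_backend.git"],
    check=True,
)
subprocess.run(
    ["cmake",
     "-B", "build", "-S", "pytorch_backend",
     "-DCMAKE_INSTALL_PREFIX:PATH=install",
     # libtorch is extracted from this image; swapping in a different image
     # (or pointing TRITON_PYTORCH_INCLUDE_PATHS / TRITON_PYTORCH_LIB_PATHS
     # at a local build) is where another PyTorch could be substituted.
     "-DTRITON_PYTORCH_DOCKER_IMAGE=nvcr.io/nvidia/pytorch:24.01-py3",
     # These tags should match the Triton release being targeted.
     "-DTRITON_BACKEND_REPO_TAG=r24.01",
     "-DTRITON_CORE_REPO_TAG=r24.01",
     "-DTRITON_COMMON_REPO_TAG=r24.01"],
    check=True,
)
subprocess.run(["make", "-C", "build", "install"], check=True)
```

Even so, pairing a stock PyTorch with a Triton release that was built and tested against the NGC build is exactly the unsupported territory mentioned above.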

luvpine commented 4 months ago

@mc-nv

Thank you for your prompt response.

To summarize, are you advising against rebuilding the Triton PyTorch backend with the official PyTorch release code? Or is it simply not a structure where one can succeed by placing the code in a specific location and building? If it is relatively easy to rebuild the backend by modifying only the code in a specific location, I would appreciate any tips you can provide. I am asking because the official Triton documentation does not cover this in detail.

Thank you for your assistance in advance.