triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

How can I install TF2 in the Triton python_backend? #5774

Open dathudeptrai opened 1 year ago

dathudeptrai commented 1 year ago

Description: I want to run custom TF2 (with TF-TRT) code in nvcr.io/nvidia/tritonserver:23.04-py3. Basically, I want to install a TF2 build that aligns with this Docker image, https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorflow, so that TF2 can run with CUDA 12.1 and the latest TensorRT 8.6.1.
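For reference, "running custom TF2 code in the python backend" means shipping a model.py like the minimal sketch below. This is only an illustration of the intended setup, assuming a compatible TF2 wheel is already installed in the container; the SavedModel path and the INPUT0/OUTPUT0 tensor names are placeholders, not from this issue.

```python
import tensorflow as tf
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # Placeholder path: a TF2 SavedModel stored inside the model repository.
        # Assumes the SavedModel object is directly callable.
        self.model = tf.saved_model.load("/models/my_tf2_model/1/saved_model")

    def execute(self, requests):
        responses = []
        for request in requests:
            # "INPUT0" / "OUTPUT0" are assumed tensor names from config.pbtxt.
            input0 = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            result = self.model(tf.constant(input0.as_numpy()))
            out = pb_utils.Tensor("OUTPUT0", result.numpy())
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses
```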

Maybe the best way is to build tritonserver with 23.04-tf2-py3 as the base Docker image, but how?

Triton Information: nvcr.io/nvidia/tritonserver:23.04-py3

tanmayv25 commented 1 year ago

@dathudeptrai Why do you want to run the inference using the python backend? Would using the TF backend in Triton not solve your problem? https://github.com/triton-inference-server/tensorflow_backend#build-the-tensorflow-backend-with-custom-tensorflow

dathudeptrai commented 1 year ago

@tanmayv25 I am trying to build a Triton Server Docker image that supports all backends and also allows custom preprocessing/postprocessing steps. Some things, like text generation (beam search, ...), are not easy to compile entirely into a TensorFlow graph. Also, for some models, raw TF + XLA is even 1.5-2x faster than TF-TRT.
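For clarity, "raw TF + XLA" here means compiling the TF function with the XLA JIT, e.g. via tf.function(jit_compile=True). A minimal sketch, illustrative only and not code from this thread:

```python
import tensorflow as tf

# XLA-compiled version of a plain TF function (jit_compile=True enables the XLA JIT).
@tf.function(jit_compile=True)
def dense_step(x, w):
    return tf.nn.relu(tf.matmul(x, w))

x = tf.random.normal([64, 1024])
w = tf.random.normal([1024, 1024])
print(dense_step(x, w).shape)  # (64, 1024)
```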

BTW, actually, I just need a TF wheel that matches nvcr.io/nvidia/tensorflow:23.04-tf2-py3, for example :D. So that I can pull the Triton Server Docker image and pip install tensorflow.whl :D
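If a matching wheel is pip installed into the tritonserver image, a quick sanity check like the sketch below (an assumed workflow, not something confirmed in this thread) can verify that the wheel's CUDA/cuDNN build matches the container and that the GPU is visible:

```python
import tensorflow as tf

# Report the CUDA/cuDNN versions the wheel was built against and the visible GPUs.
build = tf.sysconfig.get_build_info()
print(tf.__version__, build.get("cuda_version"), build.get("cudnn_version"))
print(tf.config.list_physical_devices("GPU"))
```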