Open · dathudeptrai opened this issue 1 year ago
@dathudeptrai Why do you want to run inference using the Python backend? Would using the TF backend in Triton not solve your problem? https://github.com/triton-inference-server/tensorflow_backend#build-the-tensorflow-backend-with-custom-tensorflow
@tanmayv25 I am about to build a Triton Server Docker image that supports all backends and also lets me customize the preprocessing/postprocessing steps. Some things are not easy to compile entirely into a TensorFlow graph, such as text generation (beam search, ...). Also, for some models, raw TF + XLA is even 1.5-2x faster than TF-TRT.
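For context, a minimal sketch of what such a Python-backend model could look like (the tensor names, the SavedModel path, and the greedy stand-in for beam search are all hypothetical; the real names come from your config.pbtxt and model repository layout):

```python
# models/tf2_generator/1/model.py -- hypothetical location inside the model repository
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # Arbitrary Python (including TF2) runs here, outside the TF backend.
        import tensorflow as tf
        self.tf = tf
        # Hypothetical path to a TF2 SavedModel packaged next to model.py.
        self.model = tf.saved_model.load("/models/tf2_generator/1/saved_model")

    def execute(self, requests):
        responses = []
        for request in requests:
            # "INPUT_IDS" / "OUTPUT_IDS" are placeholder names declared in config.pbtxt.
            input_ids = pb_utils.get_input_tensor_by_name(request, "INPUT_IDS").as_numpy()
            # Custom pre/post-processing and the generation loop (e.g. beam search)
            # stay in plain Python instead of being traced into the TF graph.
            logits = self.model(self.tf.constant(input_ids))["logits"].numpy()
            output_ids = np.argmax(logits, axis=-1).astype(np.int32)  # greedy stand-in for beam search
            out = pb_utils.Tensor("OUTPUT_IDS", output_ids)
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses
```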
BTW, actually, I just need the TF wheel version that matches nvcr.io/nvidia/tensorflow:23.04-tf2-py3, for example :D. That way I can pull the Triton Server Docker image and pip install the tensorflow wheel :D
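If you go the pip-install route, a quick sanity check (a sketch, assuming a TF 2.x GPU wheel that exposes build info) can confirm the installed wheel actually links against the CUDA/TensorRT stack shipped in the 23.04 container:

```python
# Verify that the installed TF wheel matches the container's CUDA / TensorRT stack.
import tensorflow as tf

build = tf.sysconfig.get_build_info()
print("TF version:      ", tf.__version__)
print("Built for CUDA:  ", build.get("cuda_version"))
print("Built for cuDNN: ", build.get("cudnn_version"))
print("GPUs visible:    ", tf.config.list_physical_devices("GPU"))

# TF-TRT is only usable if the converter imports cleanly against the installed TensorRT.
from tensorflow.python.compiler.tensorrt import trt_convert as trt
print("TF-TRT module imported:", trt.__name__)
```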
Description
I want to run custom TF2 (with TF-TRT) code in nvcr.io/nvidia/tritonserver_23.04-py3. Basically, I want to install a TF2 build that aligns with this Docker image: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorflow, so that I have a TF2 that runs with CUDA 12.1 and the latest TensorRT 8.6.1. Maybe the best way is to build Triton Server with 23.04-tf2-py3 as the base Docker image, but how?

Triton Information
nvcr.io/nvidia/tritonserver_23.04-py3