Closed — helenHlz closed this issue 2 years ago
Hi @helenHlz,
If this is a TFServing issue, please reach out to the TFServing folks; it looks like you already have, here: https://github.com/tensorflow/serving/issues/2021
If you can reproduce a similar issue serving your model with Triton, then please open a new issue here.
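For anyone attempting that reproduction, here is a minimal sketch of a Triton `config.pbtxt` for serving a TensorFlow SavedModel. The model name, tensor names, data types, and dimensions below are placeholders for illustration, not values taken from this issue; adjust them to match your own model's signature.

```protobuf
# Hypothetical minimal model configuration for a TensorFlow SavedModel.
# All names and shapes below are assumptions; replace with your model's.
name: "tftrt_int8_model"
platform: "tensorflow_savedmodel"
max_batch_size: 8
input [
  {
    name: "input_tensor"
    data_type: TYPE_FP32
    dims: [ 224, 224, 3 ]
  }
]
output [
  {
    name: "output_tensor"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
```

This would sit in the model repository as `<repo>/tftrt_int8_model/config.pbtxt`, with the SavedModel under a numbered version directory such as `1/model.savedmodel/`.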
Description I used TF Serving to deploy a TF-TRT INT8-optimized model on an NVIDIA T4 card, and I got this error: "Assertion `batchSize > 0' failed".
This is the log:
When I run offline prediction with the same TF-TRT INT8-optimized model, it works fine.
The strange thing is that deploying the TF-TRT FP16-optimized model also works fine.
Here is the log:
Triton Information What version of Triton are you using? TensorFlow 1.15.0, TensorRT 5.1.5
To Reproduce I can upload some code if needed.
Expected behavior I'm confused about why this problem occurs only when deploying, since offline inference with the INT8 model works and deploying the FP16 model causes no problems. I found that someone else had this problem too, but the answer there didn't help me solve it: https://github.com/triton-inference-server/server/issues/550