Hi @anhquan075. Thanks for creating a detailed issue report with all relevant information and screenshots to boot!
The Triton Inference Server's gRPC interface returns the output tensor as raw bytes in the `raw_output_contents` field for performance reasons (I tried to find a good doc page describing this, but only found this issue comment). For each output tensor in `outputs`, there will be an entry in `raw_output_contents` with a base64 encoded string of the bytes of the raw data for the tensor. The `outputs` metadata tells you the shape and datatype that you then need to parse from the bytes. For your example, the output is 12 32-bit floats that will need to be parsed out.
Here's an example in Go from our FVTs where we do this parsing for our test request: https://github.com/kserve/modelmesh-serving/blob/b5affffb642a7b65877e50923b29db24f09f8265/fvt/inference.go#L374-L380
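For illustration, here is that parsing step as a minimal Python sketch. It assumes a single FP32 output of shape (1, 12), as in this example; the base64 string is a stand-in built inside the snippet, since the real one would come from the grpcurl response.

```python
import base64

import numpy as np

# Stand-in for the base64 string grpcurl prints for raw_output_contents.
# (A native gRPC client hands you the raw bytes directly, with no base64 step.)
fake_raw_bytes = np.arange(12, dtype=np.float32).tobytes()
raw_b64 = base64.b64encode(fake_raw_bytes).decode("ascii")

# The datatype (FP32) and shape (1, 12) are hardcoded here for the sketch;
# in practice, read them from the `outputs` metadata in the response.
raw_bytes = base64.b64decode(raw_b64)
values = np.frombuffer(raw_bytes, dtype=np.float32).reshape(1, 12)
print(values)
```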
The Triton Inference Server Client could also be used to do this output post-processing.
Let me know if this helps or if it doesn't resolve your issue!
@tjohnson31415 thank you for the response. I will try parsing the byte contents to get the output. By the way, I also found a way to fix it: using the gRPC client (`tritonclient.grpc`) in the `tritonclient` Python SDK.
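For anyone landing here, a minimal sketch of that approach, assuming `tritonclient[grpc]` is installed; the server address, model name, and tensor names/shapes below are placeholders, not values taken from this issue.

```python
import numpy as np
import tritonclient.grpc as grpcclient

# Placeholder server address; ModelMesh's gRPC port is commonly 8033, but
# use whatever your deployment exposes.
client = grpcclient.InferenceServerClient(url="localhost:8033")

# Placeholder input tensor; the name, shape, and datatype must match the
# model's metadata.
input_data = np.zeros((1, 3), dtype=np.float32)
infer_input = grpcclient.InferInput("input__0", list(input_data.shape), "FP32")
infer_input.set_data_from_numpy(input_data)

result = client.infer(
    model_name="my-model",  # placeholder model name
    inputs=[infer_input],
    outputs=[grpcclient.InferRequestedOutput("output__0")],
)

# as_numpy does the raw-bytes parsing described above for you.
print(result.as_numpy("output__0"))
```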
**Describe the bug**
Triton Inference Server doesn't return any values when using the gRPC protocol. This issue is specifically observed in gRPC requests; the RESTful protocol works as expected. Additionally, the gRPC protocol of `mlserver` also returns values as expected.

**To Reproduce**
Steps to reproduce the behavior: create an `isvc` that uses the Triton Inference Server `ServingRuntime` with the gRPC protocol.

**Expected behavior**
When making inference requests using the gRPC protocol, Triton Inference Server should return the expected values, similar to how it behaves with the RESTful protocol and the gRPC protocol of `mlserver`.

**Actual behavior**
The server does not return any values for inference requests made using the gRPC protocol with Triton Inference Server.
**Screenshots**
My `curl` example with the RESTful protocol with Triton Inference Server: (screenshot)
Meanwhile, my `grpcurl` example for the gRPC protocol with Triton Inference Server: (screenshot)
When I run a model using `mlserver` runtimes via gRPC, I receive the desired outputs as intended.
**Environment (please complete the following information):**
v1.25.5