triton-inference-server / client

Triton Python, C++ and Java client libraries, and GRPC-generated client examples for go, java and scala.
BSD 3-Clause "New" or "Revised" License

fix: Get validation outputs by name rather than index #697

Closed krishung5 closed 3 months ago

krishung5 commented 3 months ago

We were seeing the L0_long_running_stress test fail because an output didn't match the expected output.

Successfully read data for 1 stream/streams with 3 step/steps.
...
Inferences/Second vs. Client Average Batch Latency
Concurrency: 2, throughput: 3138.92 infer/sec, latency 600 usec
Thread [1] had error: Output doesn't match expected output

It seems the order of infer_data_.outputs_ and infer_data_.expected_outputs_ can differ, so we need to make sure we compare each validation output against the expected output with the same name, rather than the one at the same index. Shout out to @rmccorm4, who quickly came up with the solution! This PR is based on https://github.com/triton-inference-server/client/pull/685
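The difference between the two comparison strategies can be sketched in a few lines of Python. This is an illustrative toy, not the actual client code: the `Out` record and the `validate_*` helpers are hypothetical stand-ins for the perf_analyzer structures.

```python
# Hypothetical sketch of the bug: comparing validation outputs by
# positional index silently assumes both lists share one ordering.
from collections import namedtuple

Out = namedtuple("Out", ["name", "data"])

def validate_by_index(outputs, expected_outputs):
    # Fragile: breaks whenever the response order differs.
    return all(o.data == e.data for o, e in zip(outputs, expected_outputs))

def validate_by_name(outputs, expected_outputs):
    # Robust: look each expected output up by its name.
    expected = {e.name: e for e in expected_outputs}
    return all(o.data == expected[o.name].data for o in outputs)

# Same tensors, but the response arrives in a different order.
response = [Out("OUTPUT1", [2, 3]), Out("OUTPUT0", [0, 1])]
expected = [Out("OUTPUT0", [0, 1]), Out("OUTPUT1", [2, 3])]
print(validate_by_index(response, expected))  # False: spurious mismatch
print(validate_by_name(response, expected))   # True
```

With index-based matching, a mere reordering of identical tensors reports a failure, which is exactly the intermittent "Output doesn't match expected output" symptom above.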

krishung5 commented 3 months ago

@rmccorm4 That's a great point! I'm not sure about the details, so I'll let David or Matt comment. It's passing the PA pipeline, so I hope this doesn't break anything further.

tanmayv25 commented 3 months ago

Triton does not guarantee the order in which outputs are returned in a response; that order depends on the order in which the backend added them. The name should be used as the identifier for retrieving results.