triton-inference-server / client

Triton Python, C++ and Java client libraries, and GRPC-generated client examples for go, java and scala.
BSD 3-Clause "New" or "Revised" License

Update GenAI-Perf metric unit assignment to avoid overwrites #721

Closed dyastremsky closed 1 week ago

dyastremsky commented 1 week ago

This PR updates the unit assignment for metrics in GenAI-Perf. It switches from a series of independent if statements to a single if/elif/else chain, so that a later branch can no longer overwrite a unit assigned by an earlier one. In the expected cases this should change nothing beyond a small efficiency gain, but it acts as a stopgap against unexpected behavior if a metric meets the criteria of multiple branches.
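To illustrate the overwrite this guards against, here is a minimal sketch. It is not the GenAI-Perf code: the function names, the substring checks, and the unit strings are all hypothetical and chosen only to show how the two control-flow styles differ when a metric name matches more than one branch.

```python
def unit_with_independent_ifs(metric_name: str) -> str:
    """Hypothetical pre-change style: every condition is evaluated."""
    unit = ""
    if "latency" in metric_name:
        unit = "ms"
    if "throughput" in metric_name:
        unit = "per sec"
    if "token" in metric_name:
        # Runs even if an earlier branch already matched,
        # so it silently overwrites that earlier unit.
        unit = "tokens"
    return unit


def unit_with_chain(metric_name: str) -> str:
    """Hypothetical post-change style: the first matching branch wins."""
    if "latency" in metric_name:
        unit = "ms"
    elif "throughput" in metric_name:
        unit = "per sec"
    elif "token" in metric_name:
        unit = "tokens"
    else:
        unit = ""
    return unit


if __name__ == "__main__":
    # A made-up metric name that matches both "latency" and "token".
    name = "inter_token_latency"
    print(unit_with_independent_ifs(name))  # last matching branch wins
    print(unit_with_chain(name))            # first matching branch wins
```

For a name that matches exactly one branch, both functions return the same unit, and the chained version stops evaluating conditions after the first match. Note that the chain makes the result depend on branch order rather than removing the ambiguity, which is why it is only a stopgap.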

Long-term, we want to identify a consistent way to categorize metrics that makes it impossible for a metric to match the criteria of multiple branches.

Note: For ease, this PR will be rebased on top of https://github.com/triton-inference-server/client/pull/715 once https://github.com/triton-inference-server/client/pull/719 is merged, and it will not be merged until then. Rather than run CI separately for this small change, it will be easier to incorporate it into the CI run for the larger feature branch. This also avoids potential conflicts from putting this change directly in main while the same lines of code are being updated in #719 or #715.