triton-inference-server / client

Triton Python, C++ and Java client libraries, and gRPC-generated client examples for Go, Java and Scala.
BSD 3-Clause "New" or "Revised" License

Move GenAI-Perf profiling to its own subcommand #745

Closed dyastremsky closed 2 weeks ago

dyastremsky commented 2 weeks ago

With compare being moved into a subcommand, there should also be a profile subcommand for profiling with GenAI-Perf. Currently, the profiling args just float at the top level: compare is invoked as genai-perf compare <args>, but profiling is invoked as genai-perf <args> with no subcommand. This is inconsistent and can lead to special handling or parsing errors.

With this change, genai-perf profile <args> is the sole entry point for profiling models via GenAI-Perf. All documentation and tests are updated to match.
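For readers unfamiliar with the pattern, here is a minimal sketch of a CLI where profile and compare are sibling subcommands, written with Python's standard argparse. It is not GenAI-Perf's actual parser: the handler functions, the flags shown (`-m/--model`, `--files`) and the return strings are illustrative assumptions chosen only to show how making the subcommand mandatory removes the need to special-case floating top-level args.

```python
import argparse


def run_profile(args: argparse.Namespace) -> str:
    # Hypothetical handler: a real tool would launch the profiling run here.
    return f"profiling {args.model}"


def run_compare(args: argparse.Namespace) -> str:
    # Hypothetical handler: a real tool would load and compare result files here.
    return f"comparing {', '.join(args.files)}"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="genai-perf")
    # required=True means a bare `genai-perf <args>` is rejected, so every
    # invocation must name a subcommand explicitly.
    subparsers = parser.add_subparsers(dest="subcommand", required=True)

    profile = subparsers.add_parser("profile", help="Profile a model")
    profile.add_argument("-m", "--model", required=True)  # illustrative flag
    profile.set_defaults(func=run_profile)

    compare = subparsers.add_parser("compare", help="Compare previous runs")
    compare.add_argument("--files", nargs="+", required=True)  # illustrative flag
    compare.set_defaults(func=run_compare)

    return parser


def main(argv=None) -> str:
    args = build_parser().parse_args(argv)
    # Each subparser registered its own handler, so dispatch is uniform.
    return args.func(args)
```

In this sketch, `main(["profile", "-m", "my_model"])` dispatches to the profile handler, while `main(["-m", "my_model"])` exits with a usage error because no subcommand was given.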