triton-inference-server / model_analyzer

Triton Model Analyzer is a CLI tool that helps you understand the compute and memory requirements of models served by Triton Inference Server.
Apache License 2.0

Invalid genai-perf command line for LLM model type #935

Open vlad-vinogradov-47 opened 1 month ago

vlad-vinogradov-47 commented 1 month ago

Hi!

I'm running model_analyzer from nvcr.io/nvidia/tritonserver:24.08-py3-sdk docker container for my model with LLM model type. It fails with the following error message:

Command: 
genai-perf -m my_model -- -b 1 -u server:8001 -i grpc -f my_model-results.csv --verbose-csv --concurrency-range 64 --measurement-mode count_windows --collect-metrics --metrics-url http://server:8002 --metrics-interval 1000

Error: 
2024-10-01 10:42 [INFO] genai_perf.parser:803 - Detected passthrough args: ['-b', '1', '-u', 'server:8001', '-i', 'grpc', '-f', 'my_model-results.csv', '--verbose-csv', '--concurrency-range', '64', '--measurement-mode', 'count_windows', '--collect-metrics', '--metrics-url', 'http://server:8002', '--metrics-interval', '1000']
usage: genai-perf [-h] [--version] {compare,profile} ...
genai-perf: error: argument subcommand: invalid choice: 'my_model' (choose from 'compare', 'profile')

It looks like the genai-perf command line created by model_analyzer is missing the required subcommand (genai-perf profile ...).

nv-braf commented 1 month ago

Yes, it seems that genai-perf has changed its CLI and now requires the profile subcommand. Can you please try adding profile to line 328 of perf_analyzer.py? It should now look like:

 cmd = ["genai-perf", "profile", "-m", self._config.models_name()]

Please let me know if that solves the issue for you. Thanks.
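For anyone hitting this, here is a minimal sketch of why the subcommand must be its own list element when the command is built as an argv list (as in the patch above). Note `build_genai_perf_cmd` is a hypothetical helper for illustration, not actual model_analyzer code:

```python
import shlex

def build_genai_perf_cmd(model_name, extra_args=()):
    # Hypothetical helper: each argv token must be a separate list item.
    # A single "profile -m" string would reach genai-perf as one literal
    # argument ("profile -m"), which its argument parser would reject.
    return ["genai-perf", "profile", "-m", model_name, *extra_args]

cmd = build_genai_perf_cmd("my_model", ["--concurrency-range", "64"])
print(shlex.join(cmd))
# The list form is what subprocess.run(cmd) expects; shlex.join is only
# used here to show the equivalent shell command line.
```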