
Support custom artifacts directory and improve default artifacts directory #636

Closed by nv-hwoo 2 months ago

nv-hwoo commented 2 months ago
  1. Allow custom artifacts path

Users can now set their own artifacts path:

$ genai-perf ... --artifact-dir custom/path ...

custom
└── path
    ├── all_data.gzip
    ├── llm_inputs.json
    ├── plots
    │   └── ...
    ├── profile_export_genai_perf.csv
    └── profile_export.json
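
For reference, here is a minimal sketch of how such a flag could be declared with Python's argparse. The option name --artifact-dir matches the command above, but the parser wiring and the assumed default of artifacts are illustrative only, not the actual genai-perf source:

import argparse
from pathlib import Path

parser = argparse.ArgumentParser(prog="genai-perf")
# Hypothetical wiring: lets users redirect all output artifacts.
parser.add_argument(
    "--artifact-dir",
    type=Path,
    default=Path("artifacts"),  # assumed default; see the "After" layout below
    help="Directory in which to store all output artifacts.",
)

args = parser.parse_args(["--artifact-dir", "custom/path"])
args.artifact_dir.mkdir(parents=True, exist_ok=True)  # creates custom/path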
  2. Improve default artifacts directory

Before

artifacts
├── data
│   ├── all_data.gzip
│   ├── input_tokens_vs_generated_tokens.gzip
│   ├── llm_inputs.json
│   ├── profile_export_genai_perf.csv
│   ├── profile_export.json
│   ├── request_latency.gzip
│   ├── time_to_first_token.gzip
│   ├── token_to_token_vs_output_position.gzip
│   └── ttft_vs_input_tokens.gzip
└── plots
    └── ...

This was a problem because artifacts would be overwritten even when the load mode (or load level), endpoint type, or service kind changed.

After

artifacts
└── gpt2-openai-chat-concurrency1  // {model}-{service_kind}-{backend/endpoint-type}-{load mode}{load level}
    ├── all_data.gzip
    ├── llm_inputs.json
    ├── plots
    │   └── ...
    ├── profile_export_genai_perf.csv
    └── profile_export.json
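
A sketch of how the default subdirectory name could be derived from the run options, following the {model}-{service_kind}-{backend/endpoint-type}-{load mode}{load level} pattern shown above; the function and parameter names here are hypothetical:

from pathlib import Path

def default_artifact_dir(model: str, service_kind: str,
                         endpoint_type: str, load_mode: str,
                         load_level: int) -> Path:
    # Encode the run configuration in the directory name so that runs
    # with different settings no longer overwrite each other's artifacts.
    name = f"{model}-{service_kind}-{endpoint_type}-{load_mode}{load_level}"
    return Path("artifacts") / name

# e.g. artifacts/gpt2-openai-chat-concurrency1
print(default_artifact_dir("gpt2", "openai", "chat", "concurrency", 1))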