triton-inference-server / client

Triton Python, C++ and Java client libraries, and gRPC-generated client examples for Go, Java and Scala.
BSD 3-Clause "New" or "Revised" License

MultiLoRA Support #662

Closed: IzzyPutterman closed 1 month ago

IzzyPutterman commented 1 month ago

Adds support for profiling multiple LoRA models. The model parameter (-m) now takes a list of model names instead of a single name.

Previous usage: genai-perf -m model_A ..

New usage: genai-perf -m model_A model_B ..
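For readers unfamiliar with list-valued flags, here is a minimal sketch of how a CLI option can accept one or more values using Python's standard argparse with nargs="+". It is illustrative only and is not taken from the genai-perf source; the parser name genai-perf-sketch and the helpers build_parser and parse_models are hypothetical.

```python
import argparse
from typing import List, Sequence


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; not the actual genai-perf argument parser.
    parser = argparse.ArgumentParser(prog="genai-perf-sketch")
    parser.add_argument(
        "-m",
        "--model",
        nargs="+",       # one or more values are collected into a list
        required=True,
        help="Name(s) of the model(s) to profile.",
    )
    return parser


def parse_models(argv: Sequence[str]) -> List[str]:
    # Returns the model names as a list, even when only one is given.
    return build_parser().parse_args(argv).model


if __name__ == "__main__":
    print(parse_models(["-m", "model_A"]))
    print(parse_models(["-m", "model_A", "model_B"]))
```

With nargs="+", the old single-model invocation still parses, so existing command lines keep working while the value is always handled as a list downstream.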