WukLab / preble

Stateful LLM Serving
Apache License 2.0

Add mixed workloads #60

Closed dongmingli-Ben closed 7 months ago

dongmingli-Ben commented 7 months ago

This PR adds experiments for mixed workloads. The config for mixed workloads is multi_node/benchmarks/multi_exp_configs/e2e_mix_config.py. It supports mixing one or more workloads together at the same request rate.
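For readers unfamiliar with the idea, here is a minimal sketch of what mixing workloads at a single request rate can look like. It is not the contents of e2e_mix_config.py: the function `mix_workloads`, its parameters, and the workload names are hypothetical, and it assumes that "the same request rate" means one shared Poisson arrival process over the combined request pool.

```python
import random
from typing import Dict, List, Tuple


def mix_workloads(
    workloads: Dict[str, List[str]],
    request_rate: float,
    seed: int = 0,
) -> List[Tuple[float, str, str]]:
    """Shuffle prompts from several workloads into one stream and assign
    arrival times drawn from a single shared request rate (requests/second).

    Hypothetical helper for illustration only; not part of the preble codebase.
    Returns (arrival_time_seconds, workload_name, prompt) tuples in send order.
    """
    if request_rate <= 0:
        raise ValueError("request_rate must be positive")

    rng = random.Random(seed)

    # Tag every prompt with the workload it came from, then shuffle so that
    # requests from different workloads are interleaved.
    tagged = [(name, prompt) for name, prompts in workloads.items() for prompt in prompts]
    rng.shuffle(tagged)

    schedule: List[Tuple[float, str, str]] = []
    now = 0.0
    for name, prompt in tagged:
        # Exponential inter-arrival gaps yield a Poisson process whose
        # mean rate is request_rate.
        now += rng.expovariate(request_rate)
        schedule.append((now, name, prompt))
    return schedule


if __name__ == "__main__":
    demo = mix_workloads(
        {"tool_use": ["t0", "t1"], "chat": ["c0", "c1", "c2"]},
        request_rate=4.0,
    )
    for arrival, name, prompt in demo:
        print(f"{arrival:.3f}s  {name:<8}  {prompt}")
```

Passing a single workload degenerates to an ordinary fixed-rate run, which is one way to read "one or more workloads" above.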

dongmingli-Ben commented 7 months ago

Add small fixes to shut down the RequestRateManager gracefully.