ray-project / ray-llm

RayLLM - LLMs on Ray
https://aviary.anyscale.com
Apache License 2.0

Is there a way to increase the scale-up speed? #99

Open rifkybujana opened 9 months ago

rifkybujana commented 9 months ago

As the title suggests, and from what I've experienced, vLLM is slower than TGI in terms of model loading. Is there a way to optimize it? Right now it takes 1-2 minutes to scale up instances on AWS G5 instances.
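
One common way to hide slow model loads is to keep warm replicas around rather than scaling from zero. Below is a minimal sketch (not RayLLM's actual configuration) using Ray Serve's `autoscaling_config`; the deployment class name and GPU settings are illustrative assumptions:

```python
# Sketch: keep at least one warm replica so requests rarely wait on a cold
# start. The expensive model load happens once per replica in __init__.
from ray import serve

@serve.deployment(
    autoscaling_config={
        "min_replicas": 1,      # never scale to zero; avoids cold starts
        "initial_replicas": 1,
        "max_replicas": 4,
        "upscale_delay_s": 0,   # react to load spikes immediately
    },
    ray_actor_options={"num_gpus": 1},  # e.g. one A10G on an AWS g5 instance
)
class LlamaDeployment:  # hypothetical deployment name
    def __init__(self):
        # The slow step discussed in this issue (vLLM model loading)
        # would happen here, once per new replica.
        ...

    async def __call__(self, request):
        ...
```

This doesn't make any single load faster, but it moves loading off the request path, so scale-up latency is only paid when adding replicas beyond the warm minimum.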

akshay-anyscale commented 8 months ago

What models are you using?

rifkybujana commented 8 months ago

Llama 7B quantized with AWQ
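
For reference, loading an AWQ-quantized Llama 7B directly with vLLM's offline API looks roughly like the sketch below; the checkpoint id is illustrative, and the slow step this issue describes is the `LLM(...)` construction, where weights are downloaded and moved onto the GPU:

```python
# Sketch: load an AWQ-quantized Llama 7B with vLLM and run one generation.
from vllm import LLM, SamplingParams

llm = LLM(
    model="TheBloke/Llama-2-7B-AWQ",  # illustrative AWQ checkpoint
    quantization="awq",               # tell vLLM the weights are AWQ-quantized
)
out = llm.generate(["Hello"], SamplingParams(max_tokens=16))
print(out[0].outputs[0].text)
```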