Closed by yigediaosi 9 months ago
Yes, it's absolutely possible. You just need to point each worker at the controller's address and port. I do this on my clusters.
In fact, I even use Slurm to manage them, so I don't care which compute node each model runs on.
You can check the slurm files on my fork of FastChat, https://github.com/HelmholtzAI-FZJ/FastChat/tree/main
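For context, here is a minimal sketch of the commands involved. The hostnames (`node-0`, `node-1`), ports, and model path are placeholders, and the flag names are taken from FastChat's serve CLI as I understand it; check `--help` on each module for your version.

```shell
# On the controller machine (placeholder hostname: node-0),
# start the FastChat controller that workers will register with.
python3 -m fastchat.serve.controller --host 0.0.0.0 --port 21001

# On each compute node (e.g. node-1), start a model worker.
# --controller-address points at the controller machine;
# --worker-address must be a URL the controller can reach back on,
# so use the node's own hostname or IP, not localhost.
python3 -m fastchat.serve.model_worker \
    --model-path lmsys/vicuna-7b-v1.5 \
    --controller-address http://node-0:21001 \
    --host 0.0.0.0 --port 21002 \
    --worker-address http://node-1:21002

# Optionally, expose all registered models through a single
# OpenAI-compatible API server on any machine:
python3 -m fastchat.serve.openai_api_server \
    --controller-address http://node-0:21001 \
    --host 0.0.0.0 --port 8000
```

With this layout, adding another model is just launching one more `model_worker` on whichever node has free capacity; the controller routes requests by model name.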
Thank you for your reply.
I want to deploy more than 10 models for users. If they were all deployed on a single machine, I would be concerned about low inference efficiency, and besides, each of our machines has limited space.