Closed. linnlh closed this pull request 1 week ago.
@ChenYi015 @cheyang @Syulin7
@linnlh Please run the following commands to download the Go modules into the vendor directory:
go mod tidy
go mod vendor
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: cheyang
Purpose of this PR
This PR introduces a new serving type called distributed to Arena's serving module. The primary motivation is to enable the deployment of large-scale models across multiple nodes within a Kubernetes (K8s) cluster.
Proposed changes:
- [...] distributed to Arena's serving module, which can deploy a model across multiple nodes.
- [...] distributed serving type.
Which issue(s) this PR fixes: Fixes #1186
Change Category
Rationale
The distributed serving type addresses the increasing demand for multi-host inference driven by large language models (LLMs) such as Meta's Llama-3.1-405B. Currently, Arena cannot deploy a model distributed across multiple nodes, and this PR aims to fill that gap.