WukLab / preble

Stateful LLM Serving
Apache License 2.0

Is the router necessarily a centralized single node? #76

Open SpecialYang opened 3 weeks ago

SpecialYang commented 3 weeks ago

This question is about the design doc at https://docs.google.com/document/d/1cCqK3dh7ZR_rUPkcZT2cr0kLnAxv6_Sd-P1q37-3RNQ/edit?tab=t.0.

Is the router necessarily a centralized single node?

If not, how can multiple replicas of the router maintain consistent queues and approx trees?

vikranth22446 commented 7 hours ago

I'm not working on that specific doc; it is maintained by the sglang team. That said, you can probably use standard distributed-systems techniques to avoid relying on a single centralized router node.

I also think that, with an efficient implementation, a single router node can probably scale quite well.
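
For the multi-replica case asked about above, one generic technique is to shard requests by prompt prefix, so that each router replica owns a disjoint slice of the prefix space and keeps its own queue and tree without cross-replica synchronization. The sketch below is only an illustration under that assumption. It is not taken from Preble or sglang, and `PrefixShardedRouters`, `owner`, `vnodes`, `prefix_tokens`, and the replica names are all hypothetical.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Stable 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class PrefixShardedRouters:
    """Hypothetical helper: maps a prompt prefix to one router replica
    using a consistent-hash ring with virtual nodes."""

    def __init__(self, replicas, vnodes=64, prefix_tokens=8):
        if not replicas:
            raise ValueError("at least one router replica is required")
        self.prefix_tokens = prefix_tokens
        # Each replica is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (_hash(f"{name}#{i}"), name) for name in replicas for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def owner(self, token_ids):
        """Return the replica responsible for this request's leading tokens."""
        key = ",".join(str(t) for t in token_ids[: self.prefix_tokens])
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


routers = PrefixShardedRouters(["router-a", "router-b", "router-c"])
shared_prefix = [11, 12, 13, 14, 15, 16, 17, 18]
# Requests that share their first 8 tokens always land on the same replica.
same_owner = routers.owner(shared_prefix + [100]) == routers.owner(shared_prefix + [999])
```

The trade-off in this sketch is that load balancing becomes per shard rather than global: a very popular prefix keeps hitting one replica, and each replica only sees the queue state for the prefixes it owns.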