Closed andakai closed 3 months ago
Yes, I am trying to do dataparallel using vLLM based on your project. I have read the guide, but still feel confused about how to do this. I wonder if there is any specific doc or example?
You can see this test case (test_parallel_llm) for some example code:
That test case runs two copies of an LLM (one per GPU on a 2-GPU machine) using Hugging Face Transformers.
If you want to use VLLM instead of HFTransformers, it's as simple as:
from datadreamer.llms import VLLM, ParallelLLM

# One vLLM instance per GPU
llm_1 = VLLM("gpt2", device=0)
llm_2 = VLLM("gpt2", device=1)

# ParallelLLM balances work across the two instances
parallel_llm = ParallelLLM(llm_1, llm_2)
Wow, this is so easy to use. It helps me a lot. Thanks for your fantastic work.
No problem, let me know if you need any other help!
I assume you're talking about this: https://github.com/vllm-project/vllm/issues/1237#issuecomment-2017239455
But yes, it does balance the work between multiple instances of the vLLM model.
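To make the balancing idea concrete, here is a minimal, self-contained sketch of splitting one batch of prompts across multiple model instances and re-interleaving the results in the original order. This is only an illustration of the general pattern, not DataDreamer's actual implementation; `fake_generate` and `parallel_run` are hypothetical names standing in for per-GPU generation.

```python
# Illustrative sketch of balancing a batch of prompts across multiple
# model instances (NOT DataDreamer's actual implementation).
from concurrent.futures import ThreadPoolExecutor

def fake_generate(worker_id, prompts):
    # Stand-in for a per-GPU model's generate() call.
    return [f"worker{worker_id}:{p}" for p in prompts]

def parallel_run(prompts, n_workers=2):
    # Deal the prompts round-robin into one chunk per worker.
    chunks = [prompts[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(lambda wc: fake_generate(*wc), enumerate(chunks))
    # Re-interleave so outputs line up with the original prompt order.
    outputs = [None] * len(prompts)
    for w, chunk_out in enumerate(results):
        for j, out in enumerate(chunk_out):
            outputs[w + j * n_workers] = out
    return outputs

print(parallel_run(["a", "b", "c", "d", "e"]))
# → ['worker0:a', 'worker1:b', 'worker0:c', 'worker1:d', 'worker0:e']
```

Each worker gets roughly half the prompts and runs concurrently, which is the same effect you get from `ParallelLLM` with two `VLLM` instances on separate GPUs.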