Open shaoxuefeng opened 1 year ago
I have started with a hostfile on a single node (my machine has 2 GPUs, but I can only deploy on one GPU). Configs: tensor_parallel: 1, deploy_rank: 0; the other params are the same as yours. My hostfile's content: 127.0.0.1 slots=2
Also, by the way, you need to set up passwordless SSH login from the node to itself. What I still want to know is how to deploy on one node with multiple GPUs.
I found some details in the source script.
It sets your "enable_load_balancing" to True. Then, in server.py:
It writes a temp hostfile based on your "replica_num" instead of using your own hostfile. You can comment out lines 5 and 6 and rewrite line 17 to use mii_configs.hostfile. Also, "tensor_parallel" must be equal to the length of the "deploy_rank" parameter in mii_configs. Hope that helps.
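The two constraints mentioned above (hostfile lines of the form `host slots=N`, and "tensor_parallel" matching the length of "deploy_rank") can be checked before deploying. This is only a rough sketch: `parse_hostfile` and `check_config` are hypothetical helper names, not part of MII or DeepSpeed, and the check that every rank must be smaller than the total slot count is my own assumption about why the "No slot" error shows up.

```python
import re


def parse_hostfile(text):
    """Parse hostfile lines such as '127.0.0.1 slots=2' into {host: slots}."""
    slots = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = re.fullmatch(r"(\S+)\s+slots=(\d+)", line)
        if match is None:
            raise ValueError(f"bad hostfile line: {line!r}")
        slots[match.group(1)] = int(match.group(2))
    return slots


def check_config(hostfile_text, tensor_parallel, deploy_rank):
    """Return a list of problems found; an empty list means the values agree."""
    # deploy_rank may be a single int or a list of ints.
    ranks = [deploy_rank] if isinstance(deploy_rank, int) else list(deploy_rank)
    problems = []
    if tensor_parallel != len(ranks):
        problems.append(
            f"tensor_parallel={tensor_parallel} but deploy_rank has {len(ranks)} entries"
        )
    total_slots = sum(parse_hostfile(hostfile_text).values())
    for rank in ranks:
        # Assumption: each rank needs a matching slot in the hostfile.
        if rank >= total_slots:
            problems.append(f"rank {rank} has no slot (hostfile has {total_slots})")
    return problems


if __name__ == "__main__":
    print(check_config("127.0.0.1 slots=2", tensor_parallel=1, deploy_rank=0))
    print(check_config("127.0.0.1 slots=2", tensor_parallel=2, deploy_rank=[0, 1]))
```

If the returned list is non-empty, fix the hostfile or the mii_configs values before calling deploy.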
Following the README, I would like to deploy a RESTful API on one node,
but I got this error:
ValueError: No slot '1' specified on host 'localhost'
The deploy Python code: And the hostfile:
According to the DeepSpeed issue, it seems we can't start with a hostfile on a single node. I even updated the deepspeed package to the latest master version, but it still does not work.
So, how can I start a RESTful API with DeepSpeed-MII on one node? Thank you!