Closed by MOSHIIUR 1 month ago
Hi, do you mean training the model with another multi-node framework, or without DeepSpeed at all, even on a single node?
Without DeepSpeed at all, even on a single node.
For now, we've only tried training the model with DeepSpeed. If you prefer to use other frameworks or vanilla PyTorch, you may explore other LLaVA implementations without DeepSpeed, like https://github.com/alibaba/Pai-Megatron-Patch, which should be compatible with CuMo.
Hi, you have mentioned how we can use DeepSpeed multi-node training to train the model on multiple nodes. I was curious whether we can train the model without DeepSpeed at all?
https://github.com/SHI-Labs/CuMo/blob/main/docs/getting_started.md