Open Luodian opened 2 years ago
Hi, thanks for providing such a wonderful codebase.
I have seen and used the save & load for MoE on multiple GPUs, and I can now save the model on different ranks. But is there a way to convert the per-rank checkpoints into one model?
Say I trained an 8-expert MoE on 8 GPUs, and now I want to do next-stage inference on 1 GPU.
Would you consider providing an example of doing so? Or could you give some ideas on how to implement it myself?
A duplicate of request #177. We are going to add some utility functions to help with this conversion.
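In the meantime, here is a minimal sketch of the idea: load each rank's checkpoint, re-index the local expert ids to global ones, and keep a single copy of the shared (non-expert) parameters. The key pattern `moe.experts.<idx>.<param>`, the function name `merge_expert_shards`, and the `experts_per_rank` parameter are all assumptions for illustration; adapt them to the actual key layout in your checkpoints.

```python
# Hypothetical sketch: merge per-rank MoE expert shards into one state dict.
# Assumes each rank r saved a dict whose expert keys look like
# "moe.experts.<local_idx>.<param>"; your checkpoint layout may differ.
import re

def merge_expert_shards(shards, experts_per_rank=1):
    """shards: list of per-rank state dicts, indexed by rank."""
    merged = {}
    pat = re.compile(r"(.*experts\.)(\d+)(\..*)")
    for rank, state_dict in enumerate(shards):
        for key, tensor in state_dict.items():
            m = pat.match(key)
            if m:
                # Re-index the rank-local expert id to a global one.
                global_idx = rank * experts_per_rank + int(m.group(2))
                merged[f"{m.group(1)}{global_idx}{m.group(3)}"] = tensor
            else:
                # Shared parameters are replicated across ranks;
                # keep the first copy seen.
                merged.setdefault(key, tensor)
    return merged
```

With 8 ranks and one expert per rank, this yields keys `moe.experts.0.*` through `moe.experts.7.*` in a single state dict that a 1-GPU model with all 8 experts could load.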
Thanks! I think it's worth doing.