Open GeneZC opened 1 year ago
Hi,
Thanks for your interest in this repo!
Are there any experiments or posts showing that adding tensor parallelism or ZeRO would improve training performance?
Not really.
However, there are projects that combine pipeline parallelism with tensor parallelism for efficiency, such as Megatron. I believe this project offers a better solution, since it depends only on DeepSpeed and avoids Megatron's heavy dependencies.
As for pipeline parallelism with ZeRO, I have not seen any other project do this.
Would it be possible in this framework to combine the pipeline with tensor parallelism or ZeRO data parallelism?
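
For anyone following along, here is a rough illustration of what combining these schemes means for GPU layout. In the usual "3D" arrangement the GPU count factors as pipeline degree × data-parallel degree × tensor degree, and each rank gets one coordinate along each axis. This is only a sketch: the names `Coord`, `rank_to_coord`, and `data_parallel_group` are hypothetical, the ordering (tensor index varying fastest) is an assumption, and none of it is this repo's or DeepSpeed's API.

```python
from typing import List, NamedTuple


class Coord(NamedTuple):
    pipe: int    # pipeline stage index
    data: int    # data-parallel replica index
    tensor: int  # tensor-parallel shard index


def rank_to_coord(rank: int, pp: int, dp: int, tp: int) -> Coord:
    """Map a global rank to (pipe, data, tensor).

    Assumes the tensor index varies fastest, then data, then pipe.
    """
    world = pp * dp * tp
    if not 0 <= rank < world:
        raise ValueError(f"rank {rank} out of range for world size {world}")
    tensor = rank % tp
    data = (rank // tp) % dp
    pipe = rank // (tp * dp)
    return Coord(pipe, data, tensor)


def data_parallel_group(rank: int, pp: int, dp: int, tp: int) -> List[int]:
    """Ranks sharing this rank's pipeline stage and tensor shard.

    These ranks hold replicas of the same parameters, so this is the group
    over which gradients are averaged (or ZeRO state is partitioned).
    """
    c = rank_to_coord(rank, pp, dp, tp)
    return [c.pipe * dp * tp + d * tp + c.tensor for d in range(dp)]
```

With 8 GPUs and `pp = dp = tp = 2`, rank 5 sits on pipeline stage 1, data replica 0, tensor shard 1, and its data-parallel group is ranks 5 and 7.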