Open kabachuha opened 4 months ago
Not tested yet, as it needs training runs, but I think it can be helpful for you
ReBased repo: https://github.com/corl-team/rebased
Thanks for your PR, we will check it.
Be aware that you may run into problems if you try to compile it; see https://github.com/pytorch/pytorch/issues/121386
Also, Ring Attention is not the most efficient method for this model.
Ring Attention implementation by Lucidrains https://github.com/lucidrains/ring-attention-pytorch