lucidrains / CoLT5-attention

Implementation of the conditionally routed attention from the CoLT5 architecture, in PyTorch
MIT License
224 stars, 13 forks