Efficient Infinite Context Transformers with Infini-attention: PyTorch Implementation + QwenMoE Implementation + Training Script + 1M context keypass retrieval
missing import of apply_rotary_pos_emb #2
Open
winglian opened 6 months ago