dingo-actual / infini-transformer

PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" (https://arxiv.org/abs/2404.07143)
MIT License

fix sigma_k usage in update and delay it #1

Closed by amitportnoy 6 months ago