microsoft/torchscale
Foundation Architecture for (M)LLMs
https://aka.ms/GeneralAI
MIT License
How to use retention in RetNet for cross-attention? #101
Open, opened by yxchng 4 months ago