Jamie-Stirling / RetNet

An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"
MIT License

Chunkwise real #13

Closed: Jamie-Stirling closed this 11 months ago

Jamie-Stirling commented 11 months ago

Implemented chunkwise retention paradigm.
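
For readers unfamiliar with the term, here is a minimal sketch of what a chunkwise retention computation can look like. It is not the code from this repository. It assumes a single head, a real-valued scalar decay `gamma`, and plain Python lists, and it leaves out the positional rotation, scaling, and normalization described in the paper. The function names `recurrent_retention` and `chunkwise_retention` are hypothetical, chosen for illustration only.

```python
def recurrent_retention(Q, K, V, gamma):
    """Reference form: S_n = gamma * S_{n-1} + K_n^T V_n, o_n = Q_n S_n."""
    d_k, d_v = len(K[0]), len(V[0])
    S = [[0.0] * d_v for _ in range(d_k)]  # running state, shape (d_k, d_v)
    out = []
    for q, k, v in zip(Q, K, V):
        S = [[gamma * S[a][b] + k[a] * v[b] for b in range(d_v)]
             for a in range(d_k)]
        out.append([sum(q[a] * S[a][b] for a in range(d_k))
                    for b in range(d_v)])
    return out


def chunkwise_retention(Q, K, V, gamma, chunk_size):
    """Chunkwise form (assumed simplification): decay-masked retention
    inside each chunk, plus a recurrent state R carried between chunks."""
    d_k, d_v = len(K[0]), len(V[0])
    R = [[0.0] * d_v for _ in range(d_k)]  # state summarizing earlier chunks
    out = []
    for start in range(0, len(Q), chunk_size):
        q_c = Q[start:start + chunk_size]
        k_c = K[start:start + chunk_size]
        v_c = V[start:start + chunk_size]
        B = len(q_c)  # the last chunk may be shorter than chunk_size
        for j in range(B):
            row = [0.0] * d_v
            # Inner-chunk part: causal scores weighted by gamma^(j - m).
            for m in range(j + 1):
                score = gamma ** (j - m) * sum(
                    q_c[j][a] * k_c[m][a] for a in range(d_k))
                for b in range(d_v):
                    row[b] += score * v_c[m][b]
            # Cross-chunk part: read the carried state, decayed by gamma^(j + 1).
            decay = gamma ** (j + 1)
            for b in range(d_v):
                row[b] += decay * sum(q_c[j][a] * R[a][b] for a in range(d_k))
            out.append(row)
        # Update the state: decay the old state over the whole chunk and add
        # each position's K^T V contribution, decayed to the chunk's end.
        R = [[gamma ** B * R[a][b]
              + sum(gamma ** (B - 1 - m) * k_c[m][a] * v_c[m][b]
                    for m in range(B))
              for b in range(d_v)]
             for a in range(d_k)]
    return out
```

Under these assumptions, `chunkwise_retention(Q, K, V, gamma, chunk_size)` should agree with `recurrent_retention(Q, K, V, gamma)` up to floating-point error for any `chunk_size >= 1`, where `Q`, `K`, and `V` are lists of per-position vectors. A chunk size of 1 reduces to the recurrent form, and a chunk size at least as long as the sequence reduces to the parallel form.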