fkodom/yet-another-retnet
A simple but robust PyTorch implementation of RetNet from "Retentive Network: A Successor to Transformer for Large Language Models" (https://arxiv.org/pdf/2307.08621.pdf)
MIT License
100 stars, 15 forks
Bug fix: decay mask for bf16, bf32
#25
Closed
fkodom closed 12 months ago