lucidrains/rela-transformer
Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012
MIT License
49 stars
7 forks
issues
LayerNorm/GatedRMS inconsistency
#1, opened by inspirit 2 years ago
6