sustcsonglin/flash-linear-attention
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
MIT License
[DRAFT] Beta gradient does not match #43
Closed
hypnopump closed this 1 month ago
hypnopump commented 1 month ago
[x] Modify the test to cover a vector-valued beta
[ ] The gradient with respect to beta does not match. Fix the bug?
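
One way to pin down a mismatch like this is to compare the gradient with respect to beta from two implementations on the same inputs, in float64, with a fixed upstream gradient. The sketch below is only an illustration: `toy_reference` and `toy_candidate` are hypothetical stand-ins rather than functions from this repository, and the delta-rule-style update with one beta value per time step is an assumption about what "beta vector" means here.

```python
import torch


def toy_reference(q, k, v, beta):
    # Hypothetical slow reference: explicit recurrence over time steps.
    # Assumed shapes: q, k are [T, D_k]; v is [T, D_v]; beta is [T].
    state = torch.zeros(k.shape[-1], v.shape[-1], dtype=q.dtype)
    out = []
    for t in range(q.shape[0]):
        pred = k[t] @ state  # current prediction for v[t], shape [D_v]
        state = state + beta[t] * torch.outer(k[t], v[t] - pred)
        out.append(q[t] @ state)
    return torch.stack(out)


def toy_candidate(q, k, v, beta):
    # Hypothetical second implementation: the same recurrence written as
    # state = (I - beta_t k_t k_t^T) state + beta_t k_t v_t^T.
    eye = torch.eye(k.shape[-1], dtype=q.dtype)
    state = torch.zeros(k.shape[-1], v.shape[-1], dtype=q.dtype)
    out = []
    for t in range(q.shape[0]):
        decay = eye - beta[t] * torch.outer(k[t], k[t])
        state = decay @ state + beta[t] * torch.outer(k[t], v[t])
        out.append(q[t] @ state)
    return torch.stack(out)


def beta_grad(fn, q, k, v, beta, grad_out):
    # Gradient of fn's output w.r.t. beta for a fixed upstream gradient.
    beta = beta.detach().clone().requires_grad_(True)
    out = fn(q, k, v, beta)
    (grad,) = torch.autograd.grad(out, beta, grad_out)
    return grad


torch.manual_seed(0)
T, D_k, D_v = 6, 4, 3
q = torch.randn(T, D_k, dtype=torch.float64)
k = torch.nn.functional.normalize(torch.randn(T, D_k, dtype=torch.float64), dim=-1)
v = torch.randn(T, D_v, dtype=torch.float64)
beta = torch.rand(T, dtype=torch.float64)  # one beta per time step
grad_out = torch.randn(T, D_v, dtype=torch.float64)

g_ref = beta_grad(toy_reference, q, k, v, beta, grad_out)
g_new = beta_grad(toy_candidate, q, k, v, beta, grad_out)

# Report the largest elementwise gap so a failure shows how far off it is.
max_err = (g_ref - g_new).abs().max().item()
grads_match = torch.allclose(g_ref, g_new, rtol=1e-8, atol=1e-10)
print(f"max abs error in d/dbeta: {max_err:.3e}, match: {grads_match}")
```

Running the comparison in float64 first helps separate a genuine backward bug from ordinary low-precision noise. If the gradients agree there but diverge in half precision, the tolerance is the more likely culprit than the kernel.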