Jamie-Stirling/RetNet
An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"
MIT License
1.14k stars, 98 forks
issues
Is RetNet equivalent to an ordinary GPT when the decay is set to 1?
#37
xuanyaoming
opened
5 months ago
3
Dimensions of forward_recurrent
#36
Qiu30
closed
7 months ago
5
a question about xpos and D of decay mat
#35
DavideHe
opened
7 months ago
2
Confusion about "the chunkwise recurrent representation of retention"
#34
CHENHUI-X
opened
8 months ago
0
Can this mechanism be applied to point cloud data?
#33
madjid-dx
opened
8 months ago
0
NO LM HEAD
#32
shnuhw
closed
8 months ago
2
Fix dimension mismatch when hidden size is odd
#31
ilunye
closed
8 months ago
0
How to predict using this net?
#30
GodPCWANG
opened
9 months ago
0
Faster implementation of MultiScaleRetention, adds dependency on einops
#29
draguve
closed
9 months ago
1
The complex theta should cancel out
#28
albertbuchard
opened
9 months ago
0
/src/retnet.py GPU
#27
Qiu30
closed
9 months ago
2
Fix math problem in gamma calculation
#26
Jun-depo
closed
6 months ago
1
Assistance on training a new retention network model ?
#25
risedangel
opened
9 months ago
0
Passing Attention Masks
#24
leffff
opened
9 months ago
3
Update retention.py
#23
leffff
closed
9 months ago
1
Q, k and D device difference
#22
leffff
closed
9 months ago
1
Proposed improvement/collaboration: removing the O(T^2) training cost
#21
jackd
closed
9 months ago
2
Fixed typo
#20
EgoVeroConsisto
opened
10 months ago
0
can retnet be applied in point cloud tasks?
#19
huiyang0613
opened
10 months ago
0
Changelog of official implementation
#18
donglixp
opened
11 months ago
2
what about cross-attention
#17
aki819
opened
11 months ago
0
Error when the model is running on GPU
#16
SSamDav
opened
11 months ago
1
Update src/complex/retention.py
#15
MichaelFu1998-create
closed
11 months ago
1
demo example / number of parameter control vs original code
#14
thegodone
opened
11 months ago
4
Chunkwise real
#13
Jamie-Stirling
closed
11 months ago
0
Can you make this repo in available for package installers (pip)?
#12
gaasher
opened
11 months ago
0
RetNet Officially Released
#11
tiendung
closed
11 months ago
1
Chunkwise retention giving different output
#10
Jamie-Stirling
closed
11 months ago
4
Minor docs fix
#9
Regenhardt
closed
11 months ago
0
_get_D function very slow for long sequence
#7
ZuowenWang0000
closed
11 months ago
1
Real-valued implementation using xPos
#6
Jamie-Stirling
closed
11 months ago
0
Real-valued implementation using xPos
#5
Jamie-Stirling
closed
11 months ago
0
Training is slow and some errors (perhaps)
#4
Zth9730
closed
11 months ago
6
Initial effort to add chunkwise retention paradigm
#3
Aaryanverma
closed
11 months ago
1
Some Questions about Attention Mask
#2
tang-ed
closed
11 months ago
3
About the complex
#1
KohakuBlueleaf
closed
11 months ago
4