lucidrains / nGPT-pytorch

Quick implementation of nGPT, learning entirely on the hypersphere, from NvidiaAI

MIT License · 201 stars · 10 forks
Issues (newest first)
#9 · shouldn't embedding be normalized along embed dimension? · jfpuget · opened 6 hours ago · 1 comment
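The question in issue #9 can be illustrated with a minimal sketch (hypothetical, not the repo's actual code): l2-normalizing an embedding table of shape (num_tokens, dim) along the embedding dimension places each token vector on the unit hypersphere, which is the property nGPT relies on. The `l2norm` helper and the shapes below are assumptions for illustration.

```python
import numpy as np

def l2norm(x, axis=-1, eps=1e-12):
    # divide each vector by its l2 norm taken along the embedding axis
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

rng = np.random.default_rng(0)
embed = rng.normal(size=(10, 4))   # (num_tokens, dim) embedding table
unit_embed = l2norm(embed)         # normalize along the embed dimension

# every row (token vector) now has unit l2 norm
print(np.allclose(np.linalg.norm(unit_embed, axis=-1), 1.0))  # True
```

Normalizing along any other axis (e.g. across tokens) would not give unit-norm token vectors, which is presumably the distinction the issue is asking about.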
#8 · None · ghgmail · closed 2 days ago · 0 comments

#7 · why the initial value of alpha_init is 1/layer_num · ghgmail · closed 4 days ago · 2 comments
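For issue #7: the nGPT paper updates the hidden state by interpolating toward each block's output, roughly h ← Norm(h + α(h_block − h)). A plausible intuition for α = 1/num_layers is that each of the num_layers steps then moves h only a depth-proportional fraction, so the total movement across the network stays bounded. The sketch below is hypothetical (its names and the stand-in block outputs are assumptions, not the repo's code):

```python
import numpy as np

def l2norm(x, axis=-1, eps=1e-12):
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

num_layers = 8
alpha = 1.0 / num_layers            # the 1/layer_num initialisation in question

rng = np.random.default_rng(0)
h = l2norm(rng.normal(size=4))      # hidden state on the unit hypersphere

for _ in range(num_layers):
    block_out = l2norm(rng.normal(size=4))  # stand-in for an attention/MLP block output
    # small interpolation step toward the block output, then back onto the sphere
    h = l2norm(h + alpha * (block_out - h))

print(np.isclose(np.linalg.norm(h), 1.0))  # True
```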
#6 · What is the difference between nGPT.py and nTransformers.py · ghgmail · closed 4 days ago · 1 comment

#5 · Hypersphere assets gallery · MeDott29 · closed 5 days ago · 0 comments

#4 · differences between nGPT and nTransformers · rotem154154 · closed 5 days ago · 10 comments

#3 · Rotary embedding exclusive for each Attention layer? · inspirit · opened 1 week ago · 3 comments

#2 · l2norm not used · zxytim · closed 1 week ago · 1 comment

#1 · Parametrize · faresobeid · closed 2 weeks ago · 31 comments