kyegomez/AttentionIsOFFByOne
Implementation of "Attention Is Off By One" by Evan Miller
MIT License · 179 stars · 9 forks
Issues (newest first)
#7 · docs(README): fix equation formatting · YodaEmbedding · closed · 1 year ago · 1 comment
#6 · Is there any evidence that softmax one has advantages over normal softmax? · ZGCTroy · opened · 1 year ago · 0 comments
#5 · IMPORTANT: The definition of softmax one is wrong · PhilIp-L-Good · opened · 1 year ago · 4 comments
#4 · Is there a test showing the effectiveness of softmax1 at removing outliers? · immars · opened · 1 year ago · 0 comments
#3 · If you want to make it fast, just use nn.softmax() and concatenate a zero · mcourteaux · opened · 1 year ago · 1 comment
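The trick suggested in #3 can be sketched as follows, assuming the paper's softmax1 definition softmax1(x)_i = exp(x_i) / (1 + Σ_j exp(x_j)): appending a zero logit and running an ordinary softmax yields exactly that denominator, since exp(0) = 1. The function name `softmax_one_via_pad` is chosen here for illustration, and NumPy stands in for the framework softmax the issue mentions.

```python
import numpy as np

def softmax_one_via_pad(x: np.ndarray) -> np.ndarray:
    """Compute softmax1(x)_i = exp(x_i) / (1 + sum_j exp(x_j))
    by appending a zero logit and reusing ordinary softmax.
    (Illustrative sketch; not the repo's implementation.)"""
    padded = np.concatenate([x, [0.0]])   # the extra slot contributes exp(0) = 1
    padded = padded - padded.max()        # standard softmax stability shift
    exps = np.exp(padded)
    probs = exps / exps.sum()
    return probs[:-1]                     # drop the zero-logit slot
```

Note the returned probabilities sum to less than 1, which is the point of softmax1: attention heads are allowed to attend to "nothing".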
#2 · How to solve the issue of overflow? · ZGCTroy · opened · 1 year ago · 0 comments
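One standard answer to the overflow question in #2 is a max-shift that also covers the implicit zero logit: with m = max(0, max_i x_i), the identity softmax1(x)_i = exp(x_i − m) / (exp(−m) + Σ_j exp(x_j − m)) keeps every exponent ≤ 0. A minimal NumPy sketch (the name `softmax_one_stable` is an assumption, not from the repo):

```python
import numpy as np

def softmax_one_stable(x: np.ndarray) -> np.ndarray:
    """Numerically stable softmax1: exp(x_i) / (1 + sum_j exp(x_j)).
    Shifting by m = max(0, max(x)) turns the "+1" into exp(-m) and
    bounds every exponent by 0, so np.exp cannot overflow."""
    m = max(0.0, float(np.max(x)))
    e = np.exp(x - m)
    return e / (np.exp(-m) + e.sum())
```

For large logits such as `[1000.0, 1000.0]` this returns finite probabilities near 0.5 each, where a naive `np.exp(x) / (1 + np.exp(x).sum())` would overflow to inf/nan.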
#1 · Discuss: do `softmax_one` and `zero_vector` in QuietAttention conflict? · Kahsolt · opened · 1 year ago · 1 comment