jadore801120 / attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
MIT License · 8.78k stars · 1.97k forks
Issues (sorted by: Newest)
#172 · "How to export the trained chkpt network to onnx?" · ZhangDongyuCN · closed · 3 years ago · 5 comments
#171 · "train big data(8G)" · JoeCoding · opened · 3 years ago · 0 comments
#170 · "Bump tensorflow from 1.14.0 to 2.4.0" · dependabot[bot] · closed · 3 years ago · 1 comment
#169 · "Can't find model 'en'" · manhph2211 · opened · 3 years ago · 2 comments
#168 · "Fix Two Potential Bugs, with Significant Accuracy Improvement" · huanghoujing · closed · 3 years ago · 2 comments
#167 · "Question About Attention Score Computation Process & Intuition" · rezhv · opened · 3 years ago · 0 comments
#166 · "why none pad mask is nedd" · helloworld729 · opened · 3 years ago · 1 comment
#165 · "what is meaning of trg_pad_idx in label smoothing loss?" · fakerhbj · opened · 3 years ago · 0 comments
#164 · "wrong with the code!!!!!" · chenrxi · closed · 3 years ago · 0 comments
#163 · "SyntaxError: invalid syntax" · junzew · closed · 3 years ago · 1 comment
#162 · "what does n_head, d_model, d_k, d_v stands for?" · seyeeet · closed · 3 years ago · 1 comment
#161 · "Update your codes" · thechvarun · closed · 3 years ago · 1 comment
#160 · "Why decoding is needed during inference ?" · rajeevbaalwan · opened · 4 years ago · 0 comments
#159 · "How does the gradients flow in cal_loss function in train.py?" · InhyeokYoo · closed · 4 years ago · 0 comments
#158 · "Resuming Training" · kaiyon07 · opened · 4 years ago · 5 comments
#157 · "Why the previous version train faster" · dwtenis · closed · 4 years ago · 1 comment
#156 · "raise ConnectionError(e, request=request)" · KrisLee512 · opened · 4 years ago · 1 comment
#155 · "To make position embedding be implemented by PyTorch, and to support …" · zipzou · closed · 1 week ago · 0 comments
#154 · "Surprising PPL on WMT 17" · luffycodes · opened · 4 years ago · 0 comments
#153 · "d_k not equal to d_k gives issues" · luffycodes · closed · 4 years ago · 0 comments
#152 · "PPL on wmt - 17" · luffycodes · opened · 4 years ago · 0 comments
#151 · "It seems that the layer norm and pos ffn are not consistent with the paper?" · zwlanpishu · closed · 4 years ago · 1 comment
#150 · "Fix LayerNorm." · tony2037 · closed · 4 years ago · 3 comments
#149 · "masking is not complete" · JianBingJuanDaCong · opened · 4 years ago · 1 comment
#148 · "the src_mask." · chenjun2hao · closed · 4 years ago · 1 comment
#147 · "Performance with default parameters looks completely off..." · JianBingJuanDaCong · opened · 4 years ago · 1 comment
#146 · "fix masking tensor" · MokkeMeguru · closed · 4 years ago · 1 comment
#145 · "Training on Custom Data" · kevaday · closed · 4 years ago · 1 comment
#144 · "n_position in positional encoding" · Tejaswini2612 · opened · 4 years ago · 1 comment
#143 · "Now the model depends on specific preprocessing method too much" · ylmeng · opened · 4 years ago · 1 comment
#142 · "About Layernorm" · BUCTwangkun · closed · 4 years ago · 2 comments
#141 · "slow and inaccurate" · xiaoshingshing · opened · 4 years ago · 2 comments
#140 · "TypeError: tuple indices must be integers or slices, not tuple when translating" · liperrino · opened · 4 years ago · 1 comment
#139 · "Preprocess error" · ZhichaoOuyang · opened · 4 years ago · 6 comments
#138 · "shared embedding factor bug" · kaituoxu · closed · 4 years ago · 2 comments
#137 · "why use matmul to instead of bmm?" · kaituoxu · closed · 4 years ago · 2 comments
#136 · "WMT14 en-de" · zhao1402072392 · opened · 4 years ago · 2 comments
#135 · "preprocess ERROR" · JingsenZhang · opened · 4 years ago · 4 comments
#134 · "About Position Embedding and mask" · Zessay · closed · 4 years ago · 4 comments
#133 · "Why bias=False in q, k, and v projection" · mertensu · opened · 4 years ago · 3 comments
#132 · "What I get from the default is very different from what you showed. Is it because of the code update?" · SmallSmallQiu · closed · 4 years ago · 6 comments
#131 · "Expected object of scalar type Bool but got scalar type Byte for argument #2 'other'" · SmallSmallQiu · closed · 4 years ago · 4 comments
#130 · "add performance result on IWSLT14 de-en dataset" · marvinzh · opened · 4 years ago · 0 comments
#129 · "update" · shaoxiaoyu · opened · 4 years ago · 0 comments
#128 · "AttributeError: 'Decoder' object has no attribute 'tgt_word_emb'" · HassanNaeemjutt · closed · 4 years ago · 1 comment
#127 · "new" · flwjt · closed · 5 years ago · 0 comments
#126 · "Error when training with -no_cuda" · wz337 · closed · 4 years ago · 1 comment
#125 · "Question about get_the_best_score_and_idx in Beam.py" · yudmoe · closed · 4 years ago · 1 comment
#124 · "RuntimeError: DataLoader worker (pid 26604) is killed by signal: Killed." · Ike-yang · closed · 4 years ago · 2 comments
#123 · "Where is the input to the decoder during training shifted by one?" · jonathanking · closed · 5 years ago · 1 comment
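Several of the issues above (#166, #149, #148, #146, #134) revolve around the two attention masks a Transformer combines: a pad mask over source/target tokens and a subsequent (look-ahead) mask for decoder self-attention. As a minimal sketch of that idea only — in pure Python, not the repository's tensor-based implementation, with an assumed padding index of 0:

```python
PAD_IDX = 0  # assumed padding index for this illustration


def get_pad_mask(seq, pad_idx=PAD_IDX):
    """True for real tokens, False for padding positions."""
    return [tok != pad_idx for tok in seq]


def get_subsequent_mask(seq_len):
    """Lower-triangular mask: position i may attend only to positions j <= i."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def combined_decoder_mask(seq, pad_idx=PAD_IDX):
    """Elementwise AND of pad mask and subsequent mask, the shape of mask
    used for decoder self-attention."""
    pad = get_pad_mask(seq, pad_idx)
    sub = get_subsequent_mask(len(seq))
    n = len(seq)
    return [[pad[j] and sub[i][j] for j in range(n)] for i in range(n)]
```

For example, `combined_decoder_mask([5, 7, 0])` lets position 0 attend only to itself and blanks out the padded third column in every row, which is why a pad mask is still needed even with the triangular mask present.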
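Issue #165 asks what `trg_pad_idx` means in a label-smoothed loss. The sketch below is illustrative only — it is not the repository's `cal_loss`, and it uses one common smoothing variant (spreading `eps` over the non-gold classes): smoothing replaces the one-hot target with a softened distribution, and positions whose gold label equals the pad index are skipped so padding contributes nothing to the loss.

```python
import math


def smoothed_distribution(gold, n_classes, eps=0.1):
    """Target distribution: 1 - eps on the gold class, eps shared over the rest."""
    off = eps / (n_classes - 1)
    return [1.0 - eps if c == gold else off for c in range(n_classes)]


def label_smoothing_loss(log_probs, gold_seq, trg_pad_idx, eps=0.1):
    """Mean cross-entropy against smoothed targets, skipping pad positions.

    log_probs: one list of per-class log-probabilities per target position.
    gold_seq:  gold class index per target position.
    """
    total, count = 0.0, 0
    for lp, gold in zip(log_probs, gold_seq):
        if gold == trg_pad_idx:  # this exclusion is what trg_pad_idx is for
            continue
        target = smoothed_distribution(gold, len(lp), eps)
        total += -sum(t * l for t, l in zip(target, lp))
        count += 1
    return total / count
```

With uniform log-probabilities over 3 classes and `gold_seq = [1, 0]` where 0 is the pad index, only the first position is counted and the loss reduces to `log(3)`.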