jadore801120 / attention-is-all-you-need-pytorch

A PyTorch implementation of the Transformer model in "Attention is All You Need".

MIT License

8.78k stars, 1.97k forks

issues

Problem running `python preprocess.py -lang_src de -lang_trg en -save_data multi30k_de_en.pkl -share_vocab`

#222 yyydfff opened 2 months ago
1
Create app.py

#221 Esmail-ibraheem opened 5 months ago
0
Create Embedding.py

#220 Esmail-ibraheem opened 5 months ago
0
Create transformer.py

#219 Esmail-ibraheem opened 5 months ago
0
When I run `python preprocess.py -lang_src de -lang_trg en -share_vocab -save_data m30k_deen_shr.pkl`, I run into a problem

#218 dapaolufuduizhang opened 8 months ago
1
About target mask

#217 KimRass opened 8 months ago
0
Error when executing the 'python preprocess.py -lang_src de -lang_trg en -share_vocab -save_data m30k_deen_shr.pkl' command

#216 evilczy opened 10 months ago
3
Error when installing from requirements.txt

#215 kmphuang opened 12 months ago
4
[feat] Added some comments to help with reading the code

#214 SS4G opened 1 year ago
0
why is masking performed again during the inference decoder stage?

#213 Akshay1-6180 opened 1 year ago
0
e

#212 Anton293 closed 1 year ago
0
Possible mistakes in d_k, d_v of MultiheadAttention

#211 SARIHUST opened 1 year ago
0
May I ask you a question about the "scale_emb_or_prj" parameter?

#210 aitch25 opened 1 year ago
0
Performance Confusion

#209 Zarca opened 1 year ago
0
Is there some way to convert .chkpt to other form model such as onnx?

#208 warren-wzw opened 1 year ago
0
Replace NumPy with PyTorch in PositionalEncoding

#207 ZYM66 opened 1 year ago
0
Bump tensorflow from 1.14.0 to 2.11.1

#206 dependabot[bot] opened 1 year ago
0
In patch_trg, I can't understand why you change the data shape like that

#205 kwanhoP opened 1 year ago
3
ValueError: Cell is empty

#204 Kznnd opened 1 year ago
4
Bump tensorflow from 1.14.0 to 2.9.3

#203 dependabot[bot] closed 1 year ago
1
preprocess error

#202 zhoup150344 opened 1 year ago
6
what's the intuition behind getting q, k and v from embedding

#201 ShouravBR closed 2 years ago
1
Bump tensorflow from 1.14.0 to 2.7.2

#200 dependabot[bot] closed 1 year ago
1
Bump tensorflow from 1.14.0 to 2.6.4

#199 dependabot[bot] closed 2 years ago
1
OverflowError

#198 Daming-TF opened 2 years ago
0
download dataset error

#197 qimg412 opened 2 years ago
4
Update requirements.txt

#196 yunhuang1241 closed 2 years ago
0
My question

#195 Messiz opened 2 years ago
2
TranslationDataset is now deprecated in torchtext

#194 imkzh opened 2 years ago
2
CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`

#193 lisp2047 opened 2 years ago
1
Bump tensorflow from 1.14.0 to 2.5.3

#192 dependabot[bot] closed 2 years ago
1
Beam search torch.log

#191 actforjason opened 2 years ago
0
@SeoroMin

#190 SeoroMin opened 2 years ago
1
removed a singleton-comparison pitfall

#189 NaelsonDouglas opened 2 years ago
0
learning rate update before optimizer.step()

#188 AlbertiPot closed 3 years ago
1
why PEpos+k can be represented as a linear function of PEpos?

#187 myrainbowandsky opened 3 years ago
0
Confusion regarding embedding space

#186 IamAdiSri opened 3 years ago
2
some problem

#185 zhLia opened 3 years ago
3
Bump tensorflow from 1.14.0 to 2.5.1

#184 dependabot[bot] closed 2 years ago
1
Could the variable names be made more readable?

#183 Xelawk closed 2 years ago
0
Attention value is strange

#182 YPatrickW opened 3 years ago
1
test error

#181 LonelyPlanetIoT opened 3 years ago
2
[fix]: remove-extra-arg

#180 yinchimaoliang opened 3 years ago
0
MultiHeadAttention input shape

#179 Superklez opened 3 years ago
0
Input sequence dimensions of MultiHeadAttention

#178 Superklez closed 3 years ago
1
Incorrect implementation?

#177 weilueluo closed 3 years ago
2
The results of the translate function.

#176 zshyang opened 3 years ago
4
If we change the code in Model.py like this, the convergence speed would be faster.

#175 Sry2016 opened 3 years ago
1
why dropping last example with patch_src&patch_trg function @train.py

#174 pluspluswu opened 3 years ago
1
How to deal with the UNK_TOKEN?

#173 lhy2749 opened 3 years ago
0