mit-han-lab / lite-transformer

[ICLR 2020] Lite Transformer with Long-Short Range Attention

https://arxiv.org/abs/2004.11886

Other

596 stars 81 forks

issues

Can't find the cnn branch

#43 gwyanCN closed 2 months ago
1
model pruning

#42 AIikai closed 2 months ago
1
wmt16_en_de dataset link

#41 topbookcc closed 2 months ago
2
About data!

#40 veryhigh closed 2 years ago
1
About padding!!!

#39 sanwei111 closed 2 years ago
2
about dynamicconv_cuda

#38 sanwei111 closed 2 years ago
1
about kernel size

#37 sanwei111 closed 2 years ago
1
TransformerEncoderLayer

#36 sanwei111 closed 2 months ago
5
about the global and local features in fig 3

#35 sanwei111 closed 2 months ago
3
In paragraph 4 of the paper

#34 sanwei111 closed 3 years ago
1
In paragraph 4 of

#33 sanwei111 closed 3 years ago
1
change torch.div to torch.floor_divide @line81

#32 realzza closed 3 years ago
0
How to measure the FLOPs/MACs?

#31 ranery closed 3 years ago
2
Cannot reproduce the paper's results when training the transformer from scratch.

#30 tomshalini closed 2 months ago
3
Error while testing the model

#29 tomshalini closed 3 years ago
8
Missing Data Preparation section for the CNN / DailyMail dataset

#28 cronopioelectronico closed 3 years ago
1
Fix path in CNNDM test

#27 cronopioelectronico closed 3 years ago
2
Please share your quantization, quantization+pruning checkpoints

#26 kishorepv closed 2 months ago
2
Error while evaluating model

#25 kishorepv closed 3 years ago
9
Export model to ONNX

#24 suyuzhang closed 2 months ago
1
Transformer model with different parameters

#23 ChuanyangZheng closed 3 years ago
3
Quantization

#22 zilunpeng closed 3 years ago
1
What functions are implemented in the cu code? The cu code is too hard for me to understand. Thank you.

#21 guotong1988 closed 3 years ago
1
Why did you rewrite the cpp code and cu code? Which functions are necessary?

#20 guotong1988 closed 3 years ago
1
Could you please point out the core code, as there is too much fairseq code? Thank you!

#19 guotong1988 closed 3 years ago
2
Will you release the TensorFlow code in the future?

#18 guotong1988 closed 3 years ago
1
CNN/DM dataset preprocessing (BPE 30K)

#17 Wangt-CN closed 3 years ago
1
wmt14 en-fr data processing problem

#16 macn3388 closed 3 years ago
3
Is there any link for downloading iwslt14.de-en pretrained model?

#15 macn3388 closed 3 years ago
1
Is there any guidance on preparing the cnndm dataset?

#14 macn3388 closed 3 years ago
1
Data preprocessing

#13 swgu98 closed 3 years ago
1
Model size confusion

#12 zml24 closed 4 years ago
1
Could you share the quantization and pruning scripts? Thank you very much!

#11 fansiawang closed 4 years ago
2
Fairseq cli fix redirect

#10 chenw23 closed 4 years ago
1
Remove duplicate entries of min-lr

#9 chenw23 closed 4 years ago
1
register_task of abstractive summarization

#8 zhangyongwei756 closed 4 years ago
1
DeprecationWarning

#7 ghost closed 4 years ago
1
Wrong key_padding_mask position?

#6 godweiyang closed 3 years ago
3
Could you share the TensorBoard log file? Thank you so much!

#5 luogen1996 closed 2 months ago
1
Model Compression

#4 kalyangvs closed 4 years ago
5
Applying factorized embedding

#3 asharma20 closed 4 years ago
4
training config for wikitext103

#2 pichuang1984 closed 4 years ago
4
Summarization checkpoint release?

#1 astariul closed 4 years ago
1