mit-han-lab/lite-transformer
[ICLR 2020] Lite Transformer with Long-Short Range Attention
https://arxiv.org/abs/2004.11886
Other
596 stars
81 forks
issues
Can't find the cnn branch
#43
gwyanCN
closed
2 months ago
1
model pruning
#42
AIikai
closed
2 months ago
1
wmt16_en_de dataset link
#41
topbookcc
closed
2 months ago
2
About data
#40
veryhigh
closed
2 years ago
1
About padding
#39
sanwei111
closed
2 years ago
2
about dynamicconv_cuda
#38
sanwei111
closed
2 years ago
1
about kernel size
#37
sanwei111
closed
2 years ago
1
TransformerEncoderLayer
#36
sanwei111
closed
2 months ago
5
about the global and local features in fig 3
#35
sanwei111
closed
2 months ago
3
In paragraph 4 of the paper
#34
sanwei111
closed
3 years ago
1
In paragraph 4 of
#33
sanwei111
closed
3 years ago
1
change torch.div to torch.floor_divide @line81
#32
realzza
closed
3 years ago
0
How to measure the FLOPs/MACs?
#31
ranery
closed
3 years ago
2
Cannot reproduce the paper's results when training the transformer from scratch
#30
tomshalini
closed
2 months ago
3
Error while testing the model
#29
tomshalini
closed
3 years ago
8
Missing Data Preparation section for the CNN / DailyMail dataset
#28
cronopioelectronico
closed
3 years ago
1
Fix path in CNNDM test
#27
cronopioelectronico
closed
3 years ago
2
Please share your quantization, quantization+pruning checkpoints
#26
kishorepv
closed
2 months ago
2
Error while evaluating model
#25
kishorepv
closed
3 years ago
9
Export model to ONNX
#24
suyuzhang
closed
2 months ago
1
Transformer model with different parameters
#23
ChuanyangZheng
closed
3 years ago
3
Quantization
#22
zilunpeng
closed
3 years ago
1
What functions are achieved in the cu code? The cu code is too hard for me to understand. Thank you.
#21
guotong1988
closed
3 years ago
1
Why do you recode the cpp code and cu code? What function is necessary?
#20
guotong1988
closed
3 years ago
1
Could you please point out the core code, as there are too many fairseq code. Thank you!
#19
guotong1988
closed
3 years ago
2
Will you release the TensorFlow code in the future?
#18
guotong1988
closed
3 years ago
1
CNN/DM dataset preprocessing (BPE 30K)
#17
Wangt-CN
closed
3 years ago
1
wmt14 en-fr data processing problem
#16
macn3388
closed
3 years ago
3
Is there any link for downloading iwslt14.de-en pretrained model?
#15
macn3388
closed
3 years ago
1
Is there any guidance on preparing the cnndm dataset?
#14
macn3388
closed
3 years ago
1
Data preprocessing
#13
swgu98
closed
3 years ago
1
Model size confusion
#12
zml24
closed
4 years ago
1
Could you share the quantization and pruning scripts? Thank you very much!
#11
fansiawang
closed
4 years ago
2
Fairseq cli fix redirect
#10
chenw23
closed
4 years ago
1
Remove duplicate entries of min-lr
#9
chenw23
closed
4 years ago
1
register_task of abstractive summarization
#8
zhangyongwei756
closed
4 years ago
1
DeprecationWarning
#7
ghost
closed
4 years ago
1
Wrong key_padding_mask position?
#6
godweiyang
closed
3 years ago
3
could you share the tensorboard log file? thank you so much!
#5
luogen1996
closed
2 months ago
1
Model Compression
#4
kalyangvs
closed
4 years ago
5
Applying factorized embedding
#3
asharma20
closed
4 years ago
4
training config for wikitext103
#2
pichuang1984
closed
4 years ago
4
Summarization checkpoint release ?
#1
astariul
closed
4 years ago
1