machelreid/subformer
The code for the Subformer, from the EMNLP 2021 Findings paper: "Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers", by Machel Reid, Edison Marrese-Taylor, and Yutaka Matsuo
https://arxiv.org/abs/2101.00234
MIT License
14 stars, 3 forks
issues
Could you release the summary of CNN/Daily Mail decoded by the model?
#5 · wxdwlai · opened 2 years ago · 0 comments
Shared weights update during backpropagation
#4 · qianlou · opened 3 years ago · 0 comments
ModuleNotFoundError: No module named 'fairseq.data.multilingual_denoising_dataset'
#3 · qianlou · closed 3 years ago · 5 comments
How to reproduce the result of abstractive summarization?
#2 · minjieyuan · closed 3 years ago · 4 comments
Core codes for the sandwich weight sharing
#1 · qianlou · closed 3 years ago · 2 comments