machelreid / subformer

The code for the Subformer, from the EMNLP 2021 Findings paper "Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers" by Machel Reid, Edison Marrese-Taylor, and Yutaka Matsuo.
https://arxiv.org/abs/2101.00234
MIT License

How to reproduce the result of abstractive summarization? #2

Closed: minjieyuan closed this issue 3 years ago

minjieyuan commented 3 years ago

Dear Subformer authors, hi! Thanks for sharing your code! I want to reproduce the abstractive summarization results, but I'm confused about how to set the training parameters. I used the same scripts as in the Training section, but the results are poor. Could you kindly provide the scripts for the summarization task? Thank you very much!

machelreid commented 3 years ago

Hi @minjieyuan!

I've improved the readme with the instructions for summarization. Let me know what you think!

Best, Machel

minjieyuan commented 3 years ago

@machelreid Thank you!

minjieyuan commented 3 years ago

@machelreid Sorry, I'm still getting an error. The traceback is as follows:

File "/path/to/subformer/fairseq/modules/subformer_layer.py", line 456, in forward query=x * self.hadamard_encoder_attn[0], RuntimeError: The size of tensor a (512) must match the size of tensor b (1024) at non-singleton dimension 2

Could you help please?

machelreid commented 3 years ago

Should be fixed!
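
For anyone who hits this later: the mismatch means the Hadamard scaling parameter was created with a different dimensionality than the tensor it multiplies. Below is a sketch of the kind of fix, assuming the parameter should simply match the input's last dimension (class and variable names are illustrative, not the actual patch):

```python
import torch
import torch.nn as nn

class HadamardScale(nn.Module):
    """Illustrative module: a learned per-feature (Hadamard) scaling vector."""

    def __init__(self, embed_dim: int):
        super().__init__()
        # Size the scaling vector to the incoming feature dimension so that
        # x * scale broadcasts cleanly over (seq_len, batch, embed_dim).
        self.hadamard_encoder_attn = nn.ParameterList(
            [nn.Parameter(torch.ones(embed_dim))]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.hadamard_encoder_attn[0]

layer = HadamardScale(embed_dim=512)
out = layer(torch.randn(10, 2, 512))  # trailing dims now match: 512 == 512
```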