facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
MIT License

Some bugs in the model's architecture #3761

Closed trestad closed 3 years ago

trestad commented 3 years ago

🐛 Bug

To Reproduce

Steps to reproduce the behavior (always include the command you ran):

I found that when I install fairseq from the master branch (`git clone https://github.com/pytorch/fairseq`), the architecture presets are not applied. For example, in `transformer_iwslt_de_en` the `--encoder-ffn-embed-dim` and `--decoder-ffn-embed-dim` parameters should default to 1024 (see the preset sketch after the steps below), but even though I passed `--arch transformer_iwslt_de_en` on the train command, I got a model with the `base_architecture` defaults, where `--encoder-ffn-embed-dim` is 2048. When I use the stable release instead, this bug never shows up. I hope you can fix this.

  1. See error: the training log prints the `base_architecture` values, e.g. `encoder_ffn_embed_dim=2048` (screenshots omitted).


freewym commented 3 years ago

I have the same problem with the master branch: command-line arguments for the Transformer architecture do not seem to override the default values. I think this bug was introduced in commit https://github.com/pytorch/fairseq/commit/129d8594ccdc6644be84dc249e16489e049f4bfd