facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
MIT License

[data2vec] different config in released model[cfg] and default config #4323

Closed — ddlBoJack closed this issue 2 years ago

ddlBoJack commented 2 years ago

Hi, I am trying to reproduce data2vec on speech.

I found that the config of the released model is inconsistent with what is stated in the documentation, and some fields, such as `diversity_weight`, are not declared in the code at all. Could you please provide a new config or update the code? Many thanks!

@alexeib
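One way to see this mismatch concretely is to diff the config stored inside the released checkpoint against the defaults. A minimal sketch (the checkpoint path and the toy dicts below are placeholders; fairseq checkpoints do store their config under the `"cfg"` key of the saved state dict):

```python
def diff_configs(released, default):
    """Return a dict of keys whose values differ between the two configs,
    using "<missing>" for keys absent on one side."""
    keys = set(released) | set(default)
    out = {}
    for k in sorted(keys):
        a = released.get(k, "<missing>")
        b = default.get(k, "<missing>")
        if a != b:
            out[k] = (a, b)
    return out

# In practice the released config would come from the checkpoint, e.g.:
#   state = torch.load("model.pt", map_location="cpu")
#   released = state["cfg"]["model"]
# The values below are illustrative only.
released = {"ema_layers_only": False, "diversity_weight": 0.1}
default = {"ema_layers_only": True}
print(diff_configs(released, default))
```

Fields that appear only on the `released` side (like `diversity_weight` here) are exactly the ones with no matching declaration in the code.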

ddlBoJack commented 2 years ago

I also found that you set `ema_layers_only: false` in the released model, which means EMA is applied to the whole transformer, including the positional encoding. Is this setting better than the one described in the paper (sharing the positional encoding)? Thanks a lot.
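For context, the flag controls which parameters the teacher tracks by EMA versus copies directly from the student. A minimal sketch over plain name-to-value dicts (the `"encoder.layers."` prefix is an assumption for illustration, not fairseq's exact naming):

```python
def ema_update(teacher, student, decay, layers_only):
    """One EMA step over parameter dicts (name -> float, for illustration).

    With layers_only=True, only parameters under the (assumed)
    "encoder.layers." prefix are EMA-averaged; everything else, such as
    the positional encoding, is copied from the student, i.e. shared
    rather than averaged."""
    for name, p in student.items():
        if layers_only and not name.startswith("encoder.layers."):
            teacher[name] = p  # shared with the student, not averaged
        else:
            teacher[name] = decay * teacher[name] + (1 - decay) * p
    return teacher
```

With `layers_only=False` (as in the released config), the positional encoding is averaged into the teacher as well, which is the discrepancy being asked about.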

kabouzeid commented 2 years ago

> I also found that you set `ema_layers_only: false` in the released model, which means EMA is applied to the whole transformer, including the positional encoding. Is this setting better than the one described in the paper (sharing the positional encoding)? Thanks a lot.

did you come up with an answer for this?

ddlBoJack commented 2 years ago

> > I also found that you set `ema_layers_only: false` in the released model, which means EMA is applied to the whole transformer, including the positional encoding. Is this setting better than the one described in the paper (sharing the positional encoding)? Thanks a lot.
>
> did you come up with an answer for this?

Refer to this issue: https://github.com/facebookresearch/fairseq/issues/4342

kabouzeid commented 2 years ago

Thanks! So `ema_layers_only: true` is the way to go.
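For anyone landing here, the takeaway can be applied as a small override in the model section of the Hydra training config (a sketch; the surrounding keys depend on your config file):

```yaml
model:
  ema_layers_only: true  # EMA only the transformer layers; share the positional encoding
```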