facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
MIT License

[data2vec] different config between released model[cfg] and default config #4342

Closed: ddlBoJack closed this issue 2 years ago

ddlBoJack commented 2 years ago

Hi, I am trying to reproduce data2vec on speech.

I found that the config of the released model you provided is inconsistent with what is stated in the documentation, and some fields, such as "diversity_weight", are not declared in the code. Could you please provide a new config or update the code?

I also noticed that you set ema_layers_only: false in the released model, which means EMA is applied to the whole transformer, including the positional encoding. Does this setting work better than what the paper describes (sharing the positional encoding)?

Many Thanks!

@alexeib
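One way to pin down a mismatch like the one described above is to diff the config stored in the released checkpoint against the official yaml. A minimal sketch, treating both configs as plain nested dicts; the checkpoint path, dict layout, and the toy values below are assumptions, not the real released config:

```python
# Sketch: diff two (possibly nested) config dicts to find keys that exist
# only in the checkpoint's cfg, only in the yaml, or that hold different values.

def flatten(cfg, prefix=""):
    """Flatten a nested dict into {'section.key': value} form."""
    flat = {}
    for key, value in cfg.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat

def diff_configs(ckpt_cfg, yaml_cfg):
    """Return (keys only in checkpoint, keys only in yaml, keys with differing values)."""
    a, b = flatten(ckpt_cfg), flatten(yaml_cfg)
    only_ckpt = sorted(set(a) - set(b))
    only_yaml = sorted(set(b) - set(a))
    changed = sorted(k for k in set(a) & set(b) if a[k] != b[k])
    return only_ckpt, only_yaml, changed

# Toy stand-ins for the real configs; "diversity_weight" is the undeclared
# field mentioned above, and the value 0.1 is purely illustrative.
ckpt = {"model": {"ema_layers_only": False, "diversity_weight": 0.1}}
yaml = {"model": {"ema_layers_only": True}}
only_ckpt, only_yaml, changed = diff_configs(ckpt, yaml)
print(only_ckpt)  # keys present only in the checkpoint cfg
print(changed)    # keys whose values differ between checkpoint and yaml

# To run this on a real checkpoint (path assumed), something like:
# import torch
# state = torch.load("/path/to/released_model.pt", map_location="cpu")
# ckpt_cfg = dict(state["cfg"])  # layout may vary between fairseq versions
```

This makes every experimental or renamed field visible at once, instead of discovering them one load error at a time.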

FlorentMeyer commented 2 years ago

Hi,

I can't find the "diversity_weight" you are referring to anywhere in the fairseq codebase.

Also, may I ask where you found this setting of ema_layers_only: false? In both Data2VecAudioConfig and base_librispeech.yaml it is set to true (and it is not overridden by the CLI commands in the readme either).

alexeib commented 2 years ago

Hi, the released model was trained in a separate branch with a bunch of experimental settings that have not been released. We updated the model config so it can be loaded by the released code. Please use the official yaml config files to reproduce the results; you should get very close to what the released model obtains.
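"Updated the model config so it can be loaded" usually boils down to dropping fields that the released config dataclasses don't declare. A minimal sketch of that filtering step, using a toy stand-in for a fairseq config class such as Data2VecAudioConfig; the class, its fields, and the raw values here are illustrative assumptions, not the real released config:

```python
import dataclasses

# Toy stand-in for a released fairseq config dataclass (the real
# Data2VecAudioConfig lives in fairseq and declares many more fields).
@dataclasses.dataclass
class ToyAudioConfig:
    ema_layers_only: bool = True
    loss_beta: float = 0.0

def filter_to_declared(cfg_dict, config_cls):
    """Drop keys the released dataclass does not declare (e.g. experimental
    ones like 'diversity_weight'), so the config can be instantiated."""
    declared = {f.name for f in dataclasses.fields(config_cls)}
    return {k: v for k, v in cfg_dict.items() if k in declared}

# A raw checkpoint cfg carrying an experimental, undeclared field:
raw = {"ema_layers_only": False, "diversity_weight": 0.1}
cfg = ToyAudioConfig(**filter_to_declared(raw, ToyAudioConfig))
print(cfg)  # diversity_weight is gone; ema_layers_only keeps the checkpoint value
```

Declared fields keep the values stored in the checkpoint, while anything the released code never defined is silently dropped, which is why such fields show up in the saved cfg but nowhere in the codebase.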

ddlBoJack commented 2 years ago

Got it. Thanks!