facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
MIT License

Use dropout in roberta_enc_dec arch #3497

Open louismartin opened 3 years ago

louismartin commented 3 years ago

I'm trying to train a seq2seq model using a pretrained CamemBERT encoder. Stupid question, but is there a reason why we can't use --dropout with the roberta_enc_dec architecture? https://github.com/pytorch/fairseq/blob/801a64683164680562c77b688d9ca77fc3e0cea7/fairseq/models/roberta/enc_dec.py#L20 I didn't check thoroughly, so I might have missed something.

Repro

Command:

fairseq-train /private/home/louismartin/dev/text-simplification/resources/datasets/_a94fa8d6125989eecba66b625ac9a032/fairseq_preprocessed_complex-simple --task translation --source-lang complex --target-lang simple --save-dir /private/home/louismartin/dev/text-simplification/experiments/fairseq/local_1619010533994/checkpoints --update-freq 128 --no-epoch-checkpoints --save-interval 999999 --validate-interval 999999 --max-update 50000 --save-interval-updates 100 --keep-interval-updates 1 --patience 10 --max-sentences 64 --seed 933 --distributed-world-size 1 --distributed-port 15952 --fp16 --arch 'roberta_enc_dec' --pretrained-mlm-checkpoint /private/home/louismartin/tmp/camembert-base/model.pt --max-tokens 4096 --lr 3e-05 --warmup-updates 500 --truncate-source --share-all-embeddings --share-decoder-input-output-embed --reset-optimizer --reset-dataloader --reset-meters --required-batch-size-multiple 1 --criterion 'label_smoothed_cross_entropy' --label-smoothing 0.1 --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.01 --optimizer 'adam' --adam-betas '(0.9, 0.999)' --adam-eps 1e-08 --clip-norm 0.1 --lr-scheduler 'polynomial_decay' --skip-invalid-size-inputs-valid-test --find-unused-parameters --total-num-update 50000

Error:

error: unrecognized arguments: --dropout 0.1 --attention-dropout 0.1

@Celebio @gwenzek
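
For context on the error: fairseq builds the command-line parser from the selected architecture's add_args, and roberta_enc_dec only registers its own flags (--pretrained-mlm-checkpoint, --share-all-embeddings, etc.), so argparse rejects --dropout and --attention-dropout. Below is a minimal sketch of one possible workaround as a --user-dir plugin. It assumes the class in fairseq/models/roberta/enc_dec.py is named RobertaEncDecModel and that its registered base architecture function is base_enc_dec_architecture; the roberta_enc_dec_dropout name and the plugin file are hypothetical, names may differ across fairseq versions, and whether the values actually propagate into the encoder/decoder layers would still need to be verified.

```python
# my_plugins/roberta_enc_dec_dropout.py -- hypothetical --user-dir plugin.
# Sketch only: the imported class/function names are assumed from
# fairseq/models/roberta/enc_dec.py and may differ across fairseq versions.
from fairseq.models import register_model, register_model_architecture
from fairseq.models.roberta.enc_dec import RobertaEncDecModel, base_enc_dec_architecture


@register_model("roberta_enc_dec_dropout")
class RobertaEncDecDropoutModel(RobertaEncDecModel):
    @staticmethod
    def add_args(parser):
        # Keep the flags the stock arch already registers
        # (--pretrained-mlm-checkpoint, --share-all-embeddings, ...).
        RobertaEncDecModel.add_args(parser)
        # Register the flags that currently trigger
        # "error: unrecognized arguments: --dropout ... --attention-dropout ...".
        parser.add_argument("--dropout", type=float, metavar="D",
                            help="dropout probability")
        parser.add_argument("--attention-dropout", type=float, metavar="D",
                            help="dropout probability for attention weights")


@register_model_architecture("roberta_enc_dec_dropout", "roberta_enc_dec_dropout")
def roberta_enc_dec_dropout_architecture(args):
    # Fall back to defaults when the flags are not given, then defer to the
    # stock roberta_enc_dec architecture for everything else.
    args.dropout = getattr(args, "dropout", 0.1)
    args.attention_dropout = getattr(args, "attention_dropout", 0.0)
    base_enc_dec_architecture(args)
```

With such a plugin on disk, the repro command above would be run with --user-dir my_plugins and --arch 'roberta_enc_dec_dropout' in place of --arch 'roberta_enc_dec'.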

stale[bot] commented 3 years ago

This issue has been automatically marked as stale. If this issue is still affecting you, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize it yet. If you have any new additional information, please include it with your comment!