facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
MIT License

Use dropout in roberta_enc_dec arch #3497

Open louismartin opened 3 years ago

louismartin commented 3 years ago

I'm trying to train a seq2seq model using a pretrained CamemBERT encoder. Stupid question, but is there a reason why we can't use --dropout with the roberta_enc_dec architecture? https://github.com/pytorch/fairseq/blob/801a64683164680562c77b688d9ca77fc3e0cea7/fairseq/models/roberta/enc_dec.py#L20 I didn't check thoroughly, so I might have missed something.

Repro

Command:

fairseq-train /private/home/louismartin/dev/text-simplification/resources/datasets/_a94fa8d6125989eecba66b625ac9a032/fairseq_preprocessed_complex-simple --task translation --source-lang complex --target-lang simple --save-dir /private/home/louismartin/dev/text-simplification/experiments/fairseq/local_1619010533994/checkpoints --update-freq 128 --no-epoch-checkpoints --save-interval 999999 --validate-interval 999999 --max-update 50000 --save-interval-updates 100 --keep-interval-updates 1 --patience 10 --max-sentences 64 --seed 933 --distributed-world-size 1 --distributed-port 15952 --fp16 --arch 'roberta_enc_dec' --pretrained-mlm-checkpoint /private/home/louismartin/tmp/camembert-base/model.pt --max-tokens 4096 --lr 3e-05 --warmup-updates 500 --truncate-source --share-all-embeddings --share-decoder-input-output-embed --reset-optimizer --reset-dataloader --reset-meters --required-batch-size-multiple 1 --criterion 'label_smoothed_cross_entropy' --label-smoothing 0.1 --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.01 --optimizer 'adam' --adam-betas '(0.9, 0.999)' --adam-eps 1e-08 --clip-norm 0.1 --lr-scheduler 'polynomial_decay' --skip-invalid-size-inputs-valid-test --find-unused-parameters --total-num-update 50000

Error:

error: unrecognized arguments: --dropout 0.1 --attention-dropout 0.1

@Celebio @gwenzek
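
For context on the error: fairseq builds the command-line parser from the selected architecture's add_args, and roberta_enc_dec only registers its own flags (--pretrained-mlm-checkpoint, --share-all-embeddings, etc.), so argparse rejects --dropout and --attention-dropout. Below is a minimal sketch of one possible workaround as a --user-dir plugin. It assumes the class in fairseq/models/roberta/enc_dec.py is named RobertaEncDecModel and that its registered base architecture function is base_enc_dec_architecture; the roberta_enc_dec_dropout name and the plugin file are hypothetical, names may differ across fairseq versions, and whether the values actually propagate into the encoder/decoder layers would still need to be verified.

```python
# my_plugins/roberta_enc_dec_dropout.py -- hypothetical --user-dir plugin.
# Sketch only: the imported class/function names are assumed from
# fairseq/models/roberta/enc_dec.py and may differ across fairseq versions.
from fairseq.models import register_model, register_model_architecture
from fairseq.models.roberta.enc_dec import RobertaEncDecModel, base_enc_dec_architecture


@register_model("roberta_enc_dec_dropout")
class RobertaEncDecDropoutModel(RobertaEncDecModel):
    @staticmethod
    def add_args(parser):
        # Keep the flags the stock arch already registers
        # (--pretrained-mlm-checkpoint, --share-all-embeddings, ...).
        RobertaEncDecModel.add_args(parser)
        # Register the flags that currently trigger
        # "error: unrecognized arguments: --dropout ... --attention-dropout ...".
        parser.add_argument("--dropout", type=float, metavar="D",
                            help="dropout probability")
        parser.add_argument("--attention-dropout", type=float, metavar="D",
                            help="dropout probability for attention weights")


@register_model_architecture("roberta_enc_dec_dropout", "roberta_enc_dec_dropout")
def roberta_enc_dec_dropout_architecture(args):
    # Fall back to defaults when the flags are not given, then defer to the
    # stock roberta_enc_dec architecture for everything else.
    args.dropout = getattr(args, "dropout", 0.1)
    args.attention_dropout = getattr(args, "attention_dropout", 0.0)
    base_enc_dec_architecture(args)
```

With such a plugin on disk, the repro command above would be run with --user-dir my_plugins and --arch 'roberta_enc_dec_dropout' in place of --arch 'roberta_enc_dec'.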

stale[bot] commented 3 years ago

This issue has been automatically marked as stale. If this issue is still affecting you, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize it yet. If you have any new additional information, please include it with your comment!