machelreid / subformer

The code for the Subformer, from the EMNLP 2021 Findings paper: "Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers", by Machel Reid, Edison Marrese-Taylor, and Yutaka Matsuo
https://arxiv.org/abs/2101.00234
MIT License

ModuleNotFoundError: No module named 'fairseq.data.multilingual_denoising_dataset' #3

Closed qianlou closed 3 years ago

qianlou commented 3 years ago

Dear subformer authors,

After I successfully installed PyYAML-5.4.1 antlr4-python3-runtime-4.8 cffi-1.14.5 cython-0.29.23 fairseq hydra-core-1.0.6 importlib-resources-5.1.2 numpy-1.20.2 omegaconf-2.0.6 portalocker-2.0.0 pycparser-2.20 regex-2021.4.4 sacrebleu-1.5.1 torch-1.8.1 tqdm-4.60.0 typing-extensions-3.7.4.3, I tried to run the training script you provided for machine translation. However, I came across a ModuleNotFoundError for fairseq.data.multilingual_denoising_dataset. Do you know how to solve this issue? Thanks for your help!

Best, Qian

qianlou commented 3 years ago

I tried the solution listed here https://github.com/pytorch/fairseq/issues/2133. But it does not work.

machelreid commented 3 years ago

Nice find! This was a bug on my part: I've removed it. Let me know if it works.

qianlou commented 3 years ago

Thanks for your reply. The module-not-found error is solved, but there are some new issues. It seems that the training script in the readme does not match the code:

Here is an error: train.py: error: argument --criterion: invalid choice: 'label_smoothed_length_cross_entropy' (choose from 'masked_lm', 'sentence_ranking', 'adaptive_loss', 'model', 'nat_loss', 'ctc', 'cross_entropy', 'label_smoothed_cross_entropy', 'wav2vec', 'sentence_prediction', 'composite_loss', 'label_smoothed_cross_entropy_with_alignment', 'legacy_masked_lm_loss', 'vocab_parallel_cross_entropy')

After I changed '--criterion label_smoothed_length_cross_entropy' to '--criterion label_smoothed_cross_entropy', the above issue was solved. However, I ran into a new issue: unrecognized arguments: --min-lr 1e-9.

Could you kindly help me check the training script so that I can use it for directly training? Thanks for your help!

minjieyuan commented 3 years ago


I used '--stop-min-lr' in place of '--min-lr'. I think they mean the same thing.

machelreid commented 3 years ago

Yeah, they do. Thanks @minjieyuan! fairseq changed its config library and I forgot to update this part of the readme (I've changed it now).

Also, the loss should be label_smoothed_cross_entropy (I've updated this part of the readme as well).
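Putting both fixes together, the relevant part of a training command would look something like the sketch below. This is only an illustration: the data path, architecture name, and the other hyperparameters are placeholders, not the repo's actual script; the two flags that changed are --criterion and --stop-min-lr.

```shell
# Sketch only: data path, --arch, and the remaining hyperparameters
# are placeholders, not taken from the Subformer readme.
fairseq-train data-bin/wmt14_en_de \
    --arch transformer \
    --optimizer adam --lr 5e-4 \
    --label-smoothing 0.1 \
    --criterion label_smoothed_cross_entropy \
    --stop-min-lr 1e-9
```

The two corrections are: use the stock label_smoothed_cross_entropy criterion (label_smoothed_length_cross_entropy is not registered in this fairseq version), and pass --stop-min-lr instead of the removed --min-lr flag.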

Cheers!