Closed: jm-glowienke closed this issue 3 years ago
RESOLVED
The XLM-R model does not work directly:
- https://github.com/pytorch/fairseq/issues/1842
- https://github.com/pytorch/fairseq/tree/47fd985269e92735826c05d9160d68dc8e8a9807/examples/cross_lingual_language_model
- https://github.com/pytorch/fairseq/blob/master/examples/roberta/README.glue.md
Reason: the state_dict of the XLM-R checkpoint contains more keys than the state_dict of the newly created model, which causes the assertion error.
Model run failed
NEW TASKS:
Fixed by using --model-overrides and skipping the adaptation of the state_dict to pretrained_model.
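A minimal sketch of why the assertion fails and why non-strict state_dict loading resolves it. The toy modules below are illustrative stand-ins, not the actual fairseq classes (--model-overrides operates at a higher level, but the underlying idea is the same):

```python
import torch.nn as nn

# Toy stand-ins: the pretrained checkpoint carries an extra LM head
# that the freshly built model does not have.
class Pretrained(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(4, 4)
        self.lm_head = nn.Linear(4, 10)  # extra keys not in our model

class Ours(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(4, 4)

ckpt = Pretrained().state_dict()
model = Ours()

# model.load_state_dict(ckpt)  # strict=True raises: Unexpected key(s)

# Non-strict loading ignores the surplus keys instead of asserting:
missing, unexpected = model.load_state_dict(ckpt, strict=False)
print(unexpected)  # includes 'lm_head.weight' and 'lm_head.bias'
```

The surplus keys are reported rather than raised, so the shared encoder weights still load cleanly.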
Resources:
- https://github.com/pytorch/fairseq/blob/master/fairseq/options.py#L468-L475
- https://github.com/pytorch/fairseq/issues/3600
- https://github.com/huggingface/transformers/pull/12082
- https://huggingface.co/transformers/model_doc/mbart.html
- https://github.com/pytorch/fairseq/commit/54423d3b22a3e7f536e02e9e5445cef9becbd60d
- https://github.com/pytorch/fairseq/blob/master/examples/roberta/README.pretraining.md
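For reference, the --model-overrides value is passed on the command line as a string containing a Python dict literal, which fairseq parses (via ast.literal_eval) when loading the checkpoint. A small sketch; the keys below are illustrative, not taken from this issue:

```python
import ast

# e.g. fairseq-generate ... --model-overrides "{'bpe': 'sentencepiece'}"
raw = "{'bpe': 'sentencepiece'}"
overrides = ast.literal_eval(raw)  # safe parsing of the dict literal
print(overrides["bpe"])  # sentencepiece
```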
Training is really slow and runs into memory issues for mBART.
https://tmramalho.github.io/science/2020/06/10/fine-tune-neural-translation-models-with-mBART/
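One common way to relieve the memory pressure is gradient accumulation (what fairseq exposes as --update-freq): run several small forward/backward passes before each optimizer step. A minimal PyTorch sketch of the idea, with toy data:

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
accum_steps = 4  # emulate a 4x larger batch without 4x the memory
w0 = model.weight.detach().clone()

opt.zero_grad()
for step in range(8):
    x = torch.randn(2, 8)
    y = torch.randint(0, 2, (2,))
    # Scale the loss so the accumulated gradient matches a full batch.
    loss = nn.functional.cross_entropy(model(x), y) / accum_steps
    loss.backward()  # gradients accumulate across micro-batches
    if (step + 1) % accum_steps == 0:
        opt.step()       # one update per accum_steps micro-batches
        opt.zero_grad()
```

Combined with --fp16, this is the usual first remedy for out-of-memory errors when fine-tuning large models like mBART.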
RESOLVED: A similar issue with a pre-trained model is described here: https://github.com/pytorch/fairseq/issues/3530
This was helpful: https://github.com/pytorch/fairseq/blob/master/examples/stories/README.md
Check why <mask> is present in translations, then run model training.
--> Difficult: the mask token is added to the source dictionary somewhere in task.setup_task, but it is hard to trace due to the use of override methods.
30 epochs of training result in a BLEU score of only roughly 8. <mask> is still present in the output, and the output quality is poor.
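Until the root cause in the dictionary setup is fixed, stray <mask> tokens can be stripped from the hypotheses as a post-processing workaround. This is a hypothetical helper, not part of fairseq:

```python
def strip_special(hypo: str, specials=("<mask>",)) -> str:
    """Drop special tokens that leaked into generated output."""
    return " ".join(tok for tok in hypo.split() if tok not in specials)

print(strip_special("the <mask> cat sat <mask> down"))  # the cat sat down
```

This hides the symptom only; the model still wastes probability mass on <mask>, which likely contributes to the low BLEU score.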
Use some kind of BERT model available through fairseq. These are made for language modelling and can hence be used as the encoder of the model.
Possible Challenges:
- How to adapt the dictionary?
- Reset training metrics
- Have to adapt the model itself to delete or reset the embeddings
- Must use Moses as the tokenizer

Resources:
- https://github.com/pytorch/fairseq/tree/master/examples/xlmr
- https://github.com/pytorch/fairseq/tree/master/examples/cross_lingual_language_model
- https://github.com/pytorch/fairseq/blob/master/examples/roberta/README.pretraining.md
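For the embedding-adaptation point, a hedged sketch of resizing an embedding table when the dictionary changes: pretrained rows that still exist are kept, rows for new symbols keep fresh initialisation. Toy code, not the fairseq API:

```python
import torch
import torch.nn as nn

def resize_embeddings(old: nn.Embedding, new_vocab_size: int) -> nn.Embedding:
    # Copy over the pretrained rows that survive the dictionary change;
    # rows for newly added symbols keep their random initialisation.
    new = nn.Embedding(new_vocab_size, old.embedding_dim)
    n = min(old.num_embeddings, new_vocab_size)
    with torch.no_grad():
        new.weight[:n] = old.weight[:n]
    return new

emb = nn.Embedding(100, 16)             # original dictionary size
resized = resize_embeddings(emb, 120)   # dictionary grew by 20 symbols
```

This assumes the surviving symbols keep their original indices; if the new dictionary reorders them, the copy would need an explicit old-index to new-index mapping instead.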
Tasks: