It seems that you haven't changed the following two parameters.
"stopping_criterion":"valid_en-fr_mt_bleu,10",
"validation_metrics":"valid_en-fr_mt_bleu",
In your case, if your two languages are abbreviated lang1 and lang2, and you want to use the BLEU metric (accuracy, perplexity and loss are also available) as a stopping criterion and to validate your models, you have to set:
"stopping_criterion":"valid_lang1-lang2_mt_bleu,10",
"validation_metrics":"valid_lang1-lang2_mt_bleu",
I should mention that the framework supports multilingual translation, so lang1 and lang2 can be chosen from a larger set of languages.
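For instance, with a hypothetical German-English pair (de, en), the same pattern gives:

```json
"stopping_criterion":"valid_de-en_mt_bleu,10",
"validation_metrics":"valid_de-en_mt_bleu",
```

The number after the comma is a patience value: training stops once the metric has failed to improve that many times in a row.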
Is this okay?
Yes, that is completely fine. I managed to train with an MLM + TLM objective. I wanted the BLEU score for evaluation as well, so I had to change my language initials to match the code.
I have one question on how to use the trained model for translation. translate.py is the one I should be using, right? Does it take tokenized and BPE-encoded text as input?
Thanks so much for your help
No, translate.py is for inference.
To train a machine translation model, always use train.py and specify the "mt_steps" objective (see /configs/mt_template.json).
For example, if you want to translate from English (en) to French (fr), then set "lgs": "en-fr" and "mt_steps": "en-fr".
Note that the system is multilingual, so it is bidirectional for a pair of languages: you can simultaneously train a model to translate from en to fr and from fr to en by specifying "lgs": "en-fr" and "mt_steps": "en-fr,fr-en".
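For instance, a minimal sketch of such a config fragment (the full set of required parameters is in /configs/mt_template.json; anything not shown here is left to the template):

```json
{
  "lgs": "en-fr",
  "mt_steps": "en-fr,fr-en",
  "stopping_criterion": "valid_en-fr_mt_bleu,10",
  "validation_metrics": "valid_en-fr_mt_bleu"
}
```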
You can go further by translating several languages simultaneously. Let's add German (de) and Italian (it) to our previous languages. Then you can do "lgs": "en-fr-de-it" and "mt_steps": "...". In this case, mt_steps (...) will be replaced by all possible combinations of your languages: en-fr,en-de,en-it,fr-en,fr-de,fr-it,de-en, etc. (this gets long to specify manually as the number of languages increases).
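To make the expansion concrete, here is a small Python sketch of what "..." stands for, assuming it enumerates every ordered pair as described above:

```python
from itertools import permutations

langs = "en-fr-de-it".split("-")
# every ordered pair of distinct languages, joined in the mt_steps format
mt_steps = ",".join(f"{a}-{b}" for a, b in permutations(langs, 2))
print(mt_steps)
# en-fr,en-de,en-it,fr-en,fr-de,fr-it,de-en,de-fr,de-it,it-en,it-fr,it-de
```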
Note that the system is multi-tasking, so you can simultaneously do clm (causal language modeling), mlm (masked language modeling), tlm (translation language modeling), ae (denoising auto-encoding), bt (online back-translation) and mt (machine translation).
ae + bt = unsupervised mt
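As a config sketch of that recipe (the step formats here follow the upstream XLM repository: ae_steps takes single languages, bt_steps takes src-tgt-src triplets):

```json
"lgs": "en-fr",
"ae_steps": "en,fr",
"bt_steps": "en-fr-en,fr-en-fr"
```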
If you need to understand how all this works, you can refer (if you have not already done so) to these papers:
(ae) Extracting and Composing Robust Features with Denoising Autoencoders : https://www.cs.toronto.edu/~larocheh/publications/icml-2008-denoising-autoencoders.pdf
(bt) Improving Neural Machine Translation Models with Monolingual Data : https://arxiv.org/abs/1511.06709
(ae, bt, mt : supervised and unsupervised mt) Phrase-Based & Neural Unsupervised Machine Translation : https://arxiv.org/abs/1804.07755
(mlm) BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding : https://arxiv.org/abs/1810.04805
(clm) GPT/GPT-2/GPT-3
(mlm, tlm, clm, multi-lingual & cross-lingual mt, both supervised and unsupervised ...) Cross-lingual Language Model Pretraining : https://arxiv.org/abs/1901.07291
(meta-learning) Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks : https://arxiv.org/abs/1703.03400
(our paper : all this + meta-learning) On the use of linguistic similarities to improve Neural Machine Translation for African Languages : https://openreview.net/forum?id=Q5ZxoD2LqcI (the updated version will be on arXiv soon)
For another project I'm working on, I integrated a new architecture into the code, TIM (Transformers with Competitive Ensembles of Independent Mechanisms: https://arxiv.org/abs/2103.00336), which can be used in place of the standard transformer. I also integrated code to automatically fine-tune models on text classification tasks (GLUE, XNLI, custom tasks, ...). All these updates are here; I will make everything public with another paper.
I'm trying to reproduce all this with the Hugging Face transformers library: https://github.com/Tikquuss/lm
Thanks a lot for your help. That is quite thorough.
I already used lm_template.json to train a language model with the parameter "mlm_steps": "...". As per your GitHub, this by default uses my monolingual and parallel datasets (de, en, de-en). Then I used this language model and trained with mt_template.json and the parameter "mt_steps": "...". I believe that now I have an MT model for my languages, right?
Now if I want to use it on new test sets for inference, do I use the translate.py? Could you give a hint on how to use it?
Thank you so much again and I will definitely check your new project and best of luck with your future paper.
I am trying to use translate.py for inference, but I get the following error:
```
Traceback (most recent call last):
  File "translate.py", line 141, in
```
I am not sure what I am doing wrong. My command on the command line is as follows:
```
cat /user/HS301/m16265/Documents/XML-R/processed/test.en | python translate.py --exp_name mt_enfrde --model_path /user/HS301/m16265/Documents/XML-R/dump_path/mt_enfrde/demo/best-valid_de-en_mt_bleu.pth --src_lang en --tgt_lang de --output_path output
```
Can you help with this error please if you have any suggestions for solving it? Thank you
Use translate_our.py instead (see https://github.com/Tikquuss/meta_XLM/blob/master/XLM/translate_our.py#L115 for how to use)
Thank you for your help, but I get this error with translate_our.py:
```
Traceback (most recent call last):
  File "translate_our.py", line 175, in
```
I noticed that you have a device variable in translate_our.py (https://github.com/Tikquuss/meta_XLM/blob/master/XLM/translate_our.py#L40). Should I be doing something before running translate_our.py? Thank you for your help.
Go to line 34 of translate_our.py (before logger = initialize_exp(params)) and set up the device; for example, you can add this line of code:

```python
params.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
```

If other parameters are missing, just do the same for them. All parameters are well described in train.py (I encourage you to understand the code well so that you can make some adjustments yourself).
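For context, the patched region would read like this (assuming torch is already imported at the top of translate_our.py, as it is in translate.py):

```python
# translate_our.py, around line 34
params.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')  # added line
logger = initialize_exp(params)  # the existing line that follows
```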
Thank you for the demo notebook. I have trained my MLM+TLM model, but I get this error during the mt_template.json training stage:
```
Traceback (most recent call last):
  File "train.py", line 816, in <module>
    main(params)
  File "train.py", line 554, in main
    end_of_epoch(trainer = trainer, evaluator = evaluator, params = params, logger = logger)
  File "train.py", line 441, in end_of_epoch
    trainer.end_epoch(scores)
  File "/content/meta_XLM/XLM/src/trainer.py", line 736, in end_epoch
    assert metric in scores, metric
AssertionError: valid_en-fr_mt_bleu
```
Could you please help with why this error is happening, given that I am not training on en or fr?