microsoft / MASS

MASS: Masked Sequence to Sequence Pre-training for Language Generation
https://arxiv.org/pdf/1905.02450.pdf

get-data-nmt.py language direction error #54

Open 520jefferson opened 5 years ago

520jefferson commented 5 years ago

I want to fine-tune the en-de model, so I used the command below to generate the data:
./get-data-nmt.sh --src en --tgt de --reload_codes model_ende/codes_ende --reload_vocab model_ende/vocab_ende

But I hit an error at this line: if [ "$SRC" \> "$TGT" ]; then echo "please ensure SRC < TGT"; exit; fi

Then I used --src de --tgt en instead, and the script ran successfully. If I then fine-tune from mass_ende_1024.pth, will the language direction affect my results? Why does the script require $SRC < $TGT?

Also, the codes and vocab are generated in /data/process/de-en even though I already set --reload_codes and --reload_vocab. That seems weird!
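On the `SRC < TGT` check quoted above: a minimal sketch, assuming the script only uses the check to enforce one canonical, alphabetically ordered name for each language pair (which would explain why the output lands in a `de-en` directory either way). The helper names `canonical_pair` and `processed_dir_name` are hypothetical and are not part of MASS.

```python
from typing import Tuple


def canonical_pair(src: str, tgt: str) -> Tuple[str, str]:
    """Return the two language codes in alphabetical order (hypothetical helper)."""
    if src == tgt:
        raise ValueError("source and target cannot be identical")
    first, second = sorted((src, tgt))
    return first, second


def processed_dir_name(src: str, tgt: str) -> str:
    """Build a directory name such as 'de-en' that does not depend on argument order."""
    first, second = canonical_pair(src, tgt)
    return f"{first}-{second}"


# "en" sorts after "de", which is the condition the shell check rejects.
print("en" > "de")                     # True
print(processed_dir_name("en", "de"))  # de-en
print(processed_dir_name("de", "en"))  # de-en
```

Under that assumption, the order passed to the data script only names the files, and the translation direction would be chosen later by the training flags.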

520jefferson commented 5 years ago

After generating the data with get-data-nmt.sh (--src de --tgt en), I fine-tuned with:
python3 train.py --exp_name unsupMT_ende \
--data_path ./data/processed/de-en \
--lgs en-de \
--bt_steps en-de-en,de-en-de \
--encoder_only false --emb_dim 1024 --n_layers 6 --n_heads 8 --dropout 0.1 --attention_dropout 0.1 --gelu_activation true --tokens_per_batch 2000 --batch_size 32 --bptt 256 --optimizer adam_inverse_sqrt,beta1=0.9,beta2=0.98,lr=0.0001 --epoch_size 200000 --max_epoch 30 --eval_bleu true \
--reload_model ./model_ende/mass_ende_1024.pth,./model_ende/mass_ende_1024.pth

But I got this error: [image]

520jefferson commented 5 years ago

Why do we need "please ensure SRC < TGT"? I used the updated model with --src de --tgt en, and with that I can fine-tune en-de. But I still cannot translate directly with the mass_ft_ende model; maybe I should save all the params myself.

  File "translate.py", line 160, in <module>
    main(params)
  File "translate.py", line 77, in main
    setattr(params, name, getattr(model_params, name))
AttributeError: 'AttrDict' object has no attribute 'bos_index'
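The traceback above says that the params stored with the model have no `bos_index`. A minimal sketch of how one might list every missing param before the `setattr` loop fails on the first one. The `AttrDict` class here is a stand-in written for this example, and the `REQUIRED` list is an assumption: only `bos_index` is confirmed by the traceback.

```python
class AttrDict(dict):
    """Stand-in for the params object: a dict whose keys read as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(f"'AttrDict' object has no attribute '{name}'")


# Assumed list of names copied from the model params; only 'bos_index'
# is known from the traceback, the others are illustrative guesses.
REQUIRED = ["bos_index", "eos_index", "pad_index", "unk_index"]


def missing_params(model_params, required):
    """Return the required names that the saved params do not provide."""
    return [name for name in required if not hasattr(model_params, name)]


# Params resembling a checkpoint saved without the index fields.
saved = AttrDict(emb_dim=1024, n_layers=6)
print(missing_params(saved, REQUIRED))
# ['bos_index', 'eos_index', 'pad_index', 'unk_index']
```

Running a check like this on the reloaded params shows whether the problem is limited to `bos_index` or affects several fields at once.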

520jefferson commented 5 years ago

I used de-en to generate the data and en-de to train, starting from the newly updated en-de model.

But translate.py still can't translate with the en-de model. Maybe I need to save all the parameters.

StillKeepTry commented 5 years ago

@520jefferson Do you mean you cannot translate in any direction, or just en->de?

StillKeepTry commented 5 years ago

I have uploaded an en-de model with fixed params at this link. Can you give it a try? Alternatively, you can load the pre-trained or fine-tuned weights at training time and then evaluate (if you train for only 1 step, the result is almost identical to the fine-tuned weights).

520jefferson commented 5 years ago

@StillKeepTry Is this the third update of the model? I will try downloading the new model.

StillKeepTry commented 5 years ago

@520jefferson The previous model can be used directly for fine-tuning. The model uploaded above keeps the same weights as the previous model and just adds some params to support translation.
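A rough sketch of what "same weights, extra params" could look like, for anyone who wants to patch a checkpoint locally. The checkpoint layout (a dict with `"model"` and `"params"` keys), the helper name `add_params`, and the value `0` are all assumptions for illustration; they are not taken from the MASS code or from the uploaded model. Plain dicts stand in for what would normally be read with `torch.load` and written with `torch.save`.

```python
def add_params(checkpoint, extra_params):
    """Return a copy of the checkpoint with missing params filled in.

    The weights under "model" are shared with the original, not copied or
    changed; params that already exist are left as they are.
    """
    patched = dict(checkpoint)                  # shallow copy of the top level
    params = dict(checkpoint.get("params", {}))
    for name, value in extra_params.items():
        params.setdefault(name, value)          # never overwrite a saved value
    patched["params"] = params
    return patched


# Toy checkpoint: the layout and the numbers are placeholders.
old = {"model": {"layer.weight": [0.1, 0.2]}, "params": {"emb_dim": 1024}}
new = add_params(old, {"bos_index": 0})         # 0 is a placeholder value

print(new["model"] is old["model"])             # True
print(sorted(new["params"]))                    # ['bos_index', 'emb_dim']
```

The correct values have to come from the dictionary the model was trained with, so a patch like this is only safe when those values are known.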

520jefferson commented 5 years ago

@StillKeepTry Same problem as https://github.com/microsoft/MASS/issues/49. I tried with the new model and codes, but I get the same error.