Hi,
What do you want to do exactly? I guess it would be possible but not straightforward.
Best, Louis
So basically I have a new dataset for headline simplification and I would like to take the pretrained model and fine-tune it on the new dataset.
Ok, then I guess the way to do it would be to use the `train.py` script but modify the code so that fairseq loads a pretrained model instead of initializing a random seq2seq.
The `train.py` script calls `fairseq_train_and_evaluate`, which in turn calls `fairseq_train`.
In the `fairseq_train` function you can see that the parameters for the `fairseq-train` command are hardcoded in the function.
A quick hack would be to hardcode a new CLI parameter to load the pretrained model: it's `--restore-file` in the latest fairseq version, but it might be different for the fairseq version needed for ACCESS (check the output of `fairseq-train --help`).
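To make that concrete, here is a minimal sketch (not the actual ACCESS code; the function name `build_train_args` and the paths are just placeholders) of how the hardcoded argument list could be extended with the extra flag:

```python
# Minimal sketch: extend the hardcoded fairseq-train arguments with a flag that
# restores a pretrained checkpoint. Function name and paths are placeholders.
def build_train_args(preprocessed_dir, pretrained_checkpoint=None):
    args = [
        preprocessed_dir,          # data directory produced by fairseq-preprocess
        '--arch', 'transformer',   # stand-in for the parameters ACCESS hardcodes
    ]
    if pretrained_checkpoint is not None:
        # --restore-file is the flag in recent fairseq releases; double-check it
        # against `fairseq-train --help` for the fairseq version used by ACCESS.
        args += ['--restore-file', str(pretrained_checkpoint)]
    return args


# Example: start fine-tuning from the downloaded pretrained ACCESS checkpoint.
train_args = build_train_args('resources/datasets/my_dataset/fairseq_preprocessed',
                              pretrained_checkpoint='checkpoints/checkpoint_best.pt')
```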
You would also need to create a new dataset that you provide as input to `fairseq_train_and_evaluate` in `resources/datasets/`; you can look at `resources/datasets/wikilarge` for the format.
Thank you very much! I'll have a try immediately
Hi, I have the dataset in the correct format and I managed to set up the restore of the model, but now I'm getting this error:
```
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/fairseq/utils.py", line 76, in load_model_state
    model.load_state_dict(state['model'], strict=True)
  File "/usr/local/lib/python3.7/site-packages/fairseq/models/fairseq_model.py", line 66, in load_state_dict
    super().load_state_dict(state_dict, strict)
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 845, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for TransformerModel:
    size mismatch for encoder.embed_tokens.weight: copying a param with shape torch.Size([10144, 512]) from checkpoint, the shape in current model is torch.Size([1296, 512]).
    size mismatch for decoder.embed_out: copying a param with shape torch.Size([10048, 512]) from checkpoint, the shape in current model is torch.Size([1168, 512]).
    size mismatch for decoder.embed_tokens.weight: copying a param with shape torch.Size([10048, 512]) from checkpoint, the shape in current model is torch.Size([1168, 512]).

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "scripts/train.py", line 50, in <module>
```
Thanks in advance, Cece
Hi again,
That's a good point, indeed you would need to provide the dictionary that was used for the original model to preprocess the data.
First you need to locate the `dict.complex.txt` and `dict.simple.txt` files that were used with the pretrained model (downloadable here).
Then you will need to provide these paths as arguments to `fairseq-preprocess`, by hardcoding the arguments in the `fairseq_preprocess` function.
These arguments should be something like `--srcdict` for the complex side and `--tgtdict` for the simple side (see `fairseq-preprocess --help`).
Please tell me if that works.
Best, Louis
Hi, thanks for your quick reply. Now the dictionaries are loaded correctly, but two things happened:
1. `--raw-text` and `--validations-before-sari-early-stopping` are not recognised in the training anymore, so for now I just commented them out.
2. I get the following error:

```
Traceback (most recent call last):
  File "scripts/train.py", line 50, in <module>
    fairseq_train_and_evaluate(**kwargs)
  File "/home/cece/Desktop/access_to_run/access/utils/training.py", line 18, in wrapped_func
    return func(*args, **kwargs)
  File "/home/cece/Desktop/access_to_run/access/utils/training.py", line 29, in wrapped_func
    return func(*args, **kwargs)
  File "/home/cece/Desktop/access_to_run/access/utils/training.py", line 38, in wrapped_func
    result = func(*args, **kwargs)
  File "/home/cece/Desktop/access_to_run/access/utils/training.py", line 50, in wrapped_func
    result = func(*args, **kwargs)
  File "/home/cece/Desktop/access_to_run/access/fairseq/main.py", line 121, in fairseq_train_and_evaluate
    fairseq_train(preprocessed_dir, exp_dir=exp_dir, **train_kwargs)
  File "/home/cece/Desktop/access_to_run/access/fairseq/base.py", line 181, in fairseq_train
    train.main(train_args)
  File "/home/cece/.local/lib/python3.7/site-packages/fairseq_cli/train.py", line 111, in main
    extra_state, epoch_itr = checkpoint_utils.load_checkpoint(args, trainer)
  File "/home/cece/.local/lib/python3.7/site-packages/fairseq/checkpoint_utils.py", line 137, in load_checkpoint
    reset_meters=args.reset_meters,
  File "/home/cece/.local/lib/python3.7/site-packages/fairseq/trainer.py", line 282, in load_checkpoint
    self.optimizer.load_state_dict(last_optim_state, optimizer_overrides)
  File "/home/cece/.local/lib/python3.7/site-packages/fairseq/optim/fairseq_optimizer.py", line 72, in load_state_dict
    self.optimizer.load_state_dict(state_dict)
  File "/home/cece/.local/lib/python3.7/site-packages/torch/optim/optimizer.py", line 116, in load_state_dict
    raise ValueError("loaded state dict contains a parameter group "
ValueError: loaded state dict contains a parameter group that doesn't match the size of optimizer's group
```
And in this second case I can't really wrap my head around it. Do you have any idea?
1. You need to install the specific fork of fairseq for training models with ACCESS, as per the readme: `pip install --force-reinstall fairseq@git+https://github.com/louismartin/fairseq.git@controllable-sentence-simplification`
2. This might be solved by installing the specific fairseq version from point 1 as well.
Thank you, it ended up working perfectly. The only note is that I had to downgrade to `sacrebleu` 1.4.4, because newer versions gave me errors, such as the tokenizer 13a one that you already had an issue for, and 1.4.5 required the IPA dictionary.
Thank you very much for your help :)
Best, Cesare
Hi! Is it possible to perform transfer learning on this model? I couldn't find any straightforward way to do it. It may very well be that I missed something, but any suggestion is appreciated.
Best, Cesare