SanghunYun / UDA_pytorch

UDA (Unsupervised Data Augmentation) implemented in PyTorch
Apache License 2.0

Loading Fine-Tuned PyTorch BERT model. #4

Open nvshrao opened 4 years ago

nvshrao commented 4 years ago

Hi, congratulations on the amazing work. I would like to apply UDA to a BERT model whose language model has been fine-tuned with HuggingFace Transformers (https://github.com/huggingface/transformers/blob/master/examples/run_lm_finetuning.py).

The aforementioned code saves the fine-tuned PyTorch BERT model as a .bin file. To use it, I changed pretrain_file.endswith('.pt') to pretrain_file.endswith('.bin') on line 175 of train.py, and in config/uda.json set "pretrain_file" and "vocab" to my model's paths.
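For reference, a minimal sketch of that one-line change in train.py; the surrounding loading logic is elided and only the extension check differs:

```python
# train.py, around line 175: accept HuggingFace's .bin checkpoints
# instead of the repo's .pt files (was: pretrain_file.endswith('.pt'))
if pretrain_file.endswith('.bin'):
    ...  # existing checkpoint-loading logic
```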

However, this leads to a mismatch in the state_dict key names: the expected keys are "transformer.embed.tok_embed.weight", "transformer.embed.pos_embed.weight", ..., while the actual keys are "bert.embeddings.word_embeddings.weight", "bert.embeddings.position_embeddings.weight", and so on. One solution is probably to manually map the actual keys to the expected ones; would you recommend any other solution?
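A minimal sketch of that manual mapping, assuming only the two key pairs quoted above; KEY_MAP and remap_state_dict are hypothetical names, and the remaining encoder-layer pairs would have to be filled in by comparing the checkpoint's keys against the repo model's state_dict keys:

```python
import torch

# Hypothetical partial map from HuggingFace BERT key names to the names
# this repo's model expects. The two pairs below come from the mismatch
# reported above; the remaining pairs must be filled in by comparing
# checkpoint.keys() with model.state_dict().keys().
KEY_MAP = {
    "bert.embeddings.word_embeddings.weight": "transformer.embed.tok_embed.weight",
    "bert.embeddings.position_embeddings.weight": "transformer.embed.pos_embed.weight",
    # ... remaining encoder-layer keys ...
}

def remap_state_dict(state_dict, key_map):
    """Rename checkpoint keys so they match the repo's module names."""
    return {key_map[k]: v for k, v in state_dict.items() if k in key_map}

checkpoint = torch.load("pytorch_model.bin", map_location="cpu")  # example path
remapped = remap_state_dict(checkpoint, KEY_MAP)
# strict=False reports any keys the partial map doesn't cover
# instead of raising, which helps when completing KEY_MAP:
# missing, unexpected = model.load_state_dict(remapped, strict=False)
```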

SanghunYun commented 4 years ago

Hello. Thank you for your question. I'm sorry I could not reply quickly because of my recent workload.

Your question seems to be about the key mismatch when loading a state_dict from a pretrained model. Unfortunately, I can't think of a better approach than the manual mapping you gave as an example.

I'm sorry I couldn't be of more help.