THU-KEG / KEPLER

Source code for TACL paper "KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation".

Continued pretraining from a PyTorch model #9

Closed MichalPitr closed 3 years ago

MichalPitr commented 3 years ago

Hi, I'm curious: do you happen to know if there's a way to take a pre-trained model from the Hugging Face model hub and use it to initialize KEPLER training? I was hoping to initialize KEPLER with a RoBERTa pre-trained on medical data and then run KEPLER pre-training with medical knowledge graphs and medical MLM.

Many thanks, Michal

Bakser commented 3 years ago

Hi, we did not try to convert Hugging Face checkpoints into fairseq checkpoints (the format used in KEPLER pre-training), since all of our KEPLER pre-training is based on the native fairseq framework. For your question, my suggestion is to read this code, which converts fairseq RoBERTa checkpoints into Hugging Face's format; you may be able to invert the conversion in a similar way.
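
We have not tested this ourselves, but a rough sketch of the inverse direction could look something like the following: load a released fairseq RoBERTa checkpoint and overwrite its weights with those of the Hugging Face model. The key names below are my best guess at reversing the Hugging Face conversion script (the fairseq prefix changed between "decoder." and "encoder." across versions), the file paths and the copy_ helper are placeholders, and the mapping only makes sense if the Hugging Face model uses the same GPT-2 BPE vocabulary as the fairseq dictionary, so please check the shapes and key names against your own checkpoints before relying on it.

```python
# Untested sketch: overwrite the weights of a released fairseq RoBERTa
# checkpoint (e.g. roberta.base/model.pt) with those of a Hugging Face
# RoBERTa, so KEPLER pre-training can start from the Hugging Face model.
# Key names are a guess at the inverse of transformers'
# convert_roberta_original_pytorch_checkpoint_to_pytorch.py.
import torch
from transformers import RobertaForMaskedLM

HF_CKPT = "path/to/medical-roberta"             # placeholder Hugging Face model
FAIRSEQ_CKPT = "path/to/roberta.base/model.pt"  # placeholder fairseq checkpoint
OUT_CKPT = "path/to/converted/model.pt"
PREFIX = "decoder"  # older fairseq uses "decoder.", newer uses "encoder."

hf = RobertaForMaskedLM.from_pretrained(HF_CKPT).eval()
hf_sd = hf.state_dict()
ckpt = torch.load(FAIRSEQ_CKPT, map_location="cpu")
fs_sd = ckpt["model"]

def copy_(fs_key, hf_key):
    # Shape check catches vocabulary / size mismatches early.
    assert fs_sd[fs_key].shape == hf_sd[hf_key].shape, (fs_key, hf_key)
    fs_sd[fs_key] = hf_sd[hf_key].clone()

enc = f"{PREFIX}.sentence_encoder"
copy_(f"{enc}.embed_tokens.weight", "roberta.embeddings.word_embeddings.weight")
copy_(f"{enc}.embed_positions.weight", "roberta.embeddings.position_embeddings.weight")
copy_(f"{enc}.emb_layer_norm.weight", "roberta.embeddings.LayerNorm.weight")
copy_(f"{enc}.emb_layer_norm.bias", "roberta.embeddings.LayerNorm.bias")
# fairseq RoBERTa has no token-type embedding; Hugging Face's single
# all-zero token-type row is simply dropped.

for i in range(hf.config.num_hidden_layers):
    fs_l = f"{enc}.layers.{i}"
    hf_l = f"roberta.encoder.layer.{i}"
    pairs = [
        (f"{fs_l}.self_attn.q_proj", f"{hf_l}.attention.self.query"),
        (f"{fs_l}.self_attn.k_proj", f"{hf_l}.attention.self.key"),
        (f"{fs_l}.self_attn.v_proj", f"{hf_l}.attention.self.value"),
        (f"{fs_l}.self_attn.out_proj", f"{hf_l}.attention.output.dense"),
        (f"{fs_l}.self_attn_layer_norm", f"{hf_l}.attention.output.LayerNorm"),
        (f"{fs_l}.fc1", f"{hf_l}.intermediate.dense"),
        (f"{fs_l}.fc2", f"{hf_l}.output.dense"),
        (f"{fs_l}.final_layer_norm", f"{hf_l}.output.LayerNorm"),
    ]
    for fs_mod, hf_mod in pairs:
        copy_(f"{fs_mod}.weight", f"{hf_mod}.weight")
        copy_(f"{fs_mod}.bias", f"{hf_mod}.bias")

# MLM head
copy_(f"{PREFIX}.lm_head.dense.weight", "lm_head.dense.weight")
copy_(f"{PREFIX}.lm_head.dense.bias", "lm_head.dense.bias")
copy_(f"{PREFIX}.lm_head.layer_norm.weight", "lm_head.layer_norm.weight")
copy_(f"{PREFIX}.lm_head.layer_norm.bias", "lm_head.layer_norm.bias")
copy_(f"{PREFIX}.lm_head.weight", "lm_head.decoder.weight")
copy_(f"{PREFIX}.lm_head.bias", "lm_head.bias")

ckpt["model"] = fs_sd
torch.save(ckpt, OUT_CKPT)
```

If the conversion goes through, you should then be able to point the KEPLER pre-training command at the converted checkpoint via fairseq's --restore-file option, but please verify the shape asserts pass and run a quick MLM sanity check before launching a full run.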

Best regards, Xiaozhi

MichalPitr commented 3 years ago

Hi Xiaozhi, yeah, I slightly feared that would be necessary. Thanks for the confirmation! If I get around to doing this, I'll share the code.

Kind regards, Michal