microsoft / DeBERTa

The implementation of DeBERTa
MIT License

Model is not initialized correctly when path to a pretrained model is provided via `pre_trained` #146

Open ThuongTNguyen opened 6 months ago

ThuongTNguyen commented 6 months ago

Description

I use a script similar to `cola.sh` to train and/or evaluate a model for sequence classification. Two parameters can point to model state files: `init_model` and `pre_trained`. When `pre_trained` is provided, I expect the model weights to be loaded from it, while the vocabulary is still loaded based on `init_model` if `init_model` is one of the provided pretrained models. However, the model parameters are actually loaded using `init_model` only. That is because the `pre_trained` flag has no effect in this function, although I expect `pre_trained` to override `init_model`.
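To make the expected precedence concrete, here is a minimal sketch of the behavior I have in mind. It is not the repository's code: `ModelSources` and `resolve_sources` are hypothetical names used only for illustration, and the model name and path in the usage line are placeholders.

```python
from typing import NamedTuple, Optional


class ModelSources(NamedTuple):
    weights: Optional[str]  # where the model parameters should be loaded from
    vocab: Optional[str]    # what the vocabulary should be resolved from


def resolve_sources(init_model: Optional[str],
                    pre_trained: Optional[str]) -> ModelSources:
    """Hypothetical helper illustrating the expected precedence.

    `pre_trained` overrides `init_model` for the weights only; the
    vocabulary is still resolved from `init_model`.
    """
    # Prefer pre_trained when it is set, otherwise fall back to init_model.
    weights = pre_trained if pre_trained else init_model
    return ModelSources(weights=weights, vocab=init_model)


# Placeholder values, not taken from my actual run.
print(resolve_sources("deberta-v3-base", "/path/to/pytorch_model.bin"))
```

With both parameters set, the weights would come from `pre_trained` and the vocabulary from `init_model`; with only `init_model` set, both would come from `init_model`, which matches what happens today.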

Steps to reproduce

Additional information/Environment

My system setup is: