I use a script similar to cola.sh to train and/or evaluate a model for sequence classification.
There are two possible parameters for model state files: `init_model` and `pre_trained`.
I expect the model weights to be loaded from `pre_trained` when it is provided, while the vocabulary is loaded based on `init_model` if `init_model` is one of the provided pretrained models.
However, the model parameters are actually loaded from `init_model` only: the `pre_trained` flag has no effect in this function, although I expect `pre_trained` to override `init_model`.
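A minimal sketch of the precedence I expect (the helper name `resolve_state_path` is mine for illustration, not a function in the DeBERTa code):

```python
def resolve_state_path(init_model, pre_trained=None):
    """Pick the checkpoint that should supply the model weights.

    Expected precedence: pre_trained overrides init_model whenever it is
    set; init_model then only determines the vocabulary/tokenizer.
    """
    return pre_trained if pre_trained else init_model

# With only init_model given, weights come from the named pretrained model:
print(resolve_state_path("deberta-v3-base"))  # deberta-v3-base
# With pre_trained also given, it should win ("path/to/my/model" is a placeholder):
print(resolve_state_path("deberta-v3-base", "path/to/my/model"))  # path/to/my/model
```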
Steps to reproduce
1. Set `init_model` to `deberta-v3-base`.
2. Set `pre_trained` to `$PATH_TO_MY_MODEL`, e.g. a path to the pretrained mDeBERTa-V3-Base checkpoint.
3. Check the model parameters after loading, e.g. `print(model.deberta.encoder.layer[7].output.dense.weight[:5,:4])` after this line.
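The diagnostic in step 3 can be sketched as follows. This is a standalone illustration, not the actual check: plain nested lists stand in for torch tensors, and the two matrices are fabricated stand-ins for the weights the model ends up with versus the weights stored in the `pre_trained` checkpoint.

```python
def weight_slice(matrix, rows=5, cols=4):
    """Return the top-left rows x cols block, mirroring weight[:5, :4]."""
    return [row[:cols] for row in matrix[:rows]]

# Stand-ins: the weights the model actually holds after loading (taken from
# init_model) versus the weights in the pre_trained checkpoint.
loaded = [[0.01 * (i + j) for j in range(8)] for i in range(8)]
checkpoint = [[0.02 * (i + j) for j in range(8)] for i in range(8)]

# If pre_trained were honored, these slices would match; here they do not,
# which is exactly the symptom described above.
print(weight_slice(loaded) == weight_slice(checkpoint))  # False
```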
Additional information/Environment
My system setup is: