longzw1997 / Open-GroundingDino

This is a third-party implementation of the paper Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection.

BERT #62

Open Hasanmog opened 6 months ago

Hasanmog commented 6 months ago

Hello, I introduced some parameters into the text encoder (BERT) and trained for some epochs. Everything ran smoothly. But when evaluating with the resulting checkpoint, I get this warning about the new BERT params:

"Some weights of Bert Model were not initialized from model checkpoint at bert-uncased and are newly initialized"

I think it will evaluate using the vanilla BERT (without the newly added params). My guess is that the trained params are loaded after the BERT model itself, which is what causes this warning.

So, what should I change in the config file so that evaluation uses BERT with these newly introduced, trained params?

Thanks in advance!

cc: @aghand0ur

selinakhan commented 5 months ago

I ran into the same issue and made a few modifications to the way I load my fine-tuned BERT model. As far as I can tell, the newly initialized weights are only those of the pooler layer. In my case, I fine-tuned BERT with MLM, which doesn't train the pooler because the task doesn't need it. As a result, the saved model doesn't include those parameters, and re-loading it produces the warning you mentioned.
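For context, this is easy to reproduce outside of GD entirely (a minimal sketch, assuming a standard transformers MLM setup; the save path is just a placeholder):

from transformers import BertForMaskedLM, BertModel

# BertForMaskedLM builds its BERT backbone without a pooler,
# so the saved checkpoint contains no pooler weights at all.
mlm_model = BertForMaskedLM.from_pretrained('bert-base-uncased')
mlm_model.save_pretrained('./bert-mlm-finetuned')  # placeholder path

# Re-loading as a full BertModel re-creates the pooler from scratch,
# which is exactly what triggers the "newly initialized" warning.
model = BertModel.from_pretrained('./bert-mlm-finetuned')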

From what I understand, the GD model also does not use the pooled output; it uses BertModelWarper() simply to access the per-layer outputs more easily. In the GroundingDINO module (models/groundingdino.py, lines 267-269), the code takes the last hidden state and trains another linear layer on top of it, ignoring the pooler.
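Roughly, those lines look like this (paraphrased from memory, so variable names may differ slightly from the repo):

# Text features are built from the last hidden state only;
# feat_map is a linear layer projecting BERT's output dim to the
# transformer's hidden dim. pooler_output is never read.
bert_output = self.bert(**tokenized_for_encoder)
encoded_text = self.feat_map(bert_output['last_hidden_state'])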

I think that when loading a fine-tuned BERT model, only the pooler weights are newly initialized, not the fine-tuned parameters, so in theory the warning shouldn't matter for GD. Still, for my own sanity, and to make sure nothing is left randomly initialized, I implemented this function to load my fine-tuned BERT model and re-initialize the pooler weights with the ones from a pre-trained BERT model.

import torch
from transformers import BertModel


def mlm2bm(text_encoder_type, model_path):
    '''Load a fine-tuned MLM checkpoint as a BertModel and replace the
    (randomly re-initialized) pooler weights with the original pre-trained
    ones. MLM training doesn't touch the pooler, so the checkpoint lacks it.'''

    # Pre-trained BERT, used only as the source of pooler weights
    original_model = BertModel.from_pretrained(text_encoder_type)
    print('Loaded original model')

    # Keep a copy of the original pooler weights
    original_pooler_weight = original_model.pooler.dense.weight.clone()
    original_pooler_bias = original_model.pooler.dense.bias.clone()

    # Load the fine-tuned MLM checkpoint as a BertModel
    # (this is where the "newly initialized" warning fires)
    fine_tuned_model = BertModel.from_pretrained(model_path)

    # Overwrite the freshly initialized pooler with the pre-trained weights
    fine_tuned_model.pooler.dense.weight = torch.nn.Parameter(original_pooler_weight)
    fine_tuned_model.pooler.dense.bias = torch.nn.Parameter(original_pooler_bias)
    print('Replaced pooler weights!')

    return fine_tuned_model
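Then, wherever Open-GroundingDino would normally build its text encoder with BertModel.from_pretrained(text_encoder_type), you can swap in this loader instead (both paths below are placeholders for your own setup):

# Placeholder paths: point model_path at your fine-tuned checkpoint directory
bert = mlm2bm('bert-base-uncased', './bert-mlm-finetuned')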

Hope this helps!