princeton-nlp / SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
MIT License

When getting embeddings, should we use sentemb_forward with the MLP, or just the [CLS] embedding? #157

Closed CheungZeeCn closed 2 years ago

CheungZeeCn commented 2 years ago

My unsupervised model was trained with pooler type 'cls', and I want to use it to get sentence embeddings.

Looking into the code, I am not sure which way I should use.

Should I load the MLP weights or not?

model.py, around line 232:

def sentemb_forward(
    cls,
    encoder,
    input_ids=None,
    attention_mask=None,
    token_type_ids=None,
    position_ids=None,
    head_mask=None,
    inputs_embeds=None,
    labels=None,
    output_attentions=None,
    output_hidden_states=None,
    return_dict=None,
):

    return_dict = return_dict if return_dict is not None else cls.config.use_return_dict

    outputs = encoder(
        input_ids,
        attention_mask=attention_mask,
        token_type_ids=token_type_ids,
        position_ids=position_ids,
        head_mask=head_mask,
        inputs_embeds=inputs_embeds,
        output_attentions=output_attentions,
        output_hidden_states=True if cls.pooler_type in ['avg_top2', 'avg_first_last'] else False,
        return_dict=True,
    )

    pooler_output = cls.pooler(attention_mask, outputs)
    if cls.pooler_type == "cls" and not cls.model_args.mlp_only_train:
        pooler_output = cls.mlp(pooler_output)
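So with pooler type 'cls', the extra MLP head (a linear layer followed by tanh) is applied to the [CLS] vector at inference time only when mlp_only_train is off. A minimal numeric sketch of that pooling logic, with made-up toy weights (not the trained ones):

```python
import math

def cls_pool(last_hidden_state):
    """'cls' pooling: take the hidden vector of the first ([CLS]) token."""
    return last_hidden_state[0]

def mlp(vec, weight, bias):
    """Toy MLP head: one linear layer followed by tanh, mirroring the
    head applied after 'cls' pooling when mlp_only_train is False."""
    out = []
    for row, b in zip(weight, bias):
        s = sum(w * x for w, x in zip(row, vec)) + b
        out.append(math.tanh(s))
    return out

# Toy 3-token sequence with 2-dim hidden states (illustrative numbers).
hidden = [[0.5, -0.25], [0.1, 0.2], [0.0, 0.3]]
cls_vec = cls_pool(hidden)

# Identity weights and zero bias, for illustration only.
W, b = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]

emb_with_mlp = mlp(cls_vec, W, b)  # what you get when mlp_only_train=False
emb_without_mlp = cls_vec          # raw [CLS], used when mlp_only_train=True
print(emb_with_mlp, emb_without_mlp)
```

Even with identity weights, the tanh makes the two embeddings differ, which is why loading (or not loading) the MLP weights changes the output.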

But in tool.py's SimCSE class:

    inputs = {k: v.to(target_device) for k, v in inputs.items()}
    outputs = self.model(**inputs, return_dict=True)
    if self.pooler == "cls":
        embeddings = outputs.pooler_output

Since the model is initialized with

self.model = AutoModel.from_pretrained(model_name_or_path)

the pooler weights are not loaded, and the MLP weights are not loaded either. We get:

Some weights of BertModel were not initialized from the model checkpoint xxx and are newly initialized: ['bert.pooler.dense.weight', 'bert.pooler.dense.bias']

So I cannot get the same embedding outputs this way.

gaotianyu1350 commented 2 years ago

Hi,

After training, you should use the converter to convert the checkpoint to Hugging Face style. Please read the simcse_to_huggingface.py part of the README.
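For reference, my understanding is that the converter essentially renames the trained MLP head's weights so AutoModel picks them up as the standard pooler, and strips the wrapper's backbone prefix. A conceptual sketch of that key remapping on a plain state dict (the key names follow my reading of the script and may not match it exactly):

```python
def convert_keys(state_dict):
    """Sketch of the checkpoint conversion: rename the trained MLP head's
    keys to "pooler" and strip the backbone prefix so that a plain
    AutoModel.from_pretrained() call loads them. Illustrative only."""
    new_state = {}
    for key, value in state_dict.items():
        # The trained projection head is saved under "mlp.*";
        # Hugging Face's BertModel expects it under "pooler.*".
        key = key.replace("mlp.", "pooler.")
        # Drop the wrapper's backbone prefix if present.
        for prefix in ("bert.", "roberta."):
            if key.startswith(prefix):
                key = key[len(prefix):]
        new_state[key] = value
    return new_state

# Toy state dict standing in for pytorch_model.bin (values elided).
ckpt = {
    "bert.encoder.layer.0.attention.self.query.weight": "...",
    "mlp.dense.weight": "...",
    "mlp.dense.bias": "...",
}
print(convert_keys(ckpt))
```

After a remapping like this, outputs.pooler_output in tool.py goes through the trained head instead of a randomly initialized one, so the embeddings match sentemb_forward.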

CheungZeeCn commented 2 years ago

> Hi,
>
> After training, you should use the converter to convert the checkpoint to Hugging Face style. Please read the simcse_to_huggingface.py part of the README.

Thanks~