westlake-repl / SaProt

[ICLR'24 spotlight] Saprot: Protein Language Model with Structural Alphabet
MIT License

Some weights of EsmModel were not initialized from the model checkpoint #12

Closed: toooooodo closed this issue 9 months ago

toooooodo commented 9 months ago

Hi, thank you very much for sharing this great work.

After loading the pretrained checkpoint, I get the following warning:

Some weights of EsmModel were not initialized from the model checkpoint at SaProt_650M_PDB and are newly initialized: ['esm.embeddings.position_embeddings.weight', 'esm.contact_head.regression.bias', 'esm.pooler.dense.bias', 'esm.contact_head.regression.weight', 'esm.pooler.dense.weight']

My code is:

from transformers import EsmTokenizer, EsmModel
tokenizer = EsmTokenizer.from_pretrained("SaProt_650M_PDB")
model = EsmModel.from_pretrained("SaProt_650M_PDB").cuda()
model.eval()

I want to use the pre-trained model to obtain protein representations directly, without fine-tuning, but I'm not sure whether the absence of these weights affects the quality of the representations, e.g., the missing position embedding weights.

LTEnjoy commented 9 months ago

Hi, thank you for your interest in our work!

It is normal for some of these weights to be missing when you initialize SaProt. The position embedding weights are unused because SaProt adopts Rotary Position Embedding, and the contact head was not used when we pre-trained SaProt.

Besides, esm.pooler.dense.weight is not supposed to be initialized, as we follow the same strategy as ESM-2 and do not use the pooling layer. You can initialize your model without the pooler by:

model = EsmModel.from_pretrained("SaProt_650M_PDB", add_pooling_layer=False).cuda()
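For reference, here is a minimal sketch of extracting a protein-level embedding this way. The structure-aware sequence below is a hypothetical placeholder (real inputs interleave amino-acid and Foldseek structure tokens; see the repo README for how to build them), and mean-pooling is just one common way to reduce per-residue states to a single vector:

import torch
from transformers import EsmTokenizer, EsmModel

tokenizer = EsmTokenizer.from_pretrained("SaProt_650M_PDB")
model = EsmModel.from_pretrained("SaProt_650M_PDB", add_pooling_layer=False).cuda()
model.eval()

# Hypothetical placeholder sequence: amino-acid letters paired with "#"
# where the structure token is unknown/masked.
sa_seq = "M#E#V#Q#L#"

inputs = tokenizer(sa_seq, return_tensors="pt")
inputs = {k: v.cuda() for k, v in inputs.items()}

with torch.no_grad():
    out = model(**inputs)

# Mean-pool the per-residue hidden states into one protein-level vector.
protein_repr = out.last_hidden_state.mean(dim=1)  # shape: (1, hidden_size)

Per-residue embeddings are also available directly in out.last_hidden_state if you prefer not to pool.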

I hope this resolves your problem!

toooooodo commented 9 months ago

Thanks for your kind and prompt reply!