facebookresearch / esm

Evolutionary Scale Modeling (esm): Pretrained language models for proteins
MIT License

How to use Zero-shot variant prediction for other protein sequences #588

Open geng-lee opened 1 year ago

geng-lee commented 1 year ago

Hi Developer,

I have a wild-type protein sequence and want to run zero-shot variant prediction on it. Are these two inputs required: BLAT_ECOLX_Ranganathan2015.csv and BLAT_ECOLX_1_b0.5.a3m?

If so, how do I generate these two input files for my own protein? Also, what is an appropriate setting for --offset-idx, and how can I generate a heat map of the mutation score matrix?

Thanks! Looking forward to your reply!

Best, Jamie

Amelie-Schreiber commented 1 year ago

I've written new versions of the various mutation scoring functions used in the paper. They still use ESM-2 to compute the scores. Perhaps this notebook will be helpful to you.
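
On the heat map question: whichever scoring function is used, the per-mutation scores can be arranged into a 20 x L grid (one row per amino acid, one column per position) before plotting. A minimal sketch, assuming scores are keyed by mutation strings such as H24C and that offset_idx is the position number of the first residue. The function name `score_matrix` and the example scores are hypothetical and do not come from the notebook or the repository.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # row order of the grid


def score_matrix(scores, seq_len, offset_idx=1):
    """Arrange {'H24C': score, ...} into a 20 x seq_len grid.

    Cells without a score (including wild-type residues) stay NaN.
    offset_idx is assumed to be the position number of the first residue.
    """
    grid = [[math.nan] * seq_len for _ in AMINO_ACIDS]
    for mutant, score in scores.items():
        col = int(mutant[1:-1]) - offset_idx   # 0-based column in the sequence
        row = AMINO_ACIDS.index(mutant[-1])    # row of the mutant residue
        grid[row][col] = score
    return grid


# Made-up scores for a two-residue sequence whose first residue is numbered 24.
example = score_matrix({"H24C": -1.5, "P25A": 0.2}, seq_len=2, offset_idx=24)
```

The resulting grid can then be passed to a plotting function such as matplotlib's `imshow` to draw the heat map.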

arjan-hada commented 9 months ago

This notebook, which I recently published, complements @Amelie-Schreiber's notebook. It covers MSA Transformer, ESM2, ESM1v, and ESM1b for zero-shot fitness prediction: Zero-shot prediction of functional protein sequence variants. More is coming in future updates.

niu6211 commented 6 months ago

Have you solved this problem yet? Also, how can I obtain the BLAT_ECOLX_Ranganathan2015.csv file? Thank you.
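
For a protein without a published deep mutational scan, a table of candidate mutations can be generated from the wild-type sequence alone. Below is a minimal sketch, assuming the prediction script reads mutations from a column named mutant in the form H24C (wild-type residue, position, mutant residue) and that --offset-idx is the position number assigned to the first residue of the sequence, as the BLAT_ECOLX example suggests. The function names and the example sequence are hypothetical.

```python
import csv
import io

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_mutants(wt_seq, offset_idx=1):
    """Yield every single-site substitution as a string like 'H24C'."""
    for i, wt in enumerate(wt_seq):
        for aa in AMINO_ACIDS:
            if aa != wt:  # skip the wild-type residue itself
                yield f"{wt}{i + offset_idx}{aa}"


def write_mutant_csv(wt_seq, handle, offset_idx=1, column="mutant"):
    """Write a one-column CSV of all single mutants to an open text handle."""
    writer = csv.writer(handle)
    writer.writerow([column])  # header; the column name is an assumption
    for mutant in single_mutants(wt_seq, offset_idx):
        writer.writerow([mutant])


# Example with a made-up two-residue sequence numbered from 24.
buffer = io.StringIO()
write_mutant_csv("HP", buffer, offset_idx=24)
rows = buffer.getvalue().splitlines()
```

To write to disk instead, open the target file with `open(path, "w", newline="")` and pass it as `handle`. The MSA file (.a3m) appears to be needed only when scoring with MSA Transformer, so this table plus the wild-type sequence may be enough for the single-sequence models.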