facebookresearch / esm

Evolutionary Scale Modeling (esm): Pretrained language models for proteins
MIT License

When using ESM-MSA-1b, is the MSA generated in real-time from the input sequence or is it fixed? #235

Closed: walt676 closed this issue 2 years ago

walt676 commented 2 years ago

Thank you for sharing such outstanding work. I looked at the MSA Transformer example code in the examples folder, and it looks like the MSA is read from a file. For a new input sequence, do I need to generate a new MSA file? If so, where in the repo can I find the code that generates the MSA file from an input sequence?

tomsercu commented 2 years ago

Yes, the MSA needs to be provided. This can be done using standard tools, e.g. HHblits, JackHMMER, MMseqs2, etc.
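
To make the workflow above concrete, here is a minimal sketch, not code from this repo. It assumes HHblits is the chosen tool and that the alignment is written in A3M format. The helper names `hhblits_command` and `parse_a3m`, the file names, the database path, and the defaults for iterations, CPUs, and `max_seqs` are all hypothetical. The first helper only builds the command line (it does not run HHblits); the second reads A3M text and drops insertion columns so every row has the same length as the query.

```python
import string
from typing import List, Tuple

# In A3M, lowercase letters mark insertions relative to the query.
# Dropping them (plus "." and "*") leaves rows of equal length.
_DELETE = str.maketrans("", "", string.ascii_lowercase + ".*")


def hhblits_command(query_fasta: str, database: str, out_a3m: str,
                    iterations: int = 3, cpus: int = 4) -> List[str]:
    """Build (but do not run) an HHblits command that writes an A3M file.

    All paths are placeholders; the database must be downloaded separately.
    """
    return [
        "hhblits",
        "-i", query_fasta,      # input query sequence
        "-d", database,         # sequence database prefix (assumed path)
        "-oa3m", out_a3m,       # write the alignment in A3M format
        "-n", str(iterations),  # number of search iterations
        "-cpu", str(cpus),      # number of CPU threads
    ]


def parse_a3m(text: str, max_seqs: int = 64) -> List[Tuple[str, str]]:
    """Parse A3M text into (label, aligned_sequence) pairs, query first."""
    msa: List[Tuple[str, str]] = []
    label = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if label is not None:
                msa.append((label, "".join(chunks).translate(_DELETE)))
            label, chunks = line[1:], []
        elif label is not None:
            # Lines before the first header are ignored.
            chunks.append(line)
    if label is not None:
        msa.append((label, "".join(chunks).translate(_DELETE)))

    msa = msa[:max_seqs]  # keep the query plus the first max_seqs - 1 hits
    if len({len(seq) for _, seq in msa}) > 1:
        raise ValueError("aligned rows differ in length after removing insertions")
    return msa


if __name__ == "__main__":
    # Command to run in a shell or via subprocess.run(cmd, check=True).
    cmd = hhblits_command("query.fasta", "path/to/database", "query.a3m")
    print(" ".join(cmd))

    example = ">query\nMKTAYIAK\n>hit1\nMK-AYeeIAK\n>hit2\n-KTAY.IAK\n"
    for name, seq in parse_a3m(example):
        print(name, seq)
```

The resulting list of `(label, sequence)` tuples is, as far as I can tell, the shape that the batch converter from `alphabet.get_batch_converter()` expects for an MSA when used with the model loaded by `esm.pretrained.esm_msa1b_t12_100M_UR50S()`; please check the MSA example in the repo before relying on this. JackHMMER or MMseqs2 could be substituted for HHblits, with their output converted to an aligned format first.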