facebookresearch / esm

Evolutionary Scale Modeling (esm): Pretrained language models for proteins
MIT License

Is the current ESM Metagenomic Atlas called version 0 or version 1? #384

Closed tomgoddard closed 1 year ago

tomgoddard commented 1 year ago

I've made the ChimeraX molecular visualization software able to fetch ESM Metagenomic Atlas models. To handle future versions of the database, there is a version setting when fetching a file. I realize there is only one version of the atlas now, but it would be useful to know whether the current version is called "version 0" or "version 1" so I can set the appropriate default value in ChimeraX. I see that ticket #366 gives the location of the atlas.fasta sequence file as

s3://dl.fbaipublicfiles.com/esmatlas/v0/full/atlas.fasta

which has v0 in the path, making me think this version of the atlas may be called version 0.

tomsercu commented 1 year ago

Yes, we call this v0, as it's folded with esmfold_v0, while the model in the API is retrained with more recent data, i.e. esmfold_v1.
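
For anyone writing a similar client, here is a minimal sketch of how a default atlas version might be pinned. It is not ChimeraX or ESM code: the names `ATLAS_DEFAULT_VERSION`, `ATLAS_FOLDING_MODEL`, `ATLAS_FASTA_TEMPLATE`, and `atlas_fasta_path` are hypothetical, and the path template assumes that any future version would keep the same layout as the v0 path quoted above, which nothing in this thread confirms.

```python
# Hypothetical client-side sketch; none of these names come from ChimeraX or ESM.

# The current atlas is called v0 (confirmed in this thread).
ATLAS_DEFAULT_VERSION = "v0"

# Folding model used for each atlas version; only v0 is confirmed here.
ATLAS_FOLDING_MODEL = {"v0": "esmfold_v0"}

# Assumption: future versions keep the layout of the v0 path from ticket #366.
ATLAS_FASTA_TEMPLATE = (
    "s3://dl.fbaipublicfiles.com/esmatlas/{version}/full/atlas.fasta"
)


def atlas_fasta_path(version: str = ATLAS_DEFAULT_VERSION) -> str:
    """Return the atlas.fasta location for a known atlas version."""
    if version not in ATLAS_FOLDING_MODEL:
        # Refuse to guess at paths for versions that are not known to exist.
        raise ValueError(f"unknown atlas version: {version!r}")
    return ATLAS_FASTA_TEMPLATE.format(version=version)


print(atlas_fasta_path())
```

Rejecting unknown versions keeps the client from silently constructing a path for a release that may not exist or may use a different layout.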

tomgoddard commented 1 year ago

Thank you for the fantastic response to my questions. I did not know that the atlas used a different network (v0) from the prediction server (v1). It would be great if the versions used by the atlas and the prediction server were easy to find on the web pages. I see the prediction server version (esmfold_v1) listed on the following page; adding a note there that the current Atlas structures were predicted with esmfold_v0 would be helpful.

https://esmatlas.com/about

This would help biology researchers, who would then know that a new prediction of a structure already in the Atlas may produce a different (better) result, since the prediction server uses a newer version of the network.