facebookresearch / esm

Evolutionary Scale Modeling (esm): Pretrained language models for proteins
MIT License
3.26k stars 643 forks

Sharing models through Hugging Face #284

Closed by osanseviero 2 years ago

osanseviero commented 2 years ago

Hi there! Congrats on the launch!

I see you currently distribute your model checkpoints through a link to an FB server (https://dl.fbaipublicfiles.com/fair-esm/models/esm2_t48_15B_UR50D.pt). Would you be interested in sharing the models on the Hugging Face Hub? There's an existing Facebook org with over 300 models where these could be hosted.

The Hub offers free hosting of over 100K models, and it would make your work more accessible and visible to the rest of the ML community. Creating the repos and adding new models should be relatively straightforward if you've used Git before. This is a step-by-step guide explaining the process in case you're interested. Please let us know if you would be interested, and if you have any questions, we're happy to assist.

Happy to hear your thoughts!

cc @NimaBoscarino

tomsercu commented 2 years ago

We're now working with the Hugging Face team to get the ESM-1x and ESM-2 models into 🤗 transformers. I guess they'll then be on the Hub automatically? For ESMFold, is Transformers the better place, or the Hub?

osanseviero commented 2 years ago

I missed that we had an ongoing collab 🤗 Once this is supported in transformers, it will indeed be on the Hub automatically, and be much better supported, so let's keep going that way 🚀

felixgabler commented 2 years ago

Hey, this would be awesome to have! Thank you so much for taking this on 🚀. Do you think it's possible to give a broad timeline by when the models could be available? 😄

hengck23 commented 2 years ago

If ESMFold can be released soon, I think many Kagglers would be eager to use it for protein stability prediction:

Novozymes Enzyme Stability Prediction: Help identify the thermostable mutations in enzymes https://www.kaggle.com/competitions/novozymes-enzyme-stability-prediction

tomsercu commented 2 years ago

Very happy to share that - thanks to the amazing work of @Rocketknight1 and 🤗 team - this is now supported:

Classification tasks with proteins, just like BERT: https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/protein_language_modeling.ipynb
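
For readers who want a feel for the approach before opening the notebook, here is a minimal sketch of sequence classification with an ESM-2 checkpoint loaded through 🤗 transformers. It is not taken from the notebook. The checkpoint name `facebook/esm2_t6_8M_UR50D`, the two-label setup, the example sequence, and the helper names `clean_sequence` and `classify` are all assumptions made for illustration. The classification head is randomly initialized until the model is fine-tuned, so the predicted labels are meaningless before training.

```python
from typing import List

# Assumed checkpoint ID: a small ESM-2 model under the facebook org on the Hub.
CHECKPOINT = "facebook/esm2_t6_8M_UR50D"


def clean_sequence(seq: str) -> str:
    """Hypothetical helper: drop all whitespace and upper-case a protein sequence."""
    return "".join(seq.split()).upper()


def classify(sequences: List[str], checkpoint: str = CHECKPOINT,
             num_labels: int = 2) -> List[int]:
    """Return the highest-scoring label index for each sequence.

    The head is untrained here; fine-tune before trusting the output.
    """
    # Imported lazily so this module can be imported without torch/transformers.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=num_labels
    )
    model.eval()

    # Pad to the longest sequence in the batch, as with any BERT-style model.
    batch = tokenizer(
        [clean_sequence(s) for s in sequences],
        padding=True,
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**batch).logits  # shape: (batch_size, num_labels)
    return logits.argmax(dim=-1).tolist()


if __name__ == "__main__":
    # Arbitrary example sequence; downloads the checkpoint on first run.
    print(classify(["MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"]))
```

The heavy imports live inside `classify` so that the file can be imported, and the helper reused, on a machine without a deep learning stack installed.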

Fold proteins in Colab or your local GPU and export PDB files: https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/protein_folding.ipynb
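
As a rough sketch of what folding a sequence and exporting a PDB file can look like outside the notebook, the snippet below uses the ESMFold port in 🤗 transformers. The checkpoint ID `facebook/esmfold_v1`, the use of `EsmForProteinFolding.infer_pdb`, the example sequence, and the helper names `fold_to_pdb` and `save_pdb` are assumptions for illustration rather than details from this thread. The checkpoint is several gigabytes, so a GPU with ample memory is advisable.

```python
from pathlib import Path

# Assumed Hub ID for the ESMFold checkpoint.
ESMFOLD_CHECKPOINT = "facebook/esmfold_v1"


def fold_to_pdb(sequence: str, checkpoint: str = ESMFOLD_CHECKPOINT) -> str:
    """Predict a structure for one sequence and return it as PDB-format text."""
    # Imported lazily: torch and transformers are only needed when folding.
    import torch
    from transformers import EsmForProteinFolding

    model = EsmForProteinFolding.from_pretrained(checkpoint)
    model.eval()
    if torch.cuda.is_available():
        model = model.cuda()  # fall back to CPU (slow) when no GPU is present

    with torch.no_grad():
        # infer_pdb takes a single sequence string and returns a PDB string.
        return model.infer_pdb(sequence)


def save_pdb(pdb_text: str, path: str) -> Path:
    """Hypothetical helper: write PDB text to disk and return the file path."""
    out = Path(path)
    out.write_text(pdb_text)
    return out


if __name__ == "__main__":
    # Arbitrary example sequence; downloads the checkpoint on first run.
    pdb = fold_to_pdb("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")
    print(save_pdb(pdb, "prediction.pdb"))
```

The resulting `prediction.pdb` can be opened in any molecular viewer that reads PDB files.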