mheinzinger / ProstT5

Bilingual Language Model for Protein Sequence and Structure
MIT License

Sequence conservation model described in the paper #25

Open rakeshr10 opened 1 week ago

rakeshr10 commented 1 week ago

Hi @mheinzinger, how do I get access to the 'Supervised learning: per-residue conservation' model (using ConSurf10k) described in the paper?

mheinzinger commented 1 week ago

Hi, I have not uploaded the checkpoint for this specific model yet, but I can do so if needed. I mostly trained it to get a more diverse perspective on the information accessible from the embeddings. If you are really interested in conservation prediction, I would simply recommend using the following: https://github.com/Rostlab/VESPA?tab=readme-ov-file#step-2-conservation-prediction This is conservation prediction based on our ProtT5 model (which performs better than ProstT5 for this task), which we have been using successfully for SAV effect prediction.

rakeshr10 commented 1 week ago

Thanks. Let me try the tool you provided and see if it works.