AaranWang closed this issue 5 months ago
Hi,
If you have sufficient mutation data, you can fine-tune Saprot on it and make predictions with the regression output head (that is, not in a zero-shot manner). The base model can also predict mutational effects zero-shot, which means you don't have to fine-tune Saprot at all.
We highly recommend using SaprotHub to predict mutational effects with just a few clicks, see here.
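For readers unfamiliar with the term, zero-shot scoring with a masked language model is commonly done by comparing the log-probability of the mutant residue with that of the wild-type residue at the mutated position. The sketch below shows only that arithmetic. The function name `zero_shot_score` and the toy probabilities are hypothetical, the per-position probabilities are assumed to have been obtained from a model already, and Saprot's own script works on structure-aware tokens and may differ in detail.

```python
import math

def zero_shot_score(mutation, wt_seq, probs):
    """Log-odds score for a single substitution such as 'A2G' (1-indexed).

    probs is a list with one dict per sequence position, mapping each
    amino acid to the model's predicted probability at that position.
    This input format is an assumption made for illustration.
    """
    wt, pos, mt = mutation[0], int(mutation[1:-1]), mutation[-1]
    if wt_seq[pos - 1] != wt:
        raise ValueError(f"{mutation}: expected {wt} at position {pos}, "
                         f"found {wt_seq[pos - 1]}")
    p = probs[pos - 1]
    # Positive score: the model prefers the mutant residue over the wild type.
    return math.log(p[mt]) - math.log(p[wt])

# Toy probabilities for a 2-residue sequence (made up, not model output).
toy_probs = [
    {"M": 0.9, "A": 0.1},
    {"A": 0.5, "G": 0.25, "L": 0.25},
]
score = zero_shot_score("A2G", "MA", toy_probs)
```

A negative score here means the model considers the mutant residue less likely than the wild type at that position.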
How can I use the regression output head to predict mutations? Is there a tutorial or reference available? Thank you.
Coding a complete pipeline might be complicated: you have to process your own data, fine-tune the model, and run inference. SaprotHub also lets you fine-tune Saprot on your private data without any machine learning background, and we have provided detailed step-by-step instructions that take only a few clicks, see here.
Thank you, I'll try.
I understand the complexity of fully fine-tuning the model on my own data. I'll try SaprotHub later, but I believe that learning and gaining experience in fine-tuning this model is essential for me to enter this field. Could you give me some advice on where to begin? Thank you very much.
Sure! We have provided a simple example of fine-tuning Saprot on the Thermostability task, see here. If you want to fine-tune Saprot on your own data, please first convert your data into the LMDB format (see the LMDB files for Thermostability as a reference). Then you can modify the config file to replace the dataset path with your data path.
Finally, you can run the script to start training.
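The conversion step above could be sketched roughly as follows for mutation data with ProteinGym-style mutant strings (for example `A2G` or `M1L:K4R`). This is a minimal sketch under assumptions, not the repository's own conversion code: the record keys `seq` and `fitness`, the `length` entry, and the helper names are hypothetical and should be checked against the reference Thermostability LMDB files. It also writes plain amino-acid sequences, so any conversion to the sequence format Saprot expects is not shown. Writing requires the third-party `lmdb` package.

```python
import json

def apply_mutations(wt_seq, mutant):
    """Apply colon-separated substitutions such as 'M1L:K4R' (1-indexed)."""
    seq = list(wt_seq)
    for m in mutant.split(":"):
        wt, pos, mt = m[0], int(m[1:-1]), m[-1]
        if seq[pos - 1] != wt:
            raise ValueError(f"{m}: expected {wt} at position {pos}, "
                             f"found {seq[pos - 1]}")
        seq[pos - 1] = mt
    return "".join(seq)

def build_records(wt_seq, rows):
    """Turn (mutant, score) rows into key -> JSON string records.

    The schema ('seq', 'fitness', plus a 'length' entry) is an assumption;
    compare it with the reference LMDB files before training.
    """
    records = {}
    for i, (mutant, score) in enumerate(rows):
        entry = {"seq": apply_mutations(wt_seq, mutant), "fitness": float(score)}
        records[str(i)] = json.dumps(entry)
    records["length"] = str(len(rows))
    return records

def write_lmdb(records, path, map_size=1 << 30):
    """Write the records to an LMDB directory at `path`."""
    import lmdb  # third-party: pip install lmdb

    env = lmdb.open(path, map_size=map_size)
    with env.begin(write=True) as txn:
        for key, value in records.items():
            txn.put(key.encode(), value.encode())
    env.close()

# Toy wild-type sequence and scores (made up for illustration).
records = build_records("MAGK", [("A2G", 1.5), ("M1L:K4R", -0.2)])
```

Calling `write_lmdb(records, "my_dataset/train")` (a hypothetical path) would then produce a directory that the config file's dataset path can point to, once for each of the training, validation, and test splits.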
Thank you for your kind reply. My data is mutation data, similar to the ProteinGym format, and differs from thermostability data. Is there any additional processing I need to do? Or can I add you on WeChat? I might need to bother you with similar questions in the future. Thanks.
No problem. You can email me your WeChat account and I'll add you on WeChat for further discussion.
Thank you very much.
Should I use the mutational effect prediction script directly, or should I first fine-tune Saprot on my own mutation data and then use the fine-tuned version? Thank you.