westlake-repl / SaProt

Saprot: Protein Language Model with Structural Alphabet (AA+3Di)
MIT License
How to interpret mutational effect output? #44

Closed by dc2211 4 months ago

dc2211 commented 4 months ago

Hi everyone!

First of all, thank you so much for the amazing work! I'm excited to use this model for my own research.

I was wondering how to correctly interpret the output of the mutational effect prediction, as shown in the provided example.

Many thanks!

LTEnjoy commented 4 months ago

Hi, thank you for being interested in our work!

Do higher values mean less tolerated mutations?

For the provided example, higher values mean those mutations have better fitness. Specifically, if the value is greater than 0, our model predicts that the mutant will have better fitness than the wild type; if it is less than 0, the model predicts worse fitness than the wild type.
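To make the sign convention concrete, here is a minimal sketch of how such scores could be read and ranked. The mutation names and score values are made up for illustration and are not real SaProt output; `interpret` is a hypothetical helper, not part of the SaProt code base.

```python
# Hypothetical scores for illustration only; NOT real SaProt output.
scores = {"V3A": 1.2, "L10P": -3.4, "G25S": 0.0}


def interpret(score: float) -> str:
    """Map a mutational effect score to a qualitative reading.

    Follows the convention described above: > 0 means the mutant is
    predicted to be fitter than the wild type, < 0 means less fit.
    """
    if score > 0:
        return "predicted fitter than wild type"
    if score < 0:
        return "predicted less fit than wild type"
    return "predicted similar to wild type"


# Rank mutations from most to least favourable according to the score.
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

for mutation, score in ranked:
    print(f"{mutation}: {score:+.2f} -> {interpret(score)}")
```

Since only the sign and relative order are discussed in this thread, the sketch deliberately avoids attaching any meaning to the absolute magnitude of a score.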

Does this also count as zero-shot prediction, and if not, how does it compare to the mutation_zeroshot.py script?

Yes, this is zero-shot prediction. However, the output of the mutation_zeroshot.py script is the Spearman correlation for each mutation dataset. A high value means that the predictions made by SaProt correlate strongly with the experimentally measured fitness, i.e. the predictions are accurate. So that value does not describe the mutational effect itself.
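For readers unfamiliar with the metric, below is a small self-contained sketch of a Spearman correlation between predicted scores and measured fitness. It is a plain-Python illustration of the statistic, not the code used in mutation_zeroshot.py; the function names and the numbers are assumptions made up for this example.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: one predicted score and one measured fitness per mutant.
predicted = [1.2, -3.4, 0.1, -0.8]
measured = [0.9, 0.1, 0.5, 0.3]

rho = spearman(predicted, measured)
print(f"Spearman correlation: {rho:.3f}")
```

Because Spearman correlation only compares rank orderings, it summarises how well a whole set of predictions agrees with experiment; it says nothing about whether any single mutation is beneficial or harmful.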

Hope this resolves your problem :)

dc2211 commented 4 months ago

Thank you so much for the prompt response! It helps a lot :)