microsoft / protein-sequence-models

calculate stability and melting temperature from protein sequence #2

Open avilella opened 2 years ago

avilella commented 2 years ago

I am reading the bioRxiv preprint, and I understand that the CARP models could be used to predict stability and melting temperature from a query protein sequence. Would it be possible to add a utility script to this codebase that does that?

In Facebook's ESM, for example, there is a likelihood score calculator script that does something similar:

esm/examples/inverse_folding/score_log_likelihoods.py ./examples/inverse_folding/data/5YH2.pdb ./examples/inverse_folding/data/5YH2_mutated_seqs.fasta --chain C --outpath output/5YH2_mutated_seqs_scores.csv

Would it be possible to have an equivalent for CARP? Thanks in advance.

yangkky commented 2 years ago

It looks like you want a script for zero-shot mutation effect prediction?

avilella commented 2 years ago

That may be it. Is there a difference between melting temperature (Table 5.4.3) and protein fitness (Table 5.4.2 or Figure 3)? Or are both computed by the same function with the same parameters? It would be great to have an example of each if they differ. Thanks in advance.

yangkky commented 2 years ago

Where are you referencing these tables from?

avilella commented 2 years ago

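The CARP preprint? Maybe I am in the wrong repo... [image] https://user-images.githubusercontent.com/158007/169832780-f9b8a122-fbbb-4c81-8718-8a6451ce6fa2.png [image] https://user-images.githubusercontent.com/158007/169832808-404db5e0-4fa2-41a6-8b6a-324becc214ce.png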

yangkky commented 2 years ago

In Figure 3, those are zero-shot predictions using model pseudolikelihoods.
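For readers unfamiliar with the term: one common way to get a pseudolikelihood from a masked language model is to mask each position in turn and sum the log-probability the model assigns to the true residue. The sketch below illustrates that idea only. `toy_masked_model`, the `#` mask symbol, and the example sequences are hypothetical stand-ins, not CARP and not code from this repo, and the exact scoring behind Figure 3 may differ in detail.

```python
import math
from typing import Callable, Dict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MASK = "#"  # hypothetical mask symbol used only in this sketch

# Any callable mapping (masked sequence, masked position) -> {residue: probability}
MaskedModel = Callable[[str, int], Dict[str, float]]


def toy_masked_model(masked_seq: str, pos: int) -> Dict[str, float]:
    """Stand-in for a real masked language model (NOT CARP).

    It favors residues that already occur in the unmasked context, so its
    numbers carry no biological meaning.
    """
    counts = {aa: 1.0 for aa in AMINO_ACIDS}  # add-one smoothing
    for i, aa in enumerate(masked_seq):
        if i != pos and aa in counts:
            counts[aa] += 1.0
    total = sum(counts.values())
    return {aa: c / total for aa, c in counts.items()}


def pseudo_log_likelihood(seq: str, model: MaskedModel) -> float:
    """Sum over positions of log p(true residue | all other residues)."""
    total = 0.0
    for i, aa in enumerate(seq):
        masked = seq[:i] + MASK + seq[i + 1:]
        total += math.log(model(masked, i)[aa])
    return total


def zero_shot_score(wild_type: str, mutant: str, model: MaskedModel) -> float:
    """Mutant minus wild-type pseudo-log-likelihood (higher = model prefers mutant)."""
    return pseudo_log_likelihood(mutant, model) - pseudo_log_likelihood(wild_type, model)


wild_type = "MKTAYIAKQR"  # made-up sequence, for illustration only
mutant = "MKTAYIAKQA"     # same sequence with the last residue changed to A
score = zero_shot_score(wild_type, mutant, toy_masked_model)
```

No labels are involved anywhere in this calculation, which is what makes it zero-shot. With a real model, `toy_masked_model` would be replaced by a function that runs the network on the masked input and returns its per-residue probabilities at the masked position.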

In Section 5, the model is fine-tuned on labeled training data.
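To make the contrast with the zero-shot case concrete, here is a minimal supervised sketch. It is deliberately simpler than real fine-tuning: it fits only a linear regression head on fixed features, whereas fine-tuning would also update the pretrained model's weights. `toy_embed` (an amino-acid frequency vector) is a hypothetical stand-in for a pretrained encoder, and the sequences and labels are made-up placeholders, not measurements.

```python
from typing import List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
Example = Tuple[List[float], float]  # (feature vector, label)


def toy_embed(seq: str) -> List[float]:
    """Stand-in for a pretrained encoder: amino-acid frequency vector."""
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def predict(weights: List[float], bias: float, features: List[float]) -> float:
    """Linear head: dot product of weights and features, plus bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def mse(weights: List[float], bias: float, data: List[Example]) -> float:
    """Mean squared error of the head over a labeled dataset."""
    return sum((predict(weights, bias, x) - y) ** 2 for x, y in data) / len(data)


def fit_head(data: List[Example], lr: float = 0.1, epochs: int = 500):
    """Fit the linear head by full-batch gradient descent on the MSE."""
    dim = len(data[0][0])
    weights = [0.0] * dim
    bias = 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in data:
            err = predict(weights, bias, x) - y
            for j in range(dim):
                grad_w[j] += 2 * err * x[j] / n
            grad_b += 2 * err / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


# Made-up (sequence, label) pairs; the labels are placeholders, not data.
labeled = [("MKTAYIAKQR", 1.0), ("MKTAYIAKQA", 0.5), ("MKTGYIAKQR", 0.0)]
data = [(toy_embed(s), y) for s, y in labeled]
weights, bias = fit_head(data)
```

The key difference from the zero-shot score is that this path needs labeled examples for the property of interest (for instance melting temperature), so the two would not normally be produced by the same function call.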
