Haddox / score_monomeric_designs

Computational pipeline for scoring protein designs using a variety of biophysical metrics

Add new metrics that Gabe suggested #5

Open Haddox opened 5 years ago

Haddox commented 5 years ago

Counts of all possible rotamers

Dipeptide counts in sequence

Counts of all amino acid type pair interactions (serine residue or serine side chain within 4A of arg sidechain for example)

Rosetta score of 2, 3, 4, 5 residue sequences - what is good/bad? normalized to what?

Rosetta scores of small 3D neighborhoods

TERM-based analysis

Counts of amino acids in certain secondary structures, or dipeptides in certain secondary structures

More granular fragment analysis, such as the quality of specific regions of the chain

Haddox commented 5 years ago

In commit 0373aa2, added code that computes amino-acid frequencies over the whole protein, in different secondary structures, and in different layers. It also computes dipeptide frequencies.
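For readers who want a concrete picture of these counts, here is a minimal sketch in plain Python. It is not the code from the commit: the function names, the one-letter sequence string, and the per-residue DSSP string are all assumptions made for illustration. A per-residue layer string (core/boundary/surface) could be passed in place of the DSSP string in the same way.

```python
from collections import Counter


def aa_frequencies(sequence, labels=None, label=None):
    """Return amino-acid frequencies, optionally restricted to residues
    whose per-residue label (e.g. a DSSP character) equals `label`."""
    if label is not None:
        if labels is None or len(labels) != len(sequence):
            raise ValueError("labels must have the same length as sequence")
        residues = [aa for aa, lab in zip(sequence, labels) if lab == label]
    else:
        residues = list(sequence)
    counts = Counter(residues)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()} if total else {}


def dipeptide_frequencies(sequence):
    """Return frequencies of overlapping dipeptides (residues i and i+1)."""
    counts = Counter(sequence[i:i + 2] for i in range(len(sequence) - 1))
    total = sum(counts.values())
    return {dp: n / total for dp, n in counts.items()} if total else {}


# Toy inputs (hypothetical): an 8-residue sequence and a matching DSSP string.
seq = "AAKLEEVK"
ss = "LHHHHEEL"
helix_freqs = aa_frequencies(seq, ss, "H")
dipep_freqs = dipeptide_frequencies(seq)
```

Frequencies here are normalized by the number of residues (or dipeptides) in the selected subset, which is one possible choice of denominator.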

Haddox commented 5 years ago

Counts of all possible rotamers --Brian says there is no explicit function to do this. Could get the list of rotamers and then decide which one each residue is most like based on its torsion angles

Counts of all amino acid type pair interactions (serine residue or serine side chain within 4A of arg sidechain for example) --Alanine scan: watch for other residues whose energy changes. --Or use a neighborhood residue selector and then count manually

rosetta score of 2, 3, 4, 5 residue sequences - what is good/bad? normalized to what? --pose.energies().residue_total_energy(3) for site 3

rosetta scores of small 3d neighborhoods --very hard to do it at the atom level, but could compute energies for all neighbors

TERM-based analysis

counts of amino acids in certain secondary structures, or dipeptides in certain secondary structures

more granular fragment analysis, such as the quality of specific regions of the chain --Brian did this. The most important regions are the edges of loops (e.g., where a helix turns into a loop). Maybe could have different categories. Brian did this for 3mers using the following dssp pattern --Bcov: dssp: xxHHHHHHHHhLLLLLhHHHHHHhLLLeEEEEEeLLLLxx

Haddox commented 5 years ago

In commit 6f8610b, added code for analyzing fragment quality at the level of individual sites (what is the average quality of all 9mer fragments centered upon a given site?) and within different kinds of secondary structures (what is the quality of all fragments that are centered upon a site within a given type of secondary structure?). Will add description of new metrics to README.
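To make the two questions above concrete, here is a rough sketch, not the code from the commit. It assumes fragment quality is available as (start, score) pairs, where `start` is the 1-indexed first residue of a 9-residue window and `score` is some quality value for one fragment picked for that window. The data layout and function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def per_site_fragment_quality(fragment_scores, frag_len=9):
    """Average the scores of all fragments centered on each site.

    A window starting at residue `start` is centered on residue
    `start + frag_len // 2` (1-indexed).
    """
    half = frag_len // 2
    by_site = defaultdict(list)
    for start, score in fragment_scores:
        by_site[start + half].append(score)
    return {site: mean(scores) for site, scores in by_site.items()}


def quality_by_secondary_structure(site_quality, dssp):
    """Average per-site quality over sites sharing a DSSP character."""
    by_ss = defaultdict(list)
    for site, quality in site_quality.items():
        by_ss[dssp[site - 1]].append(quality)
    return {ss: mean(values) for ss, values in by_ss.items()}


# Toy inputs (hypothetical): two fragments for window 1, one for window 2.
fragments = [(1, 1.0), (1, 2.0), (2, 0.5)]
dssp = "LLLLHHHHHHL"
site_quality = per_site_fragment_quality(fragments)
ss_quality = quality_by_secondary_structure(site_quality, dssp)
```

Sites within four residues of either terminus never sit at the center of a 9mer, so they get no value under this scheme.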

@grocklin: Added a few new metrics described briefly above. It looks like fragment quality is computed using 9mers. Did you ever have code to analyze the fragment quality of 3mers?

grocklin commented 5 years ago

I never did analysis of 3mers.

Haddox commented 5 years ago

Add average length of helices / strands / loops? Fraction of internally satisfied backbone hydrogen-bond donors/acceptors?
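The first of these could be computed directly from a DSSP string. The sketch below is one possible approach, assuming a three-state string of H/E/L characters; the function name is made up for illustration. It does not address the hydrogen-bond question.

```python
from itertools import groupby
from statistics import mean


def average_element_lengths(dssp):
    """Return the mean length of consecutive runs of each DSSP character."""
    lengths = {}
    for ss, run in groupby(dssp):
        lengths.setdefault(ss, []).append(len(list(run)))
    return {ss: mean(values) for ss, values in lengths.items()}


# Toy input (hypothetical): two helices, one strand, four loops.
avg = average_element_lengths("LHHHHLLEEEELLHHHHHHL")
```

Terminal loops are counted like any other loop here; whether they should be excluded is a separate design decision.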

Haddox commented 5 years ago

In commit 79f9b02, added code for analyzing per-residue Rosetta energies of fragments in primary sequence.
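As an illustration of the idea rather than the committed code, the sketch below sums per-residue energies over every contiguous window of a given length. It assumes the per-residue totals have already been extracted into a plain list (for example with the `pose.energies().residue_total_energy(i)` call mentioned earlier in this thread); the function name is hypothetical.

```python
def fragment_energies(per_residue_energies, frag_len):
    """Sum per-residue energies over each contiguous window of frag_len
    residues. Entry i corresponds to the window starting at index i."""
    n = len(per_residue_energies)
    if frag_len < 1 or frag_len > n:
        return []
    return [
        sum(per_residue_energies[i:i + frag_len])
        for i in range(n - frag_len + 1)
    ]


# Toy per-residue energies (hypothetical values).
energies = [-1.0, -2.0, 0.5, -3.0, 1.0]
three_mers = fragment_energies(energies, 3)
```

Dividing each sum by `frag_len` would give a per-residue average, which makes windows of different lengths easier to compare.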

Haddox commented 5 years ago

In commit 72275b2, added code to compute energies of 3D neighborhoods around each residue.
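A sketch of what a neighborhood energy could look like, under stated assumptions: each residue is represented by a single coordinate (such as its C-alpha or C-beta atom), per-residue energies are already available as a list, and the 10 Å cutoff is an arbitrary placeholder. None of these choices are taken from the commit.

```python
import math


def neighborhood_energies(coords, per_residue_energies, cutoff=10.0):
    """For each residue, sum the energies of all residues (itself
    included) whose representative atom lies within `cutoff` angstroms."""
    totals = []
    for center in coords:
        total = sum(
            energy
            for other, energy in zip(coords, per_residue_energies)
            if math.dist(center, other) <= cutoff
        )
        totals.append(total)
    return totals


# Toy inputs (hypothetical): three residues along the x axis.
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
energies = [-1.0, -2.0, 0.5]
neighborhoods = neighborhood_energies(coords, energies)
```

This all-against-all loop is quadratic in the number of residues, which is usually acceptable for small monomeric designs.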

Haddox commented 5 years ago

In commit 8e2daa86dc8cbcec217e1e173d91da1e30eb8b8b, added code to compute the number of times each pair of the 20 amino-acid types is in contact in the structure.
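One way such contact counts could be tallied is sketched below. This is an illustration under assumptions, not the committed implementation: each residue is given as a (name, list of atom coordinates) pair, two residues count as in contact when any pair of their atoms lies within the cutoff, and the 4 Å default simply echoes the distance mentioned earlier in this thread.

```python
import math
from collections import Counter
from itertools import combinations


def count_aa_contacts(residues, cutoff=4.0):
    """Count contacts between amino-acid types.

    residues: list of (amino_acid_name, [atom_xyz, ...]) pairs.
    Returns a Counter keyed by alphabetically sorted name pairs.
    """
    counts = Counter()
    for (aa1, atoms1), (aa2, atoms2) in combinations(residues, 2):
        in_contact = any(
            math.dist(a, b) <= cutoff for a in atoms1 for b in atoms2
        )
        if in_contact:
            counts[tuple(sorted((aa1, aa2)))] += 1
    return counts


# Toy structure (hypothetical): one atom per residue for brevity.
toy_residues = [
    ("SER", [(0.0, 0.0, 0.0)]),
    ("ARG", [(3.0, 0.0, 0.0)]),
    ("ALA", [(30.0, 0.0, 0.0)]),
]
contacts = count_aa_contacts(toy_residues)
```

Residues adjacent in sequence are almost always within 4 Å of each other, so a real analysis might skip pairs that are close in primary sequence.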