primaryodors / primarydock

PrimaryOdors.org molecular docker.

Utility for examining conservation of residues. #265

Closed electronicsbyjulie closed 1 year ago

electronicsbyjulie commented 1 year ago

Suppose it were important to check whether a given BW number, say 3.49, is always an Asp residue. It would be helpful to have a tool that searches data/sequences_aligned.txt for a given BW number across all protein sequences and reports which amino acids occur at that position and at what frequencies.
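
A minimal sketch of that tally, in Python for illustration. It assumes the residue at the BW position has already been pulled out of data/sequences_aligned.txt into a dict; the parsing step is left out because the file's layout isn't described here, and `residue_breakdown` and `residue_by_protein` are hypothetical names.

```python
from collections import Counter


def residue_breakdown(residue_by_protein):
    """Return (amino acid, count, percent) rows, most common first.

    residue_by_protein is an assumed input: protein ID -> one-letter
    residue found at the BW position of interest.
    """
    counts = Counter(residue_by_protein.values())
    total = sum(counts.values())
    # most_common() sorts by descending count; ties keep insertion order.
    return [(aa, qty, 100.0 * qty / total) for aa, qty in counts.most_common()]


# Made-up data, same spirit as the fictional table in this issue.
example = {"OR4W20": "D", "OR6P66": "D", "OR5J19": "E", "OR3G81": "N"}
for aa, qty, pct in residue_breakdown(example):
    print(f"{aa}  {qty:>4}  {pct:5.1f}")
```

Keeping the parsing separate from the counting means the same function would work whatever the alignment file ends up looking like.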

Even better would be if it could list the proteins that have at least one known agonist and whose residue at that position is not among the most common residues that together cover some threshold, such as 80%, of the list.

I'm thinking of an output format along the lines of:

Results for BW 3.49:

AA  Qty   %     Proteins
-------------------------------------------
D   281   67
E   116   29
N     3    1    OR4W20 OR6P66 OR7x22
Q     1   <1    OR5J19
R     1   <1    OR3G81

(Note the above data are entirely fictional.)

In this hypothetical example, the two most common AAs together account for more than the 80% threshold, so every row after the second lists its "aberrant" proteins.
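
The threshold rule could be sketched roughly like this (Python for illustration only). It assumes the proteins are already grouped by residue and that a set of proteins with at least one known agonist is available; `aberrant_proteins`, `proteins_by_residue`, and `has_agonist` are hypothetical names, and treating "reaches the threshold" as `>=` is also an assumption.

```python
def aberrant_proteins(proteins_by_residue, has_agonist, threshold=0.8):
    """Map each non-consensus residue to its proteins with a known agonist.

    proteins_by_residue: assumed dict of one-letter residue -> protein IDs.
    has_agonist: assumed set of protein IDs with at least one known agonist.
    """
    total = sum(len(proteins) for proteins in proteins_by_residue.values())
    # Rank residues from most to least common.
    ranked = sorted(proteins_by_residue.items(),
                    key=lambda item: len(item[1]), reverse=True)
    covered = 0  # sequences accounted for by the rows seen so far
    aberrant = {}
    for aa, proteins in ranked:
        if covered >= threshold * total:
            # Earlier rows already reach the threshold, so this row is aberrant.
            hits = [p for p in proteins if p in has_agonist]
            if hits:
                aberrant[aa] = hits
        covered += len(proteins)
    return aberrant
```

With the fictional counts above (281 D and 116 E out of 402), D alone falls short of 80%, D plus E exceeds it, and so the N, Q and R rows are the ones that would be reported.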