Open danielparton opened 9 years ago
Awesome summary! Thanks for putting this together.
Let's at least implement the "no disulfide bonds" option in addition to your current automated method. The manual method would also be nice, and we can do that now if not too difficult.
Let's definitely add a section to the paper on this, modeled after your "tentative thoughts" comments.
This is an analysis of disulfide bonds in protein kinases, to help decide how to handle them in Ensembler. It is a long message; for a quick summary, see the "My tentative conclusions" section.
I have analyzed the protein kinase templates we are using for the kinome project, which correspond to the 4433 protein kinase domains annotated in UniProt from any species. Of these domains, 41 have an SSBOND annotation in the original PDB structure that involves a residue within the span of the kinase domain. This analysis excludes disulfide bonds formed between chains, and those in which both cysteines have the same residue index, since these represent disulfide bonds formed between crystal subunits.
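The filtering described above can be sketched roughly as follows. This is a minimal illustration, not the code from the linked gist: the names `Disulfide`, `parse_ssbond` and `keep_disulfide` are hypothetical, the chain ID and the domain span in the example are placeholders, and reading "involves a residue within the span" as "at least one of the two cysteines" is an assumption. The column positions follow the fixed-width PDB SSBOND record layout.

```python
from typing import NamedTuple, Optional


class Disulfide(NamedTuple):
    chain1: str
    resid1: int
    chain2: str
    resid2: int


def parse_ssbond(line: str) -> Optional[Disulfide]:
    """Parse a fixed-column PDB SSBOND record; return None for any other record."""
    if not line.startswith("SSBOND"):
        return None
    # PDB columns (1-based): chain IDs at 16 and 30, residue numbers at 18-21 and 32-35.
    return Disulfide(line[15], int(line[17:21]), line[29], int(line[31:35]))


def keep_disulfide(bond: Disulfide, span_start: int, span_end: int) -> bool:
    """Apply the three filters described above to one parsed SSBOND record."""
    if bond.chain1 != bond.chain2:
        return False  # formed between chains
    if bond.resid1 == bond.resid2:
        return False  # same residue index: a bond between crystal subunits
    # Assumption: "involves a residue within the span" means at least one of the two.
    return any(span_start <= r <= span_end for r in (bond.resid1, bond.resid2))


# Illustrative record for the FAK1 C456-C459 bond; chain A and the span 422-680 are placeholders.
example = "SSBOND   1 CYS A  456    CYS A  459"
bond = parse_ssbond(example)
print(bond, keep_disulfide(bond, 422, 680))
```

Insertion codes and symmetry operators are ignored here for brevity; a stricter version could also compare the symmetry-operator columns of the record.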
The 41 templates come from 12 different protein kinases, 11 of which are human, and 2 of which are human TKs. These are the UniProt entry names:
Analysis of a few example template structures with disulfide bonds
More detail on FAK1
(Note: FAK1 is also known as PTK2)
There are 16 different FAK1 PDB entries, all of which have a disulfide bond in the same position. However, the following paragraph is the only discussion of it I can find in the literature.
"A striking feature of the FAK kinase is the presence of an intramolecular disulphide bond in the N-terminal lobe of the kinase Figure 2 and Figure 4. The disulphide spans a short turn of four residues between C456 and C459. This turn closely resembles a type-I β turn, with residues n and n + 3 covalently linked through the disulphide bond. This unusual feature is located on the surface of the N-terminal lobe adjacent to the αC helix (Figure 2B). Cysteines 456 and 459 are conserved in vertebrate FAK sequences, suggesting a possible role in kinase function (Figure 4B). The proximity of the disulphide to the αC helix suggests also that its role may lie in fine-tuning the orientation of this helix. Disulfide bonds are extremely rare in cytoplasmic proteins because of the reducing nature of the intracellular environment. It is unknown whether the C456-C459 bond formed in the FAK crystal structure is maintained in vivo, especially since this feature is located close to the protein surface. The corresponding structural region in other kinases has been implicated in the regulation of kinase activity through protein-protein interactions [50], and it is conceivable that the disulphide bond might be protected upon binding of regulatory proteins."
My tentative conclusions
Disulfide bonds in cytosolic proteins are thought to be extremely rare due to the reducing environment. Furthermore, only a small proportion of protein kinase domain PDB structures have SSBOND annotations (41 of the 4433 domains analyzed here), and even these bonds may not be present physiologically, since crystal conditions ≠ cytosol. Therefore, I would suggest that the general approach for most cytosolic proteins should be to ignore disulfide bonds, and that we take this approach for the current TK project, including FAK1.
As @jchodera was saying in #21, it would probably also be helpful to implement the following functionality:
Also, for reference, @jchodera mentioned this paper which indicates that kinase disulfide bonds may play some role under oxidative conditions: http://www.ncbi.nlm.nih.gov/m/pubmed/21078955/
Data for the 41 disulfide bonds
For each entry, the first line is the template ID and the second is the SSBOND line from the PDB file. (No template has more than one disulfide bond within its span.)
Analysis code: https://gist.github.com/danielparton/c825dec98c360b428de7