choderalab / ensembler

Automated omics-scale protein modeling and simulation setup.
http://ensembler.readthedocs.io/
GNU General Public License v2.0

Disulfide bonds #22

Open danielparton opened 9 years ago

danielparton commented 9 years ago

This is an analysis of disulfide bonds in protein kinases, intended to help decide how to handle them in Ensembler. It is a long message; for a quick summary, see the "My tentative conclusions" section.

I have analyzed the protein kinase templates we are using for the kinome project. These correspond to the 4433 protein kinase domains annotated in UniProt from any species. Of these domains, 41 have an SSBOND annotation in the original PDB structure that involves a residue within the span of the kinase domain. This analysis excludes disulfide bonds formed between chains, and those between two residues with the same residue index, since both cases represent disulfide bonds formed between crystal subunits.
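The selection criteria above can be sketched in a few lines of Python. This is a minimal illustration and not the code from the gist linked at the end of this comment: the names `SSBond`, `parse_ssbond` and `intradomain_disulfides` are hypothetical, the parser splits on whitespace instead of slicing fixed PDB columns, and it assumes the domain span is given in the same residue numbering as the PDB file.

```python
from typing import Iterable, List, NamedTuple


class SSBond(NamedTuple):
    chain1: str
    resnum1: int
    chain2: str
    resnum2: int


def parse_ssbond(line: str) -> SSBond:
    """Parse one SSBOND record by splitting on whitespace.

    Assumption: no insertion codes and no fused columns. The PDB format is
    column-based, so a production parser should slice fixed columns instead.
    """
    fields = line.split()
    return SSBond(fields[3], int(fields[4]), fields[6], int(fields[7]))


def intradomain_disulfides(
    pdb_lines: Iterable[str], chain_id: str, domain_start: int, domain_end: int
) -> List[SSBond]:
    """Return SSBOND records on `chain_id` that touch the domain span."""
    kept = []
    for line in pdb_lines:
        if not line.startswith("SSBOND"):
            continue
        bond = parse_ssbond(line)
        # Exclude bonds formed between chains, and bonds on other chains.
        if bond.chain1 != bond.chain2 or bond.chain1 != chain_id:
            continue
        # Exclude bonds between two residues with the same residue index.
        if bond.resnum1 == bond.resnum2:
            continue
        # Keep the bond if at least one residue lies within the domain span.
        if any(domain_start <= r <= domain_end for r in (bond.resnum1, bond.resnum2)):
            kept.append(bond)
    return kept
```

Calling `intradomain_disulfides(lines, "A", start, end)` with the lines of a PDB file and a chain's domain boundaries returns the bonds that would count toward the total under these assumptions.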

The 41 templates come from 12 different protein kinases (UniProt entries), 11 of which are human. Of the human ones, 2 are tyrosine kinases (TKs). These are the UniProt entry names:

FAK1_HUMAN (TK)
TIE2_HUMAN (TK)
MKNK2_HUMAN
PKN1_HUMAN
M3K7_HUMAN
MELK_HUMAN
AKT1_HUMAN
SGK1_HUMAN
IKKB_HUMAN
AKT2_HUMAN
KS6A5_HUMAN
FAK1_CHICK

Analysis of a few example template structures with disulfide bonds

(Note: FAK1 is also known as PTK2)

There are 16 different human FAK1 PDB entries (plus 4 chicken FAK1 entries), all of which have a disulfide bond in the same position, between C456 and C459. However, the following paragraph is the only discussion of it I can find in the literature.

"A striking feature of the FAK kinase is the presence of an intramolecular disulphide bond in the N-terminal lobe of the kinase Figure 2 and Figure 4. The disulphide spans a short turn of four residues between C456 and C459. This turn closely resembles a type-I β turn, with residues n and n + 3 covalently linked through the disulphide bond. This unusual feature is located on the surface of the N-terminal lobe adjacent to the αC helix (Figure 2B). Cysteines 456 and 459 are conserved in vertebrate FAK sequences, suggesting a possible role in kinase function (Figure 4B). The proximity of the disulphide to the αC helix suggests also that its role may lie in fine-tuning the orientation of this helix. Disulfide bonds are extremely rare in cytoplasmic proteins because of the reducing nature of the intracellular environment. It is unknown whether the C456-C459 bond formed in the FAK crystal structure is maintained in vivo, especially since this feature is located close to the protein surface. The corresponding structural region in other kinases has been implicated in the regulation of kinase activity through protein-protein interactions [50], and it is conceivable that the disulphide bond might be protected upon binding of regulatory proteins."

My tentative conclusions

Disulfide bonds in cytosolic proteins are thought to be extremely rare due to the reducing intracellular environment. Furthermore, only a small proportion of protein kinase domain PDB structures have SSBOND annotations, and even these bonds may not be present physiologically, since crystal conditions ≠ cytosol. I would therefore suggest that the general approach for most cytosolic proteins should be to ignore disulfide bonds, and that we take this approach for the current TK project, including FAK1.

As @jchodera was saying in #21, it would probably also be helpful to implement the following functionality:

For reference, @jchodera also mentioned this paper, which indicates that kinase disulfide bonds may play some role under oxidative conditions: http://www.ncbi.nlm.nih.gov/m/pubmed/21078955/

Data for the 41 disulfide bonds

For each entry, the first line is the template ID and the second is the SSBOND record from the PDB file. (No template has more than one disulfide bond within its span.)

1: = AKT1_HUMAN_D0_4EJN_A =
SSBOND   2 CYS A  296    CYS A  310                          1555   1555  2.05

2: = AKT2_HUMAN_D0_1MRY_A =
SSBOND   1 CYS A  297    CYS A  311                          1555   1555  2.04

3: = FAK1_CHICK_D0_2JKK_A =
SSBOND   1 CYS A  456    CYS A  459                          1555   1555  2.00

4: = FAK1_CHICK_D0_2JKM_A =
SSBOND   1 CYS A  456    CYS A  459                          1555   1555  2.04

5: = FAK1_CHICK_D0_2JKO_A =
SSBOND   1 CYS A  456    CYS A  459                          1555   1555  2.05

6: = FAK1_CHICK_D0_2JKQ_A =
SSBOND   1 CYS A  456    CYS A  459                          1555   1555  2.03

7: = FAK1_HUMAN_D0_1MP8_A =
SSBOND   1 CYS A  456    CYS A  459                          1555   1555  2.01

8: = FAK1_HUMAN_D0_2ETM_A =
SSBOND   1 CYS A  456    CYS A  459                          1555   1555  2.03

9: = FAK1_HUMAN_D0_2ETM_B =
SSBOND   2 CYS B  456    CYS B  459                          1555   1555  2.03

10: = FAK1_HUMAN_D0_2IJM_B =
SSBOND   1 CYS B  456    CYS B  459                          1555   1555  2.02

11: = FAK1_HUMAN_D0_3BZ3_A =
SSBOND   1 CYS A  456    CYS A  459                          1555   1555  2.03

12: = FAK1_HUMAN_D0_3PXK_A =
SSBOND   1 CYS A  456    CYS A  459                          1555   1555  2.04

13: = FAK1_HUMAN_D0_3PXK_B =
SSBOND   2 CYS B  456    CYS B  459                          1555   1555  2.04

14: = FAK1_HUMAN_D0_4EBV_A =
SSBOND   1 CYS A  459    CYS A  456                          1555   1555  2.03

15: = FAK1_HUMAN_D0_4EBW_A =
SSBOND   1 CYS A  456    CYS A  459                          1555   1555  2.03

16: = FAK1_HUMAN_D0_4GU6_A =
SSBOND   1 CYS A  456    CYS A  459                          1555   1555  2.04

17: = FAK1_HUMAN_D0_4GU6_B =
SSBOND   2 CYS B  456    CYS B  459                          1555   1555  2.04

18: = FAK1_HUMAN_D0_4GU9_A =
SSBOND   1 CYS A  456    CYS A  459                          1555   1555  2.04

19: = FAK1_HUMAN_D0_4GU9_B =
SSBOND   2 CYS B  456    CYS B  459                          1555   1555  2.04

20: = FAK1_HUMAN_D0_4I4E_A =
SSBOND   1 CYS A  456    CYS A  459                          1555   1555  2.03

21: = FAK1_HUMAN_D0_4I4F_A =
SSBOND   1 CYS A  459    CYS A  456                          1555   1555  2.03

22: = FAK1_HUMAN_D0_4K8A_A =
SSBOND   1 CYS A  456    CYS A  459                          1555   1555  2.04

23: = FAK1_HUMAN_D0_4K8A_B =
SSBOND   2 CYS B  456    CYS B  459                          1555   1555  2.05

24: = FAK1_HUMAN_D0_4K9Y_A =
SSBOND   1 CYS A  456    CYS A  459                          1555   1555  2.03

25: = FAK1_HUMAN_D0_4KAB_A =
SSBOND   1 CYS A  456    CYS A  459                          1555   1555  2.04

26: = FAK1_HUMAN_D0_4KAB_B =
SSBOND   2 CYS B  456    CYS B  459                          1555   1555  2.05

27: = FAK1_HUMAN_D0_4KAO_A =
SSBOND   1 CYS A  456    CYS A  459                          1555   1555  2.04

28: = FAK1_HUMAN_D0_4KAO_B =
SSBOND   2 CYS B  456    CYS B  459                          1555   1555  2.04

29: = FAK1_HUMAN_D0_4Q9S_A =
SSBOND   1 CYS A  456    CYS A  459                          1555   1555  2.03

30: = IKKB_HUMAN_D0_4E3C_E =
SSBOND   1 CYS E  114    CYS E  115                          1555   1555  2.05

31: = IKKB_HUMAN_D0_4E3C_F =
SSBOND   2 CYS F  114    CYS F  115                          1555   1555  2.05

32: = KS6A5_HUMAN_D1_3KN5_A =
SSBOND   1 CYS A  631    CYS A  714                          1555   1555  2.05

33: = KS6A5_HUMAN_D1_3KN5_B =
SSBOND   2 CYS B  631    CYS B  714                          1555   1555  2.06

34: = KS6A5_HUMAN_D1_3KN6_A =
SSBOND   1 CYS A  631    CYS A  714                          1555   1555  0.64

35: = KS6A5_HUMAN_D1_3KN6_B =
SSBOND   2 CYS B  631    CYS B  714                          1555   1555  2.06

36: = M3K7_HUMAN_D0_4L52_A =
SSBOND   1 CYS A   96    CYS A  101                          1555   1555  2.71

37: = MELK_HUMAN_D0_4IXP_A =
SSBOND   1 CYS A  154    CYS A  168                          1555   1555  2.03

38: = MKNK2_HUMAN_D0_2HW7_A =
SSBOND   1 CYS A  311    CYS A  314                          1555   1555  2.06

39: = PKN1_HUMAN_D0_4OTH_A =
SSBOND   1 CYS A  668    CYS A  768                          1555   1555  2.06

40: = SGK1_HUMAN_D0_3HDM_A =
SSBOND   1 CYS A  193    CYS A  258                          1555  12555  2.75

41: = TIE2_HUMAN_D0_2WQB_A =
SSBOND   1 CYS A 1040    CYS A 1118                          1555   1555  2.01
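A few of the records above stand out on inspection, for example a listed length of 0.64 Å, lengths above 2.7 Å, and one record whose two symmetry operators differ. A rough way to flag such records automatically is sketched below. The function name `flag_ssbond` and the 1.8 to 2.3 Å window are assumptions chosen for illustration (typical S-S bond lengths are close to 2.05 Å), and the whitespace-based parsing assumes records without insertion codes.

```python
from typing import List


def flag_ssbond(line: str, min_len: float = 1.8, max_len: float = 2.3) -> List[str]:
    """Return a list of reasons why an SSBOND record looks unusual.

    Assumed token layout after whitespace splitting:
    SSBOND serial CYS chain1 resnum1 CYS chain2 resnum2 sym1 sym2 length
    """
    fields = line.split()
    sym1, sym2, length = fields[8], fields[9], float(fields[10])
    flags = []
    # Different symmetry operators mean the partner residue sits in a
    # symmetry-related copy, i.e. the bond spans crystal subunits.
    if sym1 != sym2:
        flags.append("symmetry operators differ")
    # The length window is a hypothetical tolerance, not a standard cutoff.
    if not (min_len <= length <= max_len):
        flags.append("unusual length")
    return flags
```

Records that come back with a non-empty list could then be inspected by hand before deciding whether the bond should be kept in a model.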

Analysis code: https://gist.github.com/danielparton/c825dec98c360b428de7

jchodera commented 9 years ago

Awesome summary! Thanks for putting this together.

Let's at least implement the "no disulfide bonds" option in addition to your current automated method. The manual method would also be nice, and we can do that now if not too difficult.
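One way the three options could fit together is sketched below. This is purely illustrative and not Ensembler's actual interface: `DisulfideMode` and `select_disulfides` are hypothetical names, and the sketch assumes that the "automated" method amounts to keeping whatever bonds the template annotates.

```python
from enum import Enum
from typing import List, Optional, Sequence, Tuple

# A bond is a pair of residue indices, e.g. (456, 459).
Bond = Tuple[int, int]


class DisulfideMode(Enum):
    NONE = "none"            # model without any disulfide bonds
    AUTOMATED = "automated"  # assumed: keep the bonds annotated in the template
    MANUAL = "manual"        # use only user-specified bonds


def select_disulfides(
    mode: DisulfideMode,
    template_bonds: Sequence[Bond],
    manual_bonds: Optional[Sequence[Bond]] = None,
) -> List[Bond]:
    """Choose which disulfide bonds to build into a model."""
    if mode is DisulfideMode.NONE:
        return []
    if mode is DisulfideMode.AUTOMATED:
        return list(template_bonds)
    if mode is DisulfideMode.MANUAL:
        if manual_bonds is None:
            raise ValueError("manual mode requires an explicit list of bonds")
        return list(manual_bonds)
    raise ValueError(f"unknown mode: {mode!r}")
```

Under this sketch, `select_disulfides(DisulfideMode.NONE, [(456, 459)])` returns an empty list, which matches the suggestion to ignore the FAK1 disulfide for the TK project.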

Let's definitely add a section to the paper on this, modeled after your "tentative thoughts" comments.