andrabujan / CATH-summer-2019

This repository contains materials related to superfamily naming and renaming

Chopping Issues #1

Open andrabujan opened 5 years ago

andrabujan commented 5 years ago

Superfamily 2.30.300.20 - the representative domain seems to be composed of two domains (D1 + D2, as suggested in the literature); one of them is missing 3 beta-strands, which have been chopped as part of another domain (D4, which the literature suggests is exclusively alpha-helical)

Superfamily 3.55.50.50 - this is domain D4, whose 3 beta-strands should be transferred to the other domain (2.30.300.20)

kmatwani commented 5 years ago

Superfamily 3.30.300.230 - the suggested domain contains a stretch with two helices (residues 145-174), which the literature suggests is a linker segment. 3.30.300.20 - InterPro suggests the same name (i.e. K Homology domain). Since 3.30.300.20 has more domain structures and the papers affirm the name, the name has been used for 3.30.300.20 instead. I suspect that if the linker segment were not included in the domain structure of the members of superfamily 3.30.300.230, they would be assigned to 3.30.300.20.

andrabujan commented 5 years ago

3.10.430.100 and 3.40.5.10 share a helix (chopped in 2)

kmatwani commented 5 years ago

3.30.70.3220 - the suggested domain should be split in two. Each domain exhibits a ferredoxin-like fold.

andrabujan commented 5 years ago

4.10.910.10 - representative domain 2vqe (M) seems to be badly chopped

andrabujan commented 5 years ago

3.30.1550.10 - third PDB (1wib, chain A) seems to have a fragment that could be chopped off (since the other representative domains don't have it)

andrabujan commented 5 years ago

2.30.170.40 - 2jz6 (A) looks badly chopped

andrabujan commented 5 years ago

3.30.70.1730 - many of the representative domains seem to be badly chopped

kmatwani commented 5 years ago

1.20.1310.20 - PDB entries 2xu0 and 2y8d seem badly chopped.

kmatwani commented 5 years ago

2.60.98.40 - 2 beta-sheets (residues 354-372) should not be part of the domain.

kmatwani commented 5 years ago

1.25.40.620 - Should be divided into two domains.

kmatwani commented 5 years ago

2.40.50.670 - the domain does not include residues 15-19 according to the paper. They are part of a random coil.

kmatwani commented 5 years ago

3.90.70.40 - According to the literature, the two helices should be included in the (Josephin) domain.

andrabujan commented 5 years ago

3.10.430.100/3.40.5.10 - the helix is suggested to be a separate linker domain, and the N- and C-terminal domains are the globular parts

andrabujan commented 5 years ago

3.30.1330.30 - 1gz0 (F) seems badly chopped

andrabujan commented 5 years ago

3.30.1390.40 - literature suggests it is included in the third (currently red) domain, forming the A domain

kmatwani commented 5 years ago

1.20.58.900 - the two helices not included in any domain should be part of the RUN domain.

kmatwani commented 5 years ago

3.30.420.380 - The representative domain should be divided into two according to the literature.

kmatwani commented 5 years ago

3.40.50.11820 - The first two helices (residues 313-345) do not seem to be well packed against the rest of the domain. The literature suggests that they should form a separate domain.

andrabujan commented 5 years ago

1.10.1530.10 - representative domain 10 - 2x06 (A). Literature [PMID: 381418] suggests 3 domains - residues 179-218 should form a separate domain

andrabujan commented 5 years ago

3.30.1370.60 - representative domain 4h8a (A). Helical domain should be separated according to PMID: 16192274. Same with 1v9n (A) and 3i0p (A).

2g8y (A) also badly chopped - see other representative domains within the superfamily; helix should be included and helical domain separated

1nxu (B) badly chopped - one helix missing

1z2i (A) - same, one helix + one beta strand missing

3uoe (A) same

andrabujan commented 5 years ago

1.10.287.260 - suggesting a new chopping for 1aroP and 1mswD

kmatwani commented 5 years ago

1.10.472.140 - representative domain 4elj (A). Paper does suggest 2 domains, but the structure is further divided into 4 subdomains, which appear to be more compact [PMID: 22569856]. Suggested chopping added onto DomChop.

andrabujan commented 5 years ago

1.10.132.60 - the thumb domain could potentially be chopped into two domains: a four-helix bundle and a beta-alpha-beta motif (this would make sense structurally because the two regions look quite separate). Also, in some representative domains only the helical domain is shown and the rest has been chopped off. It is quite a large superfamily, and functionally it would indeed make sense to keep a single domain (the thumb domain), but I thought it worth pointing out that it could be further divided into 2 domains.

kmatwani commented 5 years ago

Superfamily 3.30.160.270 - Representative domains 3figA, 3hq1A, 3hpsA, 3hpzA, 1sr9A have three additional helices that are not very well packed and are not included in the other representatives, for example, 3hq1B. According to the literature, they form a linker subdomain [PubMed:15159544].

andrabujan commented 5 years ago

3.90.1600.10 - 3k5o (B) helix should be chopped off as it is not packed with the rest of the domain and literature also suggests that it's a linker domain (N-palm linker)

2p5o (D) badly chopped

kmatwani commented 4 years ago

1.10.287.130 - Some of the representative domains, for example 2qm8 (A), include a third, long helix which should not be part of the domain.

kmatwani commented 4 years ago

Superfamily 3.40.50.10170 (temporarily named DegV, N-terminal domain): According to the literature, the representative structures can be broken down to form two distinct domains and there are superfamilies representing them (3.40.50.10440 represents the core domain and 2.20.28.50 represents the peripheral subdomain) [PMID: 19390149].

kmatwani commented 4 years ago

1.20.1260.30 - For representative 3ufb (A), the paper suggests that the N-terminal domain spans residues 12-193 instead of 12-170 [PubMed:23090406] (it is quite dicey because both choppings seem valid, though I am leaning towards what the paper suggests).
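
To make the difference between the two candidate boundaries explicit, here is a minimal sketch that lists the residues the paper's chopping would add. It assumes a single contiguous segment and plain integer residue numbering (no insertion codes); the function name `residues_gained` is hypothetical and not part of any CATH tool.

```python
def residues_gained(current: tuple[int, int], proposed: tuple[int, int]) -> list[int]:
    """Residues covered by the proposed segment but not by the current one.

    Both segments are (start, end) pairs with inclusive ends.
    """
    current_set = set(range(current[0], current[1] + 1))
    proposed_set = set(range(proposed[0], proposed[1] + 1))
    return sorted(proposed_set - current_set)


# Current chopping 12-170 versus the paper's 12-193 for 3ufb (A).
gained = residues_gained(current=(12, 170), proposed=(12, 193))
print(f"{gained[0]}-{gained[-1]} ({len(gained)} residues)")
```

So the disagreement comes down to whether residues 171-193 belong to the N-terminal domain.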

andrabujan commented 4 years ago

1.10.601.10 and 1.20.120.1810 could be merged.

When compared pairwise [http://cath-tools.cathdb.info/structure/pairwise], the representative domains 3ugoA and 1sigA have quite a high SSAP score (74.49), with sequence identity 35%, 126 equivalent residues (out of 305) and percent overlap 41%. I would suggest re-chopping 1sigA.
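
As a sanity check on the figures quoted above, the reported percent overlap is consistent with the number of equivalent residues divided by 305. A minimal sketch, assuming that this is how the overlap is defined (the function name `percent_overlap` is hypothetical and this is not the cath-tools implementation):

```python
def percent_overlap(equivalent_residues: int, reference_length: int) -> float:
    """Equivalent residues as a percentage of the reference length.

    Assumption: reference_length is the 305 quoted in the comparison.
    """
    return 100.0 * equivalent_residues / reference_length


# Values quoted for the 3ugoA / 1sigA comparison.
overlap = percent_overlap(126, 305)
print(f"{overlap:.1f}%")
```

This rounds to the 41% reported, so well under half of the reference length is aligned despite the high SSAP score.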

andrabujan commented 4 years ago

2.30.42.60 - representative domain 4jj0 (B) could be re-chopped (the terminal fragment/tail) and then merged with the bigger PDZ superfamily (2.30.42.10)

kmatwani commented 4 years ago

3.40.1190.20 - According to the literature, the structure can be divided into a core and a lid subdomain (superfamily 3.30.1110.10). Some of the structures are chopped this way, while others are not (I compared some structures that have been divided into two domains with some that have not, and the SSAP scores are above 70).