Open andrabujan opened 5 years ago
Superfamily 3.30.300.230 - the suggested domain contains a chain with two helices (residues 145-174) which the literature suggests is a linker segment. 3.30.300.20 - same name suggested by InterPro (i.e. K Homology domain). Since 3.30.300.20 has more domain structures, and the papers affirm the name, the name has been used for 3.30.300.20 instead. I suspect that if the linker segment is excluded from the domain structure of the members of superfamily 3.30.300.230, they will be assigned to 3.30.300.20.
3.10.430.100 and 3.40.5.10 share a helix (chopped in 2)
3.30.70.3220 - the suggested domain should be split in two. Each domain exhibits a ferredoxin-like fold.
4.10.910.10 - representative domain 2vqe (M) seems to be badly chopped
3.30.1550.10 - third PDB (1wib, chain A) seems to have a fragment that could be chopped off (since the other representative domains don't have it)
2.30.170.40 - 2jz6 (A) looks badly chopped
3.30.70.1730 - many of the representative domains seem to be badly chopped
1.20.1310.20 - chains 2xu0 and 2y8d seem badly chopped.
2.60.98.40 - 2 beta-sheets (residues 354-372) should not be part of domain.
1.25.40.620 - Should be divided into two domains.
2.40.50.670 - the domain does not include residues 15-19 according to the paper. They are part of a random coil.
3.90.70.40 - According to literature, the two helices should be included in the (Josephin) domain.
3.10.430.100/3.40.5.10 - the helix is suggested to be a separate linker domain, and the N- and C-terminal domains are the globular parts
3.30.1330.30 - 1gz0 (F) seems badly chopped
3.30.1390.40 - literature suggests it is included in the third (currently red) domain, forming the A domain
1.20.58.900 - the two helices not included in any domain should be part of the RUN domain.
3.30.420.380 - The representative domain should be divided into two according to literature.
3.40.50.11820 - The first two helices (313-345) do not seem to be well-packed with the rest of the domain. Literature suggests that it should be a separate domain.
1.10.1530.10 - representative domain 10 - 2x06 (A). Literature [PMID: 381418] suggests 3 domains - residues 179-218 should form a separate domain
3.30.1370.60 - representative domain 4h8a (A). Helical domain should be separated according to PMID: 16192274. Same with 1v9n (A) and 3i0p (A).
2g8y (A) also badly chopped - see other representative domains within the superfamily; helix should be included and helical domain separated
1nxu (B) badly chopped - one helix missing
1z2i (A) - same, one helix + one beta strand missing
3uoe (A) same
1.10.287.260 - suggesting a new chopping for 1aroP and 1mswD
1.10.472.140 - representative domain 4elj (A). Paper does suggest 2 domains, but the structure is further divided into 4 subdomains, which appear to be more compact [PMID: 22569856]. Suggested chopping added onto DomChop.
1.10.132.60 - the thumb domain could potentially be chopped into two domains - a four-helix bundle and a beta-alpha-beta motif (it would make structural sense because the two regions look quite separate); also, in some representative domains only the helical domain is shown and the rest has been chopped off. It is quite a large superfamily, and functionally it would indeed make sense to have a single domain (the thumb domain), but I thought it worth pointing out that it could be further divided into 2 domains.
Superfamily 3.30.160.270 - Representative domains 3figA, 3hq1A, 3hpsA, 3hpzA, 1sr9A have three additional helices that are not very well packed and are not included in the other representatives, for example, 3hq1B. According to the literature, they form a linker subdomain [PubMed:15159544].
3.90.1600.10 - 3k5o (B) helix should be chopped off as it is not packed with the rest of the domain and literature also suggests that it's a linker domain (N-palm linker)
2p5o (D) badly chopped
1.10.287.130 - Some of the representative domains, for example, 2qm8 (A) include a third, long helix which should not be part of the domain.
Superfamily 3.40.50.10170 (temporarily named DegV, N-terminal domain): According to the literature, the representative structures can be broken down to form two distinct domains and there are superfamilies representing them (3.40.50.10440 represents the core domain and 2.20.28.50 represents the peripheral subdomain) [PMID: 19390149].
1.20.1260.30 - For representative 3ufb (A), the paper suggests that the N-terminal domain spans residues 12-193 instead of 12-170 [PubMed:23090406] (it is quite dicey because both choppings seem valid, though I am leaning towards what the paper suggests).
1.10.601.10 and 1.20.120.1810 could be merged.
Comparing the representative domains 3ugoA and 1sigA [http://cath-tools.cathdb.info/structure/pairwise] gives quite a high SSAP score of 74.49, with sequence identity 35%, 126 equivalent residues (out of 305) and percent overlap 41%. I would suggest re-chopping 1sigA.
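For anyone checking the numbers, here is a minimal sketch of how the percent overlap follows from the equivalent-residue count. It assumes overlap is simply equivalent residues divided by the 305-residue length quoted above; the function name `percent_overlap` is made up for illustration and is not part of any CATH tool.

```python
def percent_overlap(equivalent_residues: int, domain_length: int) -> float:
    """Percentage of the domain covered by structurally equivalent residues.

    Assumption: overlap = 100 * equivalent residues / domain length,
    where the length is the 305 residues quoted for the 3ugoA/1sigA pair.
    """
    if domain_length <= 0:
        raise ValueError("domain_length must be positive")
    return 100.0 * equivalent_residues / domain_length


# Values reported for the 3ugoA vs 1sigA comparison.
overlap = percent_overlap(126, 305)
print(f"overlap: {overlap:.1f}%")
```

The low overlap next to a fairly high SSAP score is what points to the extra residues in 1sigA rather than to a genuinely different fold.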
2.30.42.60 - representative domain 4jj0 (B) could be re-chopped (the terminal fragment/tail) and then merged with the bigger PDZ superfamily (2.30.42.10)
3.40.1190.20 - According to the literature, the structure can be divided into a core and lid subdomain (superfamily 3.30.1110.10). Some of the structures are chopped this way, while the others are not (I compared some structures that have been divided into two domains with some that have not, and the SSAP scores are above 70).
Superfamily 2.30.300.20 - the representative domain seems to be composed of two domains (D1 + D2 suggested in the literature); one of the domains is missing 3 beta-strands, which have been chopped as part of another domain (D4, which the literature suggests is exclusively alpha-helical)
Superfamily 3.55.50.50 - this is the domain D4 that should have the 3 beta strands transferred to the other domain