geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

condensin complexes #12832

Closed ukemi closed 3 years ago

ukemi commented 7 years ago

I am curating PMID:21795393 and according to this paper, there are two types of condensin complexes, condensin I and condensin II. Originally in GO we had a single subtype of condensin complex called 'nuclear condensin complex'. This complex is defined as 'A multisubunit protein complex that plays a central role in the condensation of chromosomes that remain in the nucleus.' Currently there are many annotations to this complex from direct annotation of yeast genes and PAINT annotation. The problem with these annotations is that according to PMID:21795393, neither of the two condensin complexes is always restricted to the nucleus. Condensin II is in the nucleus and Condensin I associates with chromosomes after nuclear envelope breakdown. In closed mitosis the yeast annotations to the nuclear complex make sense because there is no nuclear envelope breakdown.

However, according to PMID:21795393 and PMID:15823530 the yeast condensin complex corresponds to condensin I, which based on the above does not always fit as a nuclear condensin. According to PMID:15823530, nematodes only have condensin II. Both papers suggest that when condensin I and condensin II exist, they differ functionally. I have created a new term for the condensin I complex that is a sibling of the nuclear condensin complex and defined it as attaching to the chromosomes after nuclear envelope breakdown, but this does not allow for the representation of the yeast condensin complex corresponding to condensin I. I would like to create a condensin II complex as a sibling as well, defining it as being nuclear at the initial stages of chromosome condensation and remaining attached to the chromosomes after nuclear envelope breakdown. It would also be nice to try to distinguish the complexes functionally, but it looks like it might vary among organisms.

It would be nice to have input from @vanaukenk for worms, @ValWood for yeast and @pgaudet

bmeldal commented 7 years ago

Apologies for chiming in!

Shouldn't function take priority over composition and location when defining complex terms? I would go function --> location --> composition as composition is the one that tends to vary the most across the tree of life.

However, it looks like here location is a real issue so I would focus on function. Are Condensin I and II doing different things? If yes, do I and II consistently have the same function across taxa? If not, shouldn't we relax the def of the current term? If people really need separate terms for Condensin I and II they could then be children of Condensin complex?

Complex, I know!

Birgit

PS: I've not done any condensin yet... but we have the yeast one in the CP: http://www.ebi.ac.uk/intact/complex/details/EBI-2658088

See also Paola's new ticket "Merge nuclear and cytoplasmic versions of identical protein complexes? #12833". Paraphrasing Chris Mungall from the GOC meeting in Los Angeles, CC is essentially an ontology of locations, not functions, so to the extent that terms for complexes are CC children, locations are key differentia. So I guess a complex gets two orthogonal but equal annotations, one CC to say where it is and one MF to say what it does there.

ValWood commented 7 years ago

I would be very happy to use the old nuclear/cytoplasmic condensin terms and define the complexes functionally. Also see comments here, same issue exists for cohesin, proteasome, etc. etc. https://github.com/geneontology/go-ontology/issues/12598

bmeldal commented 7 years ago

--> Sounds like we need a dedicated annotation call for these nuclear vs cytoplasmic complex issues. Someone should collect a list of examples in a separate ticket.

vanaukenk commented 7 years ago

Hi,

C. elegans actually has three condensin complexes: condensin I, condensin I(DC), and condensin II.

https://www.ncbi.nlm.nih.gov/pubmed/19119011

Condensins I and II are involved in chromosome segregation, while condensin I(DC) is involved in modulation of chromosome- and sex-specific gene expression.

I would like to do some more reading (the above paper is from 2009), but this would suggest that we need to relax the definition of a condensin complex to include its role in modulation of gene expression and then perhaps have three children of condensin complex: I, I(DC), and II. It would be nice to define the complexes functionally, if we can.

dosumis commented 7 years ago

"Shouldn't function take priority over composition and location when defining complex terms?"

If you call something a condensin complex, you are saying something about its composition. It might be possible to subdivide functionally though.

ukemi commented 7 years ago

Reading more and more about this, I am tempted to merge all of the children into 'condensin complex'. It seems that the I and II subtypes of the complexes are defined by evolutionary conservation of the protein components, but their functions vary widely. As near as I can tell, the only thing that is truly conserved is that they are all capable of taking part in chromosome condensation. However, depending on the organism, they have non-overlapping roles and can take on a variety of other roles as well. The literature is interesting in that the papers I am seeing never definitively show that the same complex is functioning, but rather use a distinguishing member of the complex as a proxy. Birgit, what do you think? It seems like we might want to have complex I and complex II in your resource, but in GO we might only want a generic.

bmeldal commented 7 years ago

Hi David,

Can you bear with me til next week, I've got the cold from hell and haven't been at work this week - yet. Just doing my daily email check to avoid things building up! It's very unusual for me but I can't compute complex (!) biology at the moment. Hoping every day to feel a little better...

Have you got some PMIDs for me to check or should I just review this thread again?

Birgit

ukemi commented 7 years ago

Hi Birgit,

No worries, I am still pondering this myself. I imagine I won't do any ontology work on this until after the Holidays. In addition to the 2 PMIDs above, check these out to get a flavor of what these complexes can do:

PMID:16673016 PMID:21795393 PMID:25474630

I'm still reading papers.

pgaudet commented 3 years ago

@bmeldal @ukemi Any conclusion about this ?

We currently have those 4 terms:
'condensin I complex' http://purl.obolibrary.org/obo/GO_0061814
'condensin complex' http://purl.obolibrary.org/obo/GO_0000796
'condensin core heterodimer' http://purl.obolibrary.org/obo/GO_0000797
'nuclear condensin complex' http://purl.obolibrary.org/obo/GO_0000799

Thanks, Pascale

ValWood commented 3 years ago

I'm pretty sure there is only one condensin complex.

It has to have these 5 subunits for full functionality (https://www.pombase.org/reference/PMID:31072933 https://www.pombase.org/reference/PMID:30914423 https://www.pombase.org/reference/PMID:31615333), and it performs all of the roles in chromosome dynamics, including mitotic chromosome condensation and segregation, DNA repair, and development.

(unlike cohesin which has a mitotic version of the complex)

The core heterodimer has no function, although it can probably be isolated in some conditions.

bmeldal commented 3 years ago

Wow, that ticket has been dormant for a while! [Btw, that "cold from hell" I mentioned 4 years ago turned out to be whooping cough and turns out it's not unusual to get it again in "middle age" even if you had it as a child!]

In the meantime I curated human condensin (and forgot about this ticket) and decided on I and II based on http://europepmc.org/article/MED/17268547

Condensin I

https://www.ebi.ac.uk/complexportal/complex/CPX-979

NCAPD2 Q15021 Condensin complex subunit 1
NCAPH Q15003 Condensin complex subunit 2
NCAPG Q9BPX3 Condensin complex subunit 3
SMC2 O95347 Structural maintenance of chromosomes protein 2
SMC4 Q9NTJ3 Structural maintenance of chromosomes protein 4

Condensin II

https://www.ebi.ac.uk/complexportal/complex/CPX-985

NCAPD3 P42695 Condensin-2 complex subunit D3
NCAPH2 Q6IBW4 Condensin-2 complex subunit H2
NCAPG2 Q86XI2 Condensin-2 complex subunit G2
SMC2 O95347 Structural maintenance of chromosomes protein 2
SMC4 Q9NTJ3 Structural maintenance of chromosomes protein 4

I agree, SMC2-SMC4 core heterodimer doesn't have a separate function.
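The overlap between the two subunit lists above can be checked with a short set comparison. This is only a sketch in Python: the variable names are illustrative, and the gene symbols are copied from the two Complex Portal entries listed in this comment.

```python
# Subunits as listed above for the two human complexes (gene symbols only).
condensin_i = {"NCAPD2", "NCAPH", "NCAPG", "SMC2", "SMC4"}     # CPX-979
condensin_ii = {"NCAPD3", "NCAPH2", "NCAPG2", "SMC2", "SMC4"}  # CPX-985

# Subunits present in both complexes: the SMC2-SMC4 core heterodimer.
shared_core = condensin_i & condensin_ii

# Subunits that distinguish each complex from the other.
only_i = condensin_i - condensin_ii
only_ii = condensin_ii - condensin_i

print(sorted(shared_core))
print(sorted(only_i))
print(sorted(only_ii))
```

The two complexes share only SMC2 and SMC4, so composition alone separates I from II by the three non-SMC subunits.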

Birgit

pgaudet commented 3 years ago

Thanks @bmeldal

The actions could be:

What do you think ?

Thanks, Pascale

bmeldal commented 3 years ago

Could we have one Condensin GO term (GO:0000796) and have condensin I and II as narrow synonyms? The def for the parent term is very generic so the slight functional differences between I & II don't matter too much.
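A rough sketch of what that proposal amounts to, in Python. The `Term` class and `merge_into` helper are hypothetical and are not part of any GO tooling; keeping the merged ID as a secondary ID is an assumed convention here, not something stated in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """Hypothetical, minimal stand-in for a GO term record."""
    go_id: str
    label: str
    narrow_synonyms: list = field(default_factory=list)
    alt_ids: list = field(default_factory=list)


def merge_into(parent: Term, child: Term) -> Term:
    """Fold `child` into `parent`, keeping its label as a narrow synonym."""
    if child.label not in parent.narrow_synonyms:
        parent.narrow_synonyms.append(child.label)
    # Keep the merged ID resolvable as a secondary ID (assumed convention).
    if child.go_id not in parent.alt_ids:
        parent.alt_ids.append(child.go_id)
    return parent


condensin = Term("GO:0000796", "condensin complex")
condensin_i = Term("GO:0061814", "condensin I complex")

merge_into(condensin, condensin_i)
# No separate condensin II term is listed in this thread,
# so it is only added as a synonym.
condensin.narrow_synonyms.append("condensin II complex")

print(condensin)
```

Annotations would then all point at the single generic term, with the I/II labels still searchable as synonyms.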

(typo corrected to condensin!)

ValWood commented 3 years ago

I agree. Presumably they are doing the identical thing, either in different cell types or embryonic stages or something? Or is there some separation of function?

bmeldal commented 3 years ago

My defs (it's a few years ago now...), main difference in bold:

Condensin I: "Involved in chromosome segregation, both in meiosis and mitosis. Initially located in cytoplasm and associates with chromosomes after nuclear breakdown. Assembles in alternating pattern with Condensin II complex along metaphase chromosomes with fully resolved sister chromatids. Defects in Condensin complexes lead to anaphase bridges and apoptosis. In meiosis, only stably associates with chromosomes after anaphase I."

Condensin II: "Involved in chromosome condensation and segregation, both in meiosis and mitosis. Almost exclusively found in the nucleus where it assembles in alternating pattern with Condensin I complex along metaphase chromosomes with fully resolved sister chromatids. Defects in Condensin complexes lead to anaphase bridges and apoptosis. Also affects nuclear architecture and chromosome stability during interphase."

But the outcome is the same and I used the same MF and BP annotations but reflected the slight difference in the CC annotations.

ValWood commented 3 years ago

Hi @bmeldal

The definitions should be fully contained in the part before the full stop, and based on the parent.

GO:0000796 condensin complex is defined as "A multisubunit protein complex that plays a central role in chromosome condensation."

The descendants should be formulated as "A condensin complex that blah" (add differentia here).

Other information which is common to both complexes should be included in the parent term if relevant.
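As a minimal sketch, the formulation rule above could be checked mechanically. The helper name `follows_pattern` is hypothetical, and splitting on the first "." is a simplification that would misfire on abbreviations inside a definition.

```python
def follows_pattern(child_definition: str, parent_label: str) -> bool:
    """Return True if the first sentence reads 'A <parent label> that <differentia>'."""
    # Only the text before the first full stop counts as the definition proper.
    first_sentence = child_definition.split(".", 1)[0].strip()
    prefix = f"A {parent_label} that "
    differentia = first_sentence[len(prefix):].strip()
    return first_sentence.startswith(prefix) and bool(differentia)


parent = "condensin complex"

# Placeholder differentia echoing the "blah" above; not a proposed definition.
ok = follows_pattern("A condensin complex that blah. Extra gloss goes here.", parent)

# Text that does not start from the parent term fails the check.
bad = follows_pattern("Involved in chromosome segregation, both in meiosis and mitosis.", parent)

print(ok, bad)
```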

I don't think we should include any phenotype-related gloss, e.g. "Defects in Condensin complexes lead to anaphase bridges and apoptosis".

I'm also a bit dubious about including "initially located in the cytoplasm" (is this definitely the entire complex, or just subunits?), and I wonder about the relevance of this because the complex isn't functional until loaded onto DNA (or is it?), or is there a known cytosolic role?

ValWood commented 3 years ago

Ah actually, we already decided not to include didn't we.. Doh... you were just telling me what is known about the difference.

pgaudet commented 3 years ago

If I understand correctly @bmeldal is suggesting to keep a single term ? So, there would be no descendants... @ValWood Do you want to keep the descendants ?

bmeldal commented 3 years ago

Yes, @ValWood those are the CP defs for the 2 human complexes but for GO we only need the parent term.

I'll check the cytosol bit, it sounds a bit strange but the entry was also reviewed by a second curator...

bmeldal commented 3 years ago

@pgaudet are you doing the merge, then?

pgaudet commented 3 years ago

Yes - done - hopefully that was right !

bmeldal commented 3 years ago

@ValWood re: location

http://europepmc.org/article/MED/25474630

"CAP-H was mainly detectable in the cytoplasm whereas CAP-H2 localized to the nucleus in NSCs during interphase, in agreement with previous reports using tissue culture cells [16] and mouse oocytes [17]."

Looks like it's just individual subunits. I will take the cytoplasm reference out of the CP entries as I think you are right, it's probably pre-assembly.