geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

NTR: [Yamanaka factors] #25099

Closed: smoe closed this issue 1 year ago

smoe commented 1 year ago

Please provide as much information as you can:

I am not completely sure about the next one - I was primarily looking for rejuvenation, which GO does not yet have, and am not sure that it should, although it would be rather timely. https://www.ebi.ac.uk/QuickGO/term/GO:0042246 tissue regeneration

Any other information: Wikipedia points out that the discovery of those factors (and the induction of pluripotency with them) was rewarded with the 2012 Nobel Prize. It could be an interesting side theme to link GO terms to prizes, much like other references.

Could the Yamanaka factors possibly be a "part of" (not a subclass of) the induction of pluripotency (even though a functionally complete part) and also be a part of the "regeneration"? Should I also ask for a new term "rejuvenation"? I found "GO:0010259 Multicellular Aging" - it may need some extra thinking, though, to adapt the GO-typical "negative/positive regulation" concepts to aging. Anyway - I am curious to hear what you think about it all.

raymond91125 commented 1 year ago

The "Yamanaka factors" refers to transgenes used to artificially induce adult cells into pluripotency. I would think this is out of scope for GO. GO:0043697 cell dedifferentiation has a note: "Note that this term should be used to annotate gene products involved in dedifferentiation that occurs as part of a normal process, such as regeneration. It should not be used for dedifferentiation that occurs in an abnormal or disease state such as cancer". UniProt has annotations like: "Regulator of somatic reprogramming, controls self-renewal of embryonic stem cells". Are reprogramming of somatic cells [PMID:22917226] and self-renewal of embryonic stem cells [PMID:36013330] within the GO scope given that they are ex vivo?

smoe commented 1 year ago

I concur that by excluding cancer, cell dedifferentiation may not be a good parent. GO:0043696 dedifferentiation?

For me, the Yamanaka genes triggered physiologically would perform equally; it is just that those events are difficult to monitor, to access, or to obtain in sufficient abundance for transcriptomics, and raise ethical issues in vivo, so they are commonly triggered in vitro. Without transfections, these genes are discussed in particular in cancer research to understand cancer stem cells (CSC) https://www.tandfonline.com/doi/abs/10.1080/14737140.2021.1915137, which is a pathophysiology that does not need external triggers. And while GO:0043697 cell dedifferentiation excludes cancer, I frankly do not think that this is what was meant, since nothing is broken or in disrepair; rather, the dedifferentiation is auto-triggered via those factors. That aside, I consider it very important to be pointed to Yamanaka factors in a gene set enrichment analysis, or when characterizing populations in single-cell RNA-seq, as these factors are discussed whenever cellular reprogramming is a thing. And finally, it is all quite fascinating and I would not want GO to miss out on this tantalizing set of human genes. If you decide in favor of the term's acceptance, maybe some more unexpected associations will come up.

ValWood commented 1 year ago

Yamanaka factors seem to be describing a collection of gene products (suggested definition: set of four genes: MYC, Oct3/4 (POU5F1), SOX2 and KLF4) rather than a process. Why not just annotate them to "cell dedifferentiation"?

Note that a GO definition should be constructed from the parent definition plus a differentia, so a subclass of "cell dedifferentiation" would begin "A cell dedifferentiation process where blah". So in this case, to create a child term, you need to say what is different about this process from its suggested parent.

v

pgaudet commented 1 year ago

Seems like 'cell dedifferentiation' covers the process you are trying to describe. I wonder if we can add a (narrow) synonym like 'Yamanaka factors-induced pluripotency' to 'cell dedifferentiation', which would guide users to the correct term?

smoe commented 1 year ago

There are too many ways to induce some level of pluripotency for too many tissues, so I would not want to see those factors as an alias. Also, I understand those factors only as a trigger, while the GO term likely also (or primarily?) aims at addressing downstream events.

I agree that it is somehow wrong to have a GO term defined by an explicit set of genes, i.e. it is not special to have just four genes assigned to a term but the definition should be their effect, not the constellation of genes. But then it occurred to me that GO already has complexes, checkpoints and signals, that are defined around single genes (or complexes). Instead of a complex of interacting proteins we here now have a complex interaction of transcription factors. Yes, I know, two very different meanings of the word "complex" and "interaction", but I thought this kind of helps to find an analogy.

For instance, the term Mre11 complex assembly is a biological process, and the MAD1 complex is a cellular component. I think I may be hoping for a process "Yamanaka-factors-induced cellular reprogramming", which then likely evolves to contain more than the initial four genes, and starts off as a child term of "cell dedifferentiation".

raymond91125 commented 1 year ago

GO's scope is limited to normal (derived from natural evolution) processes. I think a term like "Yamanaka-factors-induced cellular reprogramming" is not about any normal process but about an artificially manipulated one, unless I missed something. Thus, I'm afraid we cannot add this term. With respect to their involvement in normal biological processes, there are indeed many annotations for those genes. Those annotations may point to some shared properties that underlie the 'reprogramming' effects.

smoe commented 1 year ago

Quick side story: I asked ChatGPT whether the Gene Ontology should dedicate a term to Yamanaka factors and it said that this may not be required [ :-( ]. And when I then asked which of its terms would be most similar, I was pointed to "cellular reprogramming" - but that is not a GO term [felt a bit of a relief - ChatGPT was wrong].

To react to your concern about those factors being described only in a synthetic context: could we not possibly have Yamanaka factors presented as a drug, with GO describing the response to that drug? http://amigo.geneontology.org/amigo/search/ontology?q=drug may serve as some prior art in GO. So this would then be something like "Response to Yamanaka-factors", and both cell dedifferentiation and drug response (or the response to a xenobiotic stimulus) could be parents.

raymond91125 commented 1 year ago

Sorry, I don't see a valid way to have Yamanaka factors as a GO term at this time.

ValWood commented 1 year ago

@smoe is the reason for asking for this term related to enrichment? If so, I notice that human has only 6 annotations to the suggested 'dedifferentiation' parent from any evidence source. It seems likely that the "Yamanaka factors" should be annotated to this term too, based on your suggested parent (they would then make up 4/10 of all annotations). If this is the case, I suggest that you e-mail the UniProt helpdesk with the genes and the publications and ask them to make the relevant GO annotations to the existing terms.

Re gene product-specific GO terms: we are in the process of removing these from GO or refactoring/merging them into the correct functions/processes, so we would not want to add more gene product-specific terms unless they are required to describe a specific mechanism. CC terms are more of a special case and are often named after one specific gene product to differentiate them.
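To make the enrichment arithmetic above concrete (6 existing human annotations plus the 4 factors gives a term with 10 annotated genes), here is a minimal sketch of a one-sided hypergeometric over-representation test. The background size and the query list size are hypothetical values chosen only for illustration, and `hypergeom_sf` is a helper defined here, not part of any GO tooling; real enrichment tools also apply multiple-testing correction, which this sketch omits.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when drawing N genes without replacement from M genes,
    n of which are annotated to the term of interest."""
    upper = min(n, N)
    favourable = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return favourable / comb(M, N)


# From the thread: 6 existing human annotations + the 4 factors.
term_size = 6 + 4

# Assumptions for illustration only (not from the thread):
background = 20000  # hypothetical number of annotated human genes
query_size = 50     # hypothetical gene list from an experiment
hits = 4            # the list contains all four factors

share = hits / term_size
p_value = hypergeom_sf(hits, background, term_size, query_size)
print(f"factors as share of term: {share:.0%}, enrichment p-value: {p_value:.3g}")
```

With so few genes annotated to the term, four hits from one gene list dominate the result, which is why annotating the factors to the existing term would already make it show up in enrichment analyses.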

smoe commented 1 year ago

@ValWood Yes, I want researchers to be pointed to dedifferentiation when they are not expecting it, for instance when it goes together with senescence or tissue development, especially now that single-cell sequencing is becoming more widespread. I am not entirely confident that the dedifferentiation term is strengthened by adding those initiators. "Initiation of dedifferentiation" is what the Yamanaka factors are doing, and at the very highest, i.e. embryonic, level, but little else as far as I can tell; all those many specialized tissues likely have their own mechanisms to dedifferentiate, with local initiators that are suitably less dramatic in their effect. In the Multi-Experiment-Matrix the factors are not correlating with each other, so adding those four to the existing term would likely change the semantics of whatever is currently represented. And that would not necessarily be a correction.

I think you will want to evolve the "dedifferentiation" term in GO (pun intended) once the molecular patterns of cellular dedifferentiation are better understood. With the "Yamanaka factors" I am offering an idea for such a differentiation of the dedifferentiation term, and likely there will be many more coming. And once it is understood where those Yamanaka factors are too artificial to warrant their inclusion in GO, you can improve on that "biggest blunder" of yours.

There is another side to it, though. I sense that GO should allow biological/preclinical researchers (and soon also the precision doctors - or whatever you call the individuals practicing precision medicine) to substitute their wording with terms in GO when they talk about genes and their doing (I like that "their" is an ambiguous reference to doctors and genes - I meant the genes, but also the genes transfected by the doctors' gene therapy). And those Yamanaka factors are a thing in current lab talk - to be substituted with whatever better comes along the way, but for the past decade that term has been doing well - increasingly well. I do not see the clean-up of your earlier gene/gene-set-centric GO terms as anything negative; on the contrary, it is how you are helping the precision of our scientific discourse, as in "this is why we need you so much". The term "Yamanaka factors" only sticks because there is no better understanding yet. So, if you could have them in now and improve this with what they should truly be called in, say, 10 years from now, then I just want to praise you for it.

So, as my thinking evolved during this thread, today I think "Response to Yamanaka-factors" would be my favorite GO term - and that term would not be assigned the four genes that define it. I need to do some extra reading to decide which genes to assign.