geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

NTR: [histone H3-K27 methyltransferase activator activity] #28326

Open krchristie opened 2 months ago

krchristie commented 2 months ago

Please provide as much information as you can:

pgaudet commented 1 month ago

Hi @krchristie

I looked at this with @colinlog, and he notes that EED is a core subunit of the complex, therefore it cannot act as a regulator. Since it's a core subunit, it's not surprising that it would be required for the activity of the histone methylase. EED's activity is rather that of a scaffold; right now we don't have any term more specific than the top level of that branch, protein-macromolecule adaptor activity. I suggest you use this with the appropriate input(s).

I hope this works for you.

krchristie commented 1 month ago

> Hi @krchristie
>
> I looked at this with @colinlog, and he notes that EED is a core subunit of the complex, therefore it cannot act as a regulator. Since it's a core subunit, it's not surprising that it would be required for the activity of the histone methylase. EED's activity is rather that of a scaffold; right now we don't have any term more specific than the top level of that branch, protein-macromolecule adaptor activity. I suggest you use this with the appropriate input(s).
>
> I hope this works for you.

Actually, I don't think this does work very well.

I keep finding that the lack of specific terms in MF is going to mean that enrichment analyses will become meaningless. I think we need MF terms that provide some level of specificity with respect to what the gene product is doing. Putting the specificity only in the inputs doesn't provide much information unless the inputs are actually displayed on websites, used in enrichment analyses, etc.

@vanaukenk - Can we discuss this type of issue at an annotation call, probably in the fall once the summer vacations are over?

ValWood commented 1 month ago

I'm semi-agnostic as to whether we include the specific terms. However, this shouldn't affect enrichments, because enrichments work on gene sets with a significant number of members (i.e. pathways), and the regulator terms are leaf nodes with single annotations, or a couple at most. We need to annotate to the process (pathway), e.g. GO:0045815 transcription initiation-coupled chromatin remodeling or whatever, to enable enrichments. To illustrate, you will notice that MF terms other than 'protein binding' or occasionally transcription regulators never appear in enrichments (and when they do, there should always be a more informative process enriched).

I can see the advantage for users of seeing the precise term label without having to read the substrate, and a disadvantage for GO editors in having to replicate all of the functions that require specific regulators in the "regulator branch", so it is a useful discussion to have.
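
A minimal sketch of why a term with a single annotation is unlikely to surface in an enrichment analysis, assuming a one-sided hypergeometric test. The background size, study-set size, and annotation counts are hypothetical numbers chosen only for illustration; they are not taken from any GO release or enrichment tool.

```python
from math import comb


def hypergeom_p_value(k, population, annotated, study):
    """P(X >= k): chance of seeing at least k annotated genes in the study set.

    population: genes in the background
    annotated:  background genes annotated to the term
    study:      genes in the study set
    """
    upper = min(annotated, study)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, study - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, study)


# Hypothetical setup: 20,000 background genes, 200 genes in the study set.
POPULATION = 20_000
STUDY = 200

# Leaf term with a single annotated gene, and that gene is in the study set.
p_leaf = hypergeom_p_value(1, POPULATION, annotated=1, study=STUDY)

# Pathway-sized term: 50 annotated genes, 10 of them in the study set.
p_pathway = hypergeom_p_value(10, POPULATION, annotated=50, study=STUDY)

print(f"single-annotation leaf term: p = {p_leaf:.3g}")
print(f"pathway-sized term:          p = {p_pathway:.3g}")
```

With one annotated gene, the smallest achievable p-value is the study-set size divided by the background size (0.01 in this toy setup), and that is before any multiple-testing correction across thousands of terms.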

krchristie commented 4 weeks ago

> I'm semi-agnostic as to whether we include the specific terms. However, this shouldn't affect enrichments, because enrichments work on gene sets with a significant number of members (i.e. pathways), and the regulator terms are leaf nodes with single annotations, or a couple at most. We need to annotate to the process (pathway), e.g. GO:0045815 transcription initiation-coupled chromatin remodeling or whatever, to enable enrichments. To illustrate, you will notice that MF terms other than 'protein binding' or occasionally transcription regulators never appear in enrichments (and when they do, there should always be a more informative process enriched).

I agree that the leaf nodes themselves aren't the terms that will show up in enrichments, but I think that representing the MF with something as non-specific as "protein-macromolecule adaptor activity" is not going to be useful. If we create specific MF terms, then we will likely also have a structure of terms such that there could be some intermediate term that does become enriched and provides more information than just "protein-macromolecule adaptor activity".
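
A minimal sketch of how an intermediate term can accumulate a larger gene set than any of its leaf children once annotations are propagated up the hierarchy. The term labels, gene names, and parent links here are hypothetical placeholders, not real GO terms or annotations.

```python
from collections import defaultdict

# Hypothetical toy hierarchy: term -> list of its parent terms.
PARENTS = {
    "specific term A": ["intermediate term"],
    "specific term B": ["intermediate term"],
    "intermediate term": ["root term"],
    "root term": [],
}

# Hypothetical direct annotations: gene -> most specific term used.
DIRECT = {
    "gene1": "specific term A",
    "gene2": "specific term B",
    "gene3": "specific term B",
}


def term_and_ancestors(term):
    """Return the term itself plus every term reachable via parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def propagated_gene_sets(direct):
    """Map each term to the genes annotated to it or to any descendant."""
    genes_by_term = defaultdict(set)
    for gene, term in direct.items():
        for ancestor in term_and_ancestors(term):
            genes_by_term[ancestor].add(gene)
    return dict(genes_by_term)


gene_sets = propagated_gene_sets(DIRECT)
for term in PARENTS:
    print(f"{term}: {len(gene_sets.get(term, set()))} gene(s)")
```

In this toy example each leaf holds only one or two genes, while the intermediate term collects all three, which is the kind of grouping that could in principle become large enough to test for enrichment.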

> I can see the advantage for users of seeing the precise term label without having to read the substrate, and a disadvantage for GO editors in having to replicate all of the functions that require specific regulators in the "regulator branch", so it is a useful discussion to have.

Most interfaces do not display the extension/input in a user-friendly way. In addition, it seems that the GO editors have decided that we need to represent the mechanistic aspects of functions with MF terms. If that's the decision, then we need meaningful MF terms. I feel that replacing the obsoleted BP term, which provided very specific information and let users see immediately that Eed is involved in histone H3K27 methylation, with an MF term as vague as "protein-macromolecule adaptor activity" is not a user-friendly change (follow the red arrow in the picture).

[Image: Eed-effectOfNonSpecMF-arrow]

krchristie commented 4 weeks ago

> Hi @krchristie
>
> I looked at this with @colinlog, and he notes that EED is a core subunit of the complex, therefore it cannot act as a regulator. Since it's a core subunit, it's not surprising that it would be required for the activity of the histone methylase. EED's activity is rather that of a scaffold; right now we don't have any term more specific than the top level of that branch, protein-macromolecule adaptor activity. I suggest you use this with the appropriate input(s).

@pgaudet - I appear not to be the only curator to have selected the term "enzyme activator activity" (see the orange term in the picture in my previous comment). One of those annotations (the ISO) is my recent reannotation of the obsoleted BP term (in red), but the ISS was made by a non-MGI curator. I think that earlier annotation may have contributed to my choice of this term for the reannotation.