geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

Remove all single-step BP classes #12859

Closed selewis closed 2 years ago

selewis commented 7 years ago

Redundant with the one function they are linked to. They add no value and actually create noise in enrichment analyses.

RLovering commented 7 years ago

So does this mean phosphorylation will be removed?

ukemi commented 7 years ago

That's the proposal.

RLovering commented 7 years ago

I can see the value of doing this.

Would it be possible to find out how many proteins (kinases) there are that currently have phosphorylation as their only BP annotation?

Presumably this means that child terms like tyrosine phosphorylation will also be removed? What about more complex terms like GO:1990802 protein phosphorylation involved in DNA double-strand break processing: will this stay, or change to protein kinase activity involved in DNA double-strand break processing? Currently this term has the parents protein phosphorylation, phosphorylation and GO:0006464 cellular protein modification process (as well as the DNA repair domain). So will GO:0006464 cellular protein modification process remain? If so, will this be the new parent to protein kinase activity involved in DNA double-strand break processing?

selewis commented 7 years ago

Presumably all of these have a corresponding function term that exists. The suggested (auto) replacement would be the function term. If the function term has not been created, then it may be necessary to create one. Likewise, the labels of the functions could/should be changed to something more biologically intuitive. That is, the function term may well adopt the current label used by the corresponding (redundant) process term.

-S

ValWood commented 7 years ago

In principle it's a good idea. We are aiming not to use these terms in annotation if we don't need to, and we don't display them unless they add anything.

This is along the lines I was suggesting at the GO meeting (at least a more conservative aspect of this which might be a good starting point).

I would love this to happen.....

ValWood commented 7 years ago

I'm sure there are ways around all of these hurdles. I'll summarize what we have been doing to work towards not using these terms for direct annotation. Then we can see the annotation situations where we needed to use them. Not today....

ukemi commented 7 years ago

And I suspect that there are at least some annotations to phosphorylation that are based on mutant phenotypes and the gene product has not been shown to be a kinase. I think when we get rid of these terms, the most conservative route will be obsoletion rather than merging them into the corresponding function term.
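To make the obsolete-versus-merge distinction concrete, here is a minimal sketch of how annotations to a single-step BP term might be triaged. The `Annotation` record, the `triage` helper, the gene product names and the BP-to-MF pairing table are all hypothetical and only for illustration; matching is by exact term ID, with no ontology closure.

```python
from dataclasses import dataclass

# Illustrative BP -> MF pairing (an assumption, not an official mapping).
BP_TO_MF = {
    "GO:0016310": "GO:0016301",  # phosphorylation -> kinase activity
    "GO:0006468": "GO:0004672",  # protein phosphorylation -> protein kinase activity
}

@dataclass(frozen=True)
class Annotation:
    gene_product: str
    term: str
    evidence: str

def triage(annotations):
    """Split single-step BP annotations into redundant ones and ones needing review."""
    terms_by_gp = {}
    for a in annotations:
        terms_by_gp.setdefault(a.gene_product, set()).add(a.term)
    redundant, needs_review = [], []
    for a in annotations:
        mf = BP_TO_MF.get(a.term)
        if mf is None:
            continue  # not one of the single-step BP terms under discussion
        if mf in terms_by_gp[a.gene_product]:
            redundant.append(a)      # same gene product already has the MF term
        else:
            needs_review.append(a)   # e.g. phenotype-based, no kinase shown
    return redundant, needs_review

annotations = [
    Annotation("gpA", "GO:0004672", "IDA"),
    Annotation("gpA", "GO:0006468", "IDA"),
    Annotation("gpB", "GO:0006468", "IMP"),
]
redundant, needs_review = triage(annotations)
print([a.gene_product for a in redundant])
print([a.gene_product for a in needs_review])
```

Only the first group could be merged into the function term without changing what is asserted; the second group is the kind of annotation that a blanket merge would turn into an unsupported kinase claim.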

RLovering commented 7 years ago

Hi All

I have just annotated a protein (CAMKK2, Q96RR4) to positive regulation of protein phosphorylation with the extension: has_input Q13131, PRKAA1, based on PMID: 26103054. In the experiment (fig 7F) siRNA knockdown of CAMKK2 leads to a decrease in PRKAA1 phosphorylation. Both of these proteins are kinases but due to the nature of signalling pathways I do not make the annotation CAMKK2 protein kinase activity because this is not actually demonstrated. Plus I do not use 'has_direct_input' because this is not demonstrated either (though it is implied). Although the problem with this is that it is more likely that CAMKK2 is directly phosphorylating PRKAA1.

I think there are a lot of cases where kinases are manually annotated to regulation of phosphorylation not to kinase activity due to the lack of direct assays and the high number of assays like the one above.

The good thing was that in this situation I was able to annotate to GO:0061762 CAMKK-AMPK signaling cascade, which is definitely much more informative than regulation of phosphorylation. However, I expect that this will mean making a lot of signaling pathway terms like this. It would be good to suggest that curators revise the annotations to general terms such as GO:0035556 intracellular signal transduction, or other signaling pathway terms, if the specific terms do not exist. The parent terms that have been assigned to GO:0061762 CAMKK-AMPK signaling cascade suggest that it might not be straightforward to create all the new signaling terms. In addition, it will not be possible to include the target in the AE field, because a signal transduction GO term cannot be extended by adding the specific target.

Just looked at the human annotations:

843 kinases identified by InterPro as having a kinase function
571 proteins identified by InterPro as having a kinase function are annotated manually to kinase and phosphorylation
267 InterPro kinases NOT annotated manually to kinase
5 InterPro kinases not annotated manually to kinase but annotated to phosphorylation

so looks like it is not a major problem for human

Ruth

ukemi commented 7 years ago

Yup. See my comment above. I know I have done this, and I have also annotated to phosphorylation when more than one residue on a protein is phosphorylated and the multiple phosphorylation has a functional consequence. These will all need to go.

hattrill commented 7 years ago

I have a few questions relating to this proposal:

  1. Is there a complete list of single-step BP classes that we can view?
  2. What is the benefit of obsoletion over "not to be used in manual annotation"?
  3. Are these terms damaging? Are these useful to some users as high-level grouping terms? And what about the even-more vague parent terms of these terms? - where is the cut-off?

My main concern is the balance of cost/value in committing ourselves to retrofitting this data if we go down the obsoletion route rather than the "do not use" route. Even if we can computationally "guess" the intent of the annotation, there is potentially a lot of work for curators in revisiting old annotations.

RLovering commented 7 years ago

Is there a link to the proposal or minutes from the meeting where this was discussed?

rachhuntley commented 7 years ago

So it's not clear from this discussion whether regulation of single-step processes will also be removed. Can you clarify whether this is the case? Surely regulation of a function is not a single step process?

selewis commented 7 years ago

On Mon, Dec 12, 2016 at 3:29 AM, Helen Attrill notifications@github.com wrote:

I have a few questions relating to this proposal:

  1. Is there a complete list of single-step BP classes that we can view?
  2. What is the benefit of obsoletion over "not to be used in manual annotation"?

The fact that the linkages are built into the ontology is more of the issue. Even without making the annotation the inferences are there.

  3. Are these terms damaging? Are these useful to some users as high-level grouping terms? And what about the even-more vague parent terms of these terms? - where is the cut-off?

Yes, they are damaging in that they introduce redundancy (effectively whether you annotate to the function or to the process or to both these all are saying exactly the same thing)

My main concern is the balance of cost/value in committing ourselves to retrofitting this data if we go down the obsoletion route rather than the "do not use" route. Even if we can computationally "guess" the intent of the annotation, there is potentially a lot of work for curators in revisiting old annotations.
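The point about inferences being present even without a direct annotation can be illustrated with a minimal sketch. The edges in `PARENTS` are a toy fragment chosen for illustration (assumed, not copied from the ontology file), and `inferred_terms` is a hypothetical helper that walks is_a and part_of links upward.

```python
# Toy fragment: a function term linked by part_of to a single-step process.
PARENTS = {
    "protein kinase activity": [("part_of", "protein phosphorylation")],
    "protein phosphorylation": [("is_a", "phosphorylation")],
}

def inferred_terms(term, parents=PARENTS):
    """Return every term reachable from `term` over is_a/part_of links."""
    found, stack = set(), [term]
    while stack:
        for _relation, parent in parents.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

print(sorted(inferred_terms("protein kinase activity")))
```

Under these assumed links, a gene product annotated only to the function term is still counted under both process terms by any tool that propagates over part_of, so a "do not annotate" flag alone would not remove the redundancy.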

selewis commented 7 years ago

Regulation of a function (gp1 regulating gp2) is itself a single function.

selewis commented 7 years ago

Also, the single step could well be a step in a multi-step process and this is fine. In the ontology the 'part-of' links would remain. These aren't high level terms.

pgaudet commented 7 years ago

One problem I see with this is the 'directness' of the function versus process annotation. It used to be OK to annotate a kinase to protein kinase activity, which implies phosphorylation. It also used to be OK to annotate a non-kinase to phosphorylation, for example a (non-kinase) receptor. The latter example would be more correct if it were annotated to 'signaling' but we cannot make rules for all the one-step processes.

dosumis commented 7 years ago

It also used to be OK to annotate a non-kinase to phosphorylation, for example a (non-kinase) receptor.

Can you give an example? It's not clear to me why this would ever have been allowed, rather than using a regulation term.

ukemi commented 7 years ago

Mary has done a query, there are over 2000 bioentities annotated to phosphorylation or one of its children that are not annotated to kinase or one of its children. We should discuss this at the GOC meeting or on an annotation call. @mdolanme
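The exact query is not shown in this thread, but its shape can be sketched as a set difference over term closures. Everything below (the toy `children` map, the gene product names, and the helper functions) is hypothetical and only illustrates the logic "annotated to phosphorylation or a child, but not to kinase activity or a child".

```python
def descendants(term, children):
    """Return `term` plus all of its descendants in a child-adjacency map."""
    seen, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(children.get(t, ()))
    return seen

def annotated_to(terms, annotations):
    """Gene products with at least one annotation to any term in `terms`."""
    return {gp for gp, term in annotations if term in terms}

# Toy ontology fragment and annotations (assumed for illustration).
children = {
    "phosphorylation": ["protein phosphorylation"],
    "protein phosphorylation": ["peptidyl-tyrosine phosphorylation"],
    "kinase activity": ["protein kinase activity"],
}
annotations = [
    ("gp1", "protein kinase activity"),
    ("gp1", "protein phosphorylation"),
    ("gp2", "peptidyl-tyrosine phosphorylation"),
    ("gp3", "kinase activity"),
]

phos = annotated_to(descendants("phosphorylation", children), annotations)
kin = annotated_to(descendants("kinase activity", children), annotations)
print(sorted(phos - kin))
```

A real run would also need to filter by evidence code, which matters for the experimental versus non-experimental breakdown discussed in the following comments.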

ValWood commented 7 years ago

Hi,

Are these all experimental? Also, how many of them are MF "kinase regulator activity" (or are these excluded)?

Cheers,
Val

ukemi commented 7 years ago

These are all experimental. The next step will be for @mdolanme to see how many of the bioentities that are annotated to phosphorylation, and do not have experimental evidence annotating them to kinase activity, have IEA evidence supporting that they are kinases. We are proposing that this ticket be a topic of discussion at the GOC meeting.

mdolanme commented 7 years ago

In response to Val's last comment:

Of the 3932 gene products with experimental annotation to 'phosphorylation':

2380 do not have experimental annotation to 'kinase activity'. Of those 2380, 240 have non-experimental annotation to 'kinase activity' (and 2140 do not).

216 have experimental annotation to 'kinase regulator activity'; these include 174 of the 2140 without annotation to 'kinase activity'.

169 have non-experimental annotation to 'kinase regulator activity'; these include 71 additional gene products of the 2140 without any annotation to 'kinase activity'.

Summary: Of the 3932 gene products with experimental annotation to 'phosphorylation': 1895 gene products have no annotation to either 'kinase activity' or 'kinase regulator activity'.

Mary
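For anyone checking the numbers, the summary figure follows from the counts above by simple subtraction. The variable names in this short sketch are made up for readability; the values are the ones quoted in the comment.

```python
total = 3932                    # experimental annotation to 'phosphorylation'
no_exp_kinase = 2380            # no experimental 'kinase activity' annotation
non_exp_kinase = 240            # of those, non-experimental 'kinase activity'
exp_regulator_in_subset = 174   # experimental 'kinase regulator activity'
extra_non_exp_regulator = 71    # additional non-experimental regulator annotation

# Gene products with no 'kinase activity' annotation of any kind.
no_kinase_at_all = no_exp_kinase - non_exp_kinase

# Remove those covered by a 'kinase regulator activity' annotation.
neither = no_kinase_at_all - exp_regulator_in_subset - extra_non_exp_regulator

print(no_kinase_at_all, neither, round(100 * neither / total, 1))
```

This reproduces the 2140 and 1895 figures, and puts the "neither" group at a little under half of the 3932.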

ValWood commented 7 years ago

Hi Mary,

Note also that some signalling pathway terms have a "protein phosphorylation parentage"

For example "MAP kinase cascade" and its +ve/-ve regulation. If these terms disappeared, these annotations could be preserved with alternative parentage. So, for evaluation purposes, these annotations could also be excluded.

I have been trying to purge any direct annotations to fission yeast "phosphorylation" terms (and other single-step processes) for a while, and to represent them in a more LEGO-compliant way: "MF target x, involved in BP". I don't yet have any examples where this didn't work.

ValWood commented 7 years ago

also glycolytic process, see: https://github.com/geneontology/go-ontology/issues/12967 appears to be a legitimate example of a phosphorylation annotation which might not apply to a kinase or a kinase regulator.

and:

mitochondrial ATP synthesis coupled proton transport
mitochondrial electron transport, ubiquinol to cytochrome c

If the phosphorylation terms disappeared these annotations would still be supported by alternative parentage, so you could exclude.

ukemi commented 7 years ago

The plan on this ticket is to present the data analyses and rationale for keeping or obsoleting these terms at the GOC meeting in June. Please continue to add discussion points to this ticket and we will use it for a guide to presenting at the meeting.

vanaukenk commented 7 years ago

Hi,

We'd like to discuss this item on the upcoming annotation conference call: http://wiki.geneontology.org/index.php/Annotation_Conf._Call_2017-03-14

Is there a list somewhere of the proposed BP terms that would be obsoleted so curators can assess their annotations? @thomaspd @selewis @dosumis

Thx.

ValWood commented 7 years ago

It would be great to see a list of all of the terms, and the experimental annotation numbers.

I have managed to remove most of our direct annotations if they have a more specific MF annotation, but we still have some annotations where only a process annotation was appropriate. ...Can provide examples...

vanaukenk commented 7 years ago

Yes, @ValWood ; if you have examples where you guys thought that only a process annotation was appropriate, it'd be great to have a look at those.

hattrill commented 7 years ago

Hi Kimberley, this paper gave us some trouble: PMID:19088085 PP4 (Pp4-19C) and PP2A (mts) regulate Hedgehog signalling by controlling Smo and Ci phosphorylation. Basically, although it deals with two phosphatases and dephosphorylation (or the lack of it), the authors would only commit to saying that PP4 was acting as a phosphatase, and postulated that PP2A was perhaps acting to inhibit phosphorylation. In the end, as the assays themselves didn't seem especially different, we had to go with what the authors said: so PP4 was annotated with 'protein serine/threonine phosphatase activity' and PP2A with 'negative regulation of protein phosphorylation'. It wasn't very satisfactory, but it would have been wrong to point users to this as a source of evidence for phosphatase activity for PP2A. On a gene-by-gene basis, I can make sure that all protein kinases have 'protein kinase activity'. On a paper-by-paper basis, we come across issues where the assays are not direct enough to ascribe the MF, so a process term has to stand in. I can't always say via an IMP that a kinase is directly responsible for the phosphorylation of a protein, and it would not be correct to point the user to this paper as containing evidence for such in our gene pages.

I am curious about where we would draw the line: would terms like 'protein ubiquitination' come under this? In a way it is binary (protein not ubiquitinated to protein ubiquitinated), but it is multistep in that more than one enzyme is involved. Whereas protein deubiquitination is a one-step process for ubiquitinyl hydrolase activity?

ValWood commented 7 years ago

I can pull some out but I'm unlikely to be able to get to it before the call...

mah11 commented 7 years ago

I've found a few examples where we've used BP annotations for various reasons. I'm including read-only Canto links in case you want to see the details.

Three examples using GO:0001934 ! positive regulation of protein phosphorylation:

PMID:17189249 A tel2 (unknown MF) knockdown leads to decreased phosphorylation of Mrc1, but nothing in the paper indicates which kinase(s) do the job. For this paper we could probably get away with just the phenotype annotation if GO:0001934 went obsolete.

https://curation.pombase.org/pombe/curs/6d2d5b10bcbc0a55/ro/

PMID:21680738 In an ins1 null, Hmg1 isn't phosphorylated as it normally is, but the kinase is not shown; there's no evidence that Ins1 (or the human ortholog Insig) has kinase activity. Could this be captured in LEGO with "some kinase activity enabled by some (unidentified) gene product"? If so, how would it be translated into GAF or GPAD?

(There's evidence in PMID:19041767, which we have not yet curated, that Sty1 is involved but may act upstream rather than directly on Hmg1.)

https://curation.pombase.org/pombe/curs/6d2d5b10bcbc0a55/ro/

PMID:12604790 Dfp1-Hsk1 (ortholog of Dbf4p/Cdc7p) phosphorylates Mcm2. In the paper it's assayed in vitro with either Mcm2 alone or with the MCM hexamer. In the assay with Mcm2 alone, adding Cdc23 (Mcm10 ortholog) decreases Mcm2 phosphorylation. In this scenario I would feel pretty comfortable saying Cdc23 regulates Hsk1's kinase activity (although I didn't annotate it because I'm not sure it's physiologically relevant). With the whole MCM complex present, it seems less clear that the effect of additional proteins - Cdc23, Cdt1 or Cdc18 tested - is via regulating kinase activity, as opposed to perhaps regulating substrate accessibility, complex conformation, etc. Am I just being over-cautious here?

https://curation.pombase.org/pombe/curs/a67f3e355eafcf32/ro/

mah11 commented 7 years ago

A couple of 'phosphorylation' BP examples ...

We have a handful of annotations using phosphorylation of RNA polymerase II C-terminal domain (GO:0070816) or one of its even more specific descendants.

I think the example from PMID:10226032 could use the corresponding kinase activity term, GO:0008353, but for PMID:19328067 we would have to either keep the BP annotation, add new MF terms, or lose information. The latter paper uses BP terms to capture residue specificity (Ser2 vs Ser5 in the CTD repeat), because there's only the generic CTD kinase term in MF.

PMID:10226032 https://curation.pombase.org/pombe/curs/615084ae39833661/ro/
PMID:19328067 https://curation.pombase.org/pombe/curs/a7c6118865e72409/ro/

Similarly, in PMID:15226425 there are phosphorylation phenotypes that don't identify a kinase; also, there's no motif-specific MF term available, so annotation uses histone H2A SQE motif phosphorylation (GO:1990853).

https://curation.pombase.org/pombe/curs/d2288b20e67eb62d/ro/

cmungall commented 7 years ago

On 15 Mar 2017, at 5:05, Midori Harris wrote:

I've found a few examples where we've used BP annotations for various reasons. I'm including read-only Canto links in case you want to see the details.

Three examples using GO:0001934 ! positive regulation of protein phosphorylation:

PMID:17189249 A tel2 (unknown MF) knockdown leads to decreased phosphorylation of Mrc1, but nothing in the paper indicates which kinase(s) do the job. For this paper we could probably get away with just the phenotype annotation if GO:0001934 went obsolete.

https://curation.pombase.org/pombe/curs/6d2d5b10bcbc0a55/ro/

Why not switch to GO:0045859 ! regulation of protein kinase activity or a pos/neg child? Even if you don't know which kinase, would these not be equivalent?

pgaudet commented 7 years ago

These are again cases of 'upstream or involved in' phosphorylation.

I thought we had agreed that the annotation to the regulation of an activity implied a direct role.

One suggestion for those types of results would be to annotate to something more vague, like 'intracellular signaling'.

My two cents, Pascale

ValWood commented 7 years ago

You can use a BP "regulation of activity" for an indirect role.

MF " GO:0098772 molecular function regulator" and descendants are the ones which are restricted to a direct role.

Although the comment is only present on the enzyme regulator term http://www.ebi.ac.uk/QuickGO/GTerm?id=GO:0030234

GO:0030234 is reserved for cases when the regulator directly interacts with the enzyme. When regulation of enzyme activity is achieved without enzyme binding, or when the mechanism of regulation is unknown, instead annotate to 'regulation of catalytic activity ; GO:0050790'.

(....I know, it's very confusing! I think maybe this comment should be added to EVERY MF regulator term)

ValWood commented 7 years ago

Hmm, Chris's suggestion.

We are using the phosphorylated protein in the extension, not the kinase. Tel2 regulates the phosphorylation of Mrc1.

If we used "regulation of protein kinase activity" Mrc1 would be interpreted as the kinase, not the substrate....

RLovering commented 7 years ago

I agree with Val that 'you can use a BP "regulation of activity" for an indirect role'.

mah11 commented 7 years ago

PMID:17189249 A tel2 (unknown MF) knockdown leads to decreased phosphorylation of Mrc1 ..

https://curation.pombase.org/pombe/curs/6d2d5b10bcbc0a55/ro/

Why not switch to GO:0045859 ! regulation of protein kinase activity or a pos/neg child? Even if you don't know which kinase, would these not be equivalent?

Strictly speaking, I don't think the data in this case distinguish among regulating kinase activity, regulating phosphatase activity (in the opposite direction) or even something like altering substrate accessibility. We know only that one particular protein ends up phosphorylated to a lower extent in a mutant than in wild type.

(Caveat: I didn't do the original curation for this paper, and only skimmed through it quickly for my previous comments.)

mah11 commented 7 years ago

These are again cases of 'upstream or involved in' phosphorylation. I thought we had agreed that the annotation to the regulation of an activity implied a direct role.

Well, yes, but isn't that a separate issue from the question of whether a one-step BP term is the correct thing for the function to be upstream of or involved in?

To address the direct vs. indirect issue, PomBase curators would be inclined to remove 'indirect' BP annotations entirely, since we have the information captured in phenotype annotations. It does look like the tel2/mrc1 case might be a good one to handle this way, since the paper doesn't provide direct evidence of altered kinase (or phosphatase) activity.

RLovering commented 7 years ago

As a general principle I also agree with Val that not all proteins that are phosphorylated are kinases. Also, not all occurrences of kinase phosphorylation lead to activation of a kinase. So if the phosphorylation event was shown to increase the kinase activity of the phosphorylated protein, then you could use "regulation of protein kinase activity" and list Mrc1 in the AE field. However, many experiments transfect a kinase cDNA (or knock down a kinase) in a cell and then look at the phosphorylation state of the predicted target. The problem is that, because of signaling cascades, it is possible that the transfected/knocked-down kinase might not phosphorylate the phosphorylated protein directly. So regulation of phosphorylation is the conservative annotation to make. Just noticed Midori's comments, and obviously there is also the phosphatase aspect to consider too.

With human data these experiments imply a role of a kinase in specific signaling pathways, so I would not want to see these annotations removed. In theory we could also discuss what other supporting evidence would be required in order to create an annotation that considers other evidence and enables a curator to distinguish the 'direct' v 'upstream' role.

ValWood commented 7 years ago

We could lose that annotation. I guess we felt obliged to capture it in GO since the paper title is: Tel2 is required for activation of the Mrc1-mediated replication checkpoint. But it's probably enough for now that both are annotated to GO:0033314 - mitotic DNA replication checkpoint

ValWood commented 7 years ago

If we use qualifiers we need to be able to distinguish between "causally upstream and within a pathway/process" (as in the tel2 example, we would use).

and "causally upstream" in a different process ("indirect", in the way a biologist uses the word), we wouldn't use.

These seem to be lumped....

ValWood commented 7 years ago

I'm not sure we would make the "regulation of phosphorylation" annotation here using our stricter criteria. This list came from our remaining phosphorylation examples (we don't have many annotations left to these single-step process terms, because we don't make them if we have better info).

The other examples are probably more interesting.....

krchristie commented 7 years ago

As discussed at Ontology Editors call 4/7/17, we should also consider if removing terms like this from BP would have negative impact on enrichment tools that require selecting one aspect of GO at a time, such as these:

Enrichment at GOC site (either from start page or enrichment specific page) http://www.geneontology.org/ http://www.geneontology.org/page/go-enrichment-analysis

Generic GO Term Finder (at Princeton) http://go.princeton.edu/cgi-bin/GOTermFinder

SGD’s implementation of GO Term finder http://www.yeastgenome.org/cgi-bin/GO/goTermFinder.pl
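To illustrate the concern, here is a minimal sketch of a single-aspect enrichment test, written with only the standard library. It is not the code of any of the tools listed above; the gene names, the term-to-gene sets, and the `enrich` and `hypergeom_sf` helpers are all assumptions made for the example.

```python
from math import comb

def hypergeom_sf(k, M, n, N):
    """P(X >= k) when drawing N genes from M, of which n carry the term."""
    upper = min(n, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / comb(M, N)

def enrich(study, population, term_to_genes, term_aspect, aspect):
    """Test only the terms that belong to the selected aspect."""
    results = {}
    for term, genes in term_to_genes.items():
        if term_aspect[term] != aspect:
            continue
        n = len(genes & population)
        k = len(genes & study)
        results[term] = hypergeom_sf(k, len(population), n, len(study))
    return results

population = {f"g{i}" for i in range(1, 11)}
study = {"g1", "g2", "g3"}
term_to_genes = {
    "phosphorylation": {"g1", "g2", "g3", "g4"},
    "kinase activity": {"g1", "g2", "g3", "g4"},
}
term_aspect = {"phosphorylation": "BP", "kinase activity": "MF"}

with_bp_term = enrich(study, population, term_to_genes, term_aspect, "BP")

# Simulate removal of the single-step BP term.
del term_to_genes["phosphorylation"]
without_bp_term = enrich(study, population, term_to_genes, term_aspect, "BP")

print(with_bp_term)
print(without_bp_term)
```

In this toy case the same signal is reported under BP before removal and only under MF afterwards, so a user who selects the BP aspect alone would no longer see it.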

ValWood commented 2 years ago

This is just a long discussion, with no specific action items. As far as I can tell, implementation is ongoing...