Closed by pombase-admin 9 years ago
I don't think it's feasible to impose a requirement that any activity phenotype term must have an assayed_using extension. They're certainly a lot more informative with a substrate identified, but some kinase or similar phenotypes are assayed with exogenous substrates, bulk substrates, or other things for which we won't have a "DB:ID" identifier available to put in an assayed_using extension.
It might be possible for phenotypes that mention protein or RNA, such as level and localization phenotypes, but even for those I'm not sure we can always supply an identifier to go in an extension (e.g. localization of exogenous protein/construct; level of bulk poly(A)+ RNA; etc.).
Original comment by: mah11
I only mean the branches where we know an assayed_using can always be specified, i.e.
cellular protein localization (of what)
normal/abnormal protein binding (of what to what)
abnormal/reduced/increased/abolished protein modifier activity (protein kinase/methylase/acetylase etc.)
I think this would be a useful check...
Original comment by: ValWood
cellular protein localization (of what)
might be ok; anyway, we could see what comes up in logs
normal/abnormal protein binding (of what to what)
these are already checked for two assayed_using extensions
abnormal/reduced/increased/abolished protein modifier activity (protein kinase/methylase/acetylase etc.)
these are the ones that are sometimes assayed using substrates we wouldn't be able to identify by DB:ID -- I have curated some kinase phenotypes that used something exogenous (bovine casein in one case I seem to recall)
Original comment by: mah11
but in all of those cases the kinase being assayed is the one you are annotating (I think). If you don't know which kinase you are talking about, what's the point of the experiment ;)....I agree you do not always know the substrate (but that's GO/modification territory). Or am I confused?
We could try it and see....Can always go whoa, too many to fix. But better fixing early and preventing new ones...
Original comment by: ValWood
I've thought all along that it would be substrates you would put in an extension on a FYPO activity phenotype term (except in the impossible cases).
Original comment by: mah11
I've got a bad feeling about this, but for Kim to implement checks, these would be the ancestor terms:
should be mostly ok: FYPO:0002333 ! protein localization phenotype
riskier: FYPO:0000654 ! catalytic activity phenotype
There will be TONS of false positives but the only alternative to using FYPO:0000654 would be a laundry list of specific MF phenotype terms that you want to check for.
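A minimal sketch of what an ancestor-based check might look like, assuming a toy is_a map and annotation records held as plain dicts. Only the two ancestor IDs come from this ticket; the child term IDs (FYPO:9000001 and so on), the PARENTS map, the gene names and the function names are all hypothetical placeholders, and a real check would read the ontology and annotation data rather than hard-coding them.

```python
# Ancestor terms named in this ticket; annotations to their descendants
# would be expected to carry an assayed_using extension.
CHECKED_ANCESTORS = {"FYPO:0002333", "FYPO:0000654"}

# Hypothetical is_a links (child -> parents). Placeholder IDs, not real FYPO terms.
PARENTS = {
    "FYPO:9000001": ["FYPO:0002333"],
    "FYPO:9000002": ["FYPO:0000654"],
    "FYPO:9000003": ["FYPO:9000002"],
    "FYPO:9000004": [],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable from `term` by following is_a links."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def needs_assayed_using(term):
    """True if the term is, or descends from, one of the checked ancestors."""
    return term in CHECKED_ANCESTORS or bool(ancestors(term) & CHECKED_ANCESTORS)


def missing_extension(annotations):
    """Return annotations under a checked ancestor that have no assayed_using."""
    return [
        a for a in annotations
        if needs_assayed_using(a["term"]) and not a.get("assayed_using")
    ]


# Hypothetical annotation records for illustration only.
annotations = [
    {"gene": "geneA", "term": "FYPO:9000003", "assayed_using": []},
    {"gene": "geneB", "term": "FYPO:9000001", "assayed_using": ["geneC"]},
    {"gene": "geneD", "term": "FYPO:9000004", "assayed_using": []},
]
flagged = missing_extension(annotations)
```

Anything flagged this way would only be a candidate for review, since some activity phenotypes legitimately have no identifiable DB:ID to put in the extension.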
Original comment by: mah11
Are you sure?
If I was annotating a protein regulator gene A, I might use "reduced protein kinase activity" and I would put the kinase whose activity was reduced in assayed_using.
so cdc13 mutant might have "reduced protein kinase activity" assayed_using cdc2
I haven't put the kinase substrate here. I have put the kinase whose activity was assayed.
However, likely, if I made this phenotype annotation for cdc2, I would not have bothered with an assayed_using (because it is assayed_using itself), but I think we should be explicit, because potentially an upstream kinase can reduce the activity of a downstream kinase.
I would not put the substrates (phosphorylation targets) in here. I would put these on the GO activity OR GO phosphorylation process terms, if there was good data.
val
You must know which kinase, I put
Original comment by: ValWood
Not ALL catalytic activity phenotypes. Only the ones where it should always be possible to specify the substrate. I will start the list. We can do it in small steps.
Original comment by: ValWood
Oh dear. Then how do you specify the substrate? And especially, if you know both the assayed kinase and the substrate, how do you specify which is which?
I would definitely include the substrate if known and id'able, because a mutation could have different effects on phosphorylation of different substrates.
Original comment by: mah11
yeah, it's just that there's no common parent for "the ones where it should always be possible to specify the substrate"
Original comment by: mah11
or maybe better phrased as "if you see an assayed_using extension on a kinase phenotype term, how do you know whether it's the kinase or the substrate?" we haven't tried to distinguish, and we haven't codified an SOP. what a mess.
Original comment by: mah11
It should be easy to disambiguate these.
If the mutant being annotated is a non-kinase and the assayed_using is a kinase, it's the kinase.
If the mutant being annotated is a kinase and the assayed_using is a non-kinase, it's the substrate.
If the mutant being annotated is a kinase and the assayed_using is a kinase, we will need to check.
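The three rules above could be sketched roughly as follows. This assumes we have some way to tell whether a gene encodes a kinase; here that is a small hand-written KINASES set used purely for illustration, and the function name and return labels are hypothetical. The case where neither gene is a kinase is not covered by the rules, so the sketch just reports it as such.

```python
# Illustrative only: a real check would look this up from GO annotations
# rather than a hard-coded set.
KINASES = {"cdc2", "mik1"}


def classify_assayed_using(annotated_gene, assayed_gene, kinases=KINASES):
    """Guess what the assayed_using gene represents on a kinase activity phenotype."""
    annotated_is_kinase = annotated_gene in kinases
    assayed_is_kinase = assayed_gene in kinases
    if not annotated_is_kinase and assayed_is_kinase:
        return "kinase"
    if annotated_is_kinase and not assayed_is_kinase:
        return "substrate"
    if annotated_is_kinase and assayed_is_kinase:
        return "check manually"
    # Neither gene is a kinase: the rules in this comment do not say.
    return "not covered"


result_regulator = classify_assayed_using("cdc13", "cdc2")
result_substrate = classify_assayed_using("cdc2", "fkh2")
result_ambiguous = classify_assayed_using("mik1", "cdc2")
```

With the set above, the cdc13/cdc2 pair comes out as "kinase", cdc2/fkh2 as "substrate", and mik1/cdc2 as needing a manual check.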
But on the gene pages we write abolished protein kinase activity affecting cdc2. I interpret this as meaning the kinase activity of cdc2 is abolished. Not that the kinase activity of gene x phosphorylating cdc2 is abolished?
It seems that we need to do something similar to protein binding
i.e. abolished protein kinase activity affecting cdc2 phosphorylating gene x
although for the substrates I have used decreased protein phosphorylation affecting fkh2 (see cdc2 page, there are lots of examples here and they seem to be consistent)
Original comment by: ValWood
You can't always add an extension to an enzyme annotation. For example, imagine they mutate a transcription factor and then measure "acid phosphatase activity" in whole cell extract. In this mutant, acid phosphatase activity might be decreased, which would lead to the attachment of the annotation "decreased phosphatase activity" to the transcription factor, with nothing specified in the extension field.
When I do add an extension to an enzyme phenotype, it always specifies the enzyme tested, NOT the substrate. I specify substrates using phenotypes that describe the "result" of the enzyme's "job" (i.e. phosphorylation is the result of a kinase doing its job).
For example,
deletion of a kinase activator geneA abolishes kinase activity of geneB and leads to decreased phosphorylation of geneC
-> geneA-delta, abolished kinase activity, assayed_using(geneB) -> geneA-delta, decreased protein phosphorylation, assayed_using(geneC)
does this make sense to you?
Original comment by: Antonialock
yes, it makes sense, and I have done 'altered protein phosphorylation'-type annotations ... I just can't be sure I've never put a substrate in an extension on an activity phenotype term
but maybe there are few enough exceptions that we shouldn't worry about them
Original comment by: mah11
Also, you are right that you lose a bit of specificity from in vitro experiments where, for example, they show that kinaseA phosphorylates substrateB on serines but mutant kinaseAA does not. In this case I would be wary of annotating kinaseAA to "abolished serine phosphorylation of substrateB" because the phosphorylation term has the "during veg growth" attachment, and substrateB might still be phosphorylated on serines by kinases other than A outside of the test tube.
I'm not sure we should worry too much about these cases though because we need a system that is easily interpretable for users, and not excessively difficult for us to use.
Original comment by: Antonialock
Thinking about it, I might have put both the enzyme and the substrate in the extension before I realized it is ambiguous.
One check to do might be to check for extensions with >1 gene?
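A rough sketch of such a check, assuming annotations are available as dicts with a list of genes under assayed_using. The record layout, gene names and function name are hypothetical, and protein binding phenotypes are skipped by a simple name match because those are expected to carry two extensions.

```python
def multi_gene_extensions(annotations):
    """Return non-binding annotations whose assayed_using lists more than one distinct gene."""
    return [
        a for a in annotations
        if "binding" not in a["term"]
        and len(set(a.get("assayed_using", []))) > 1
    ]


# Hypothetical records for illustration only.
annotations = [
    {"gene": "geneA", "term": "abolished protein kinase activity",
     "assayed_using": ["geneB", "geneC"]},
    {"gene": "geneD", "term": "abolished protein kinase activity",
     "assayed_using": ["geneB"]},
    {"gene": "geneE", "term": "abnormal protein binding",
     "assayed_using": ["geneF", "geneG"]},
]
ambiguous = multi_gene_extensions(annotations)
```

Here only the geneA record would be flagged, as a candidate where enzyme and substrate may both have been put in the extension.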
Original comment by: Antonialock
I'm closing this ticket. Will open a new one for the original request but read through this to check that I haven't missed anything.
Midori will propose relationships to differentiate between the assayed_enzyme and assayed_substrate, then we will document examples and let Kim/Mark know about the changes
Val
Original comment by: ValWood
I spotted that mik1 has abolished protein kinase activity without an assayed_using.
Original comment by: ValWood
requires resubmission
Original comment by: ValWood
...but i don't think we would put "protein kinase" in this file as we would not always have an assayed_using for these annotations.
We can test it with a small number of terms. If it turns out to be impossible/useless so be it...
Original comment by: ValWood
...but i don't think we would put "protein kinase" in this file as we would not always have an assayed_using for these annotations.
which do you mean we wouldn't have? I think we won't always have a substrate with an ID, but if we don't know which kinase activity is changed, would we use a protein kinase phenotype term in the first place?
Original comment by: mah11
Under the new system it will. I think as you say above there might be lots of legacy ones which do not have any...
I'll start with the ones where I'm more confident that an extension would always be relevant, and take it from there (or abandon if it turns out to be totally useless....)
Original comment by: ValWood
I spotted that mik1 has abolished protein kinase activity without an assayed_using.
I suspect when we do this the target is always the gene we are annotating, but we should be explicit.
Original comment by: ValWood