geneontology / annotation_extensions

Documentation, tickets & usage reports for annotation extension relations.

Can we use "directly inhibits" and "directly activates" in extensions? #70

Closed dosumis closed 7 years ago

dosumis commented 8 years ago

From @ValWood on September 13, 2016 15:22

Can we use "directly inhibits" and "directly activates" in extensions to describe the relationship between molecular functions? I know it has been discussed and we can use them in LEGO. I'm not sure if they have been "ratified" for use as annotation extensions in GAFs...

Thanks

Val

Copied from original issue: geneontology/go-ontology#12657

dosumis commented 8 years ago

From @paolaroncaglia on September 14, 2016 7:29

@cmungall , @dosumis Could you comment on Val's question please? Thanks.

dosumis commented 8 years ago

Hi Val,

This is documented in the annotation extensions repo doc folder - auto-generated from gorel.owl

regulates and directly_regulates, plus the +ve and -ve versions of each, are currently allowed, but directly_activates and directly_inhibits are not (yet).

I can easily add them, but would like to improve documentation a bit too.

The definitions for these in RO suck - e.g. see directly activates

For now, I can add a GO usage statement (in gorel.owl). Does this sound reasonable:

directly_activates: "Use this to relate two molecular functions, A & B, where function A positively regulates function B via direct binding or modification of the gene product or complex carrying out function B."

Val: does this fit your intention? @cmungall, @ukemi - are you happy with this? I think it is OK to be narrower than the RO def as it is meant as guidance for GO and so doesn't cover all possible uses. If you're all happy, I'll go ahead and add it. (BTW - does CANTO consume gorel.owl for relations? If not it probably should).

Chris: It would be nice to tighten up the RO def too. I've been sketching common logic defs but struggling.

dosumis commented 8 years ago

From @mah11 on September 14, 2016 8:24

just noticed that this should go on the annotation extensions tracker - https://github.com/geneontology/annotation_extensions/issues

dosumis commented 8 years ago

just noticed that this should go on the annotation extensions tracker -

moved

CC @mah11 @ValWood

ValWood commented 8 years ago

This would work for me.

Basically I need it so that I can curate my networks in Canto and they are LEGO ready...

dosumis commented 8 years ago

This would work for me.

Excellent. I'll wait for Chris & David's input before implementing. I think the tightened definitions are badly needed for LEGO too (see confusion over use in regulation of receptor activity by regulating ligand levels).

ValWood commented 8 years ago

(BTW - does CANTO consume gorel.owl for relations? If not it probably should).

I think so but I'm not sure... @kimrutherford

kimrutherford commented 8 years ago

(BTW - does CANTO consume gorel.owl for relations? If not it probably should)

We don't use gorel in Canto at the moment.

dosumis commented 8 years ago

That makes 3 mechanisms for controlling relation usage in annotation: protein2GO (via specification in gorel + P2G/QuickGO/GOA code); Canto (via some internal mechanism); LEGO (via some internal mechanism). Unifying these would be ideal, but probably hard. In the absence of that - we should at least all be using the same definitions & IDs (shorthand and RO versions). We've talked about moving some gorel content (usage statements) into RO. For CANTO we probably need to move shorthand IDs too, but this scares me.

paolaroncaglia commented 8 years ago

Hi @dosumis , Thanks for moving this request here. As you've already commented and offered to take action, I think this ticket should be assigned to you as a statement and record of your good work here. But feel free to re-assign. Cheers!

ValWood commented 8 years ago

Hi David,

What are "shorthand IDs"?

What do you mean by "controlling relation usage"? (We use the domain and range specified by GO, but we also have further controls which only allow the relations to be used with terms where they make biological sense. Chris is aware of these if anyone wants to use them.)

Val

dosumis commented 8 years ago

Shorthand IDs are the way that curators refer to relations in annotation extensions (see example below). Once upon a time these were the only IDs for relations in GO, but they've had RO IDs and separate labels for many years. The old ways just got fossilised in annotation extensions.

Here's an example:

From causally_upstream_of doc

The basic issue is: where do the usage statements live so that everyone (users of Noctua, P2G or Canto) can see them? If Canto needs to look up annotations via shorthand, then you may need to use gorel.(obo/owl). This file also gives access to GO/AE specific relations not (yet?) in RO.

Probably best to leave discussion of controlling relation usage (allowing the relations to be used with terms where they make biological sense) to another ticket.

ukemi commented 8 years ago

This seems ok, but you might want to run it by Paul too.

dosumis commented 8 years ago

@thomaspd, how's this for a usage guidance statement for directly_activates:

directly_activates: "Use this to relate two molecular functions, A & B, where function A positively regulates function B via direct binding or modification of the gene product or complex carrying out function B."

(directly inhibits would be the same but with negative in place of positive regulation)

?

cmungall commented 8 years ago

+1

Agree we should get usage notes etc into RO centrally to simplify things

ValWood commented 8 years ago

Great. I need them because I'm hoping that, when I add the directly activates and directly inhibits relations, this model can be auto-generated in LEGO. (The unannotated version is the current network from the GO annotation. It's fine in that you know the directionality, but you have no idea whether the MF is activatory or inhibitory with the current annotation.)

[Image: g2_m]

ValWood commented 8 years ago

Key:
blue circle = kinase
blue circle / red bar = inhibitory kinase
blue circle / green arrow = activatory kinase
yellow circle = phosphatase
yellow circle / green arrow = activatory phosphatase
+ve or -ve on 'edge' = effect on G2/M transition
solid circles around gene products = 'essential'
broken circles around gene products = 'non-essential'

I'm trying to fill the gaps in this model.......

dosumis commented 8 years ago

@thomaspd wrote: "My one reservation is that I’m not yet 100% convinced that we don’t want to use directly_activates when a gene product either increases or decreases the concentration of a small molecule that acts as a regulator of a downstream gene product activity. Examples include the transporter activity that regulates acetylcholine levels in the annotation exercise. This is direct in the sense that there are no intervening gene product activities. "

dosumis commented 8 years ago

If a transporter secretes vasopressin from the pituitary gland, does that transporter directly activate V2 receptor activity in the kidney? What about a factor that regulates vasopressin levels in the blood by degrading it or transporting it into some other cell type. Does that directly inhibit V2 receptor activity in the kidney?

The only difference between this and the acetylcholine receptor activity example is scale. In both cases the biology only makes sense with reference to the concentration of some small molecule in a compartment (synaptic cleft, blood). I'd much rather LEGO models specified the small molecule and the compartment. Not doing this will preclude important types of query/grouping: e.g. it is very useful to be able to find all factors regulating vasopressin levels in the bloodstream, and to be able to separate these from non-ligand regulators of vasopressin receptor activity. We can always have a more general regulates relation that bridges.

dosumis commented 8 years ago

Another example: the action of some gene product directly changes blood pressure, e.g. by reducing blood volume via transporter function in kidney. Does this gene product directly activate mechanoreceptors in baroreceptor neurons in blood vessels?

ukemi commented 8 years ago

We'd also have to be very careful about how we defined terms like 'receptor activator activity'. Clearly the transporter is not one even if in this model we say it directly activates the receptor. I don't think any biologist would consider the transporter an activator.

dosumis commented 8 years ago

We'd also have to be very careful about how we defined terms like 'receptor activator activity'. Clearly the transporter is not one even if in this model we say it directly activates the receptor. I don't think any biologist would consider the transporter an activator.

Yep. These relations are used for logical defs of activator activity terms in the ontology. So if you say it directly activates the receptor in the model, you will get an inferred annotation to 'receptor activator activity'.
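The inference described here can be illustrated with a toy sketch. It assumes a purely string-based naming rule ("X activity" gives "X activator activity"); the function name and the rule are hypothetical stand-ins for the real logical definitions in the ontology, which are evaluated by a reasoner rather than by string matching.

```python
from typing import Optional

SUFFIX = " activity"


def inferred_activity(relation: str, target_function: str) -> Optional[str]:
    """Return the activator term a directly_activates edge would imply.

    Hypothetical naming rule for illustration only: strip the trailing
    ' activity' from the target function and append ' activator activity'.
    """
    if relation != "directly_activates" or not target_function.endswith(SUFFIX):
        return None
    stem = target_function[: -len(SUFFIX)]
    return f"{stem} activator activity"


# A ligand modelled as directly activating a receptor gets the expected term.
print(inferred_activity("directly_activates", "receptor activity"))

# A transporter modelled with the same relation would get the same term,
# which is the unwanted consequence discussed above.
print(inferred_activity("directly_activates", "receptor activity"))

# A plain 'regulates' edge implies no activator term.
print(inferred_activity("regulates", "receptor activity"))
```

The point of the sketch is only that the relation chosen in the model determines the inferred annotation, so using directly_activates for the transporter case would label the transporter as an activator.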

dosumis commented 8 years ago

Another example: the activity of a ligand-gated ion channel changes the potential difference across a membrane. That change in potential difference activates a voltage-gated ion channel. Does the first activity directly regulate the second?

dosumis commented 8 years ago

One other issue we haven't covered here: I think we need to more clearly distinguish directly_positively_regulates from directly_activates. I think biologists would reserve 'activates' for cases like a ligand binding to a receptor and activating (initiating) its effector activity. They wouldn't use it for some process which modifies (e.g. phosphorylates) a gene product (e.g. a receptor) and so increases its activity if and when it is activated (e.g. by a ligand). Perhaps activates should be reserved for the former.

ValWood commented 8 years ago

I don't really understand this thread.

As I understood it, directly_activates and directly_inhibits are already the relationships used in LEGO models to describe the type of regulation illustrated in my network diagram above? I just want to be able to use these in 'normal' annotation.

Isn't this the case? So are you considering not using these in LEGO any longer?

Could we discuss this on the next call?

ValWood commented 8 years ago

So if you say it directly activates the receptor in the model, you will get an inferred annotation to 'receptor activator activity'.

I would expect that.

If I annotated a protein phosphatase as a "direct activator" of a protein kinase (cdc25 activating cdc2 above), it would by inference also be annotated to "protein kinase activator".

Is this different? Do you think that it shouldn't be?

ValWood commented 8 years ago

Although this is something I have been wondering about......

Cdc25 dephosphorylates cdc2 (cdk1), resulting in increased Cdk1 activity. I think here the mechanism is direct: the activity of cdk1 increases when it is dephosphorylated.

Chk1 phosphorylates cdc25 resulting in decreased activity. However this does not (as far as I can tell) appear to be due to any effect on the activity of cdc25. The phosphorylation results in the 14-3-3 phospho-protein binding proteins (rad24 &25 above) being able to bind cdc25 and sequester it in the cytoplasm away from Cdc2.... So here, the phosphorylation isn't "directly inhibiting" cdc25.....is this what you mean?

ValWood commented 8 years ago

If it isn't clear what we want to do, here is an example:

We don't want to say that the kinase cdc2 "has substrate" dis2 "involved in" "negative regulation of protein phosphatase activity"

Instead we want to say cdc2 directly_inhibits dis2 "involved in" "mitotic spindle checkpoint silencing" (the actual process regulated rather than the activity regulated, which isn't as meaningful, because you don't know if the phosphatase is regulating the process negatively or positively)

ValWood commented 8 years ago

So, in case it's not clear, I'm trying to distinguish between

  1. phosphorylation events which activate a phosphatase or a kinase
  2. phosphorylation events that inhibit a phosphatase or a kinase
  3. dephosphorylation events which activate a phosphatase or a kinase
  4. dephosphorylation events that inhibit a phosphatase or a kinase

without needing to do

cdc2 has_substrate dis2 "involved in" "negative regulation of protein phosphatase activity", "involved in" "mitotic spindle checkpoint silencing"

It could be a different relation, but I can't think what...

dosumis commented 8 years ago

  1. phosphorylation events which activate a phosphatase or a kinase
  2. phosphorylation events that inhibit a phosphatase or a kinase
  3. dephosphorylation events which activate a phosphatase or a kinase
  4. dephosphorylation events that inhibit a phosphatase or a kinase

Yep - I think direct activation and inhibition are correct here. I'll add them to legal relations for extensions.

I've been working on definitions for these - they're pretty useless right now. Hope to update later this week after further discussion.
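The four cases listed above reduce to two relations, because the choice depends only on the effect on the target and not on whether the modification is a phosphorylation or a dephosphorylation. A minimal sketch of that mapping follows; the function and constant names are hypothetical and exist only for illustration.

```python
from itertools import product

EVENTS = ("phosphorylation", "dephosphorylation")

# The relation depends on the effect only, not on the type of modification.
RELATION_FOR_EFFECT = {
    "activates": "directly_activates",
    "inhibits": "directly_inhibits",
}


def relation_for(event: str, effect: str) -> str:
    """Pick the extension relation for a direct modification event."""
    if event not in EVENTS:
        raise ValueError(f"unknown event type: {event}")
    if effect not in RELATION_FOR_EFFECT:
        raise ValueError(f"effect must be known from the experiment, got: {effect}")
    return RELATION_FOR_EFFECT[effect]


# Enumerate the four cases from the list above.
for event, effect in product(EVENTS, RELATION_FOR_EFFECT):
    print(f"{event} that {effect} the target -> {relation_for(event, effect)}")
```

With the examples from this thread, cdc25 dephosphorylating cdc2 would come out as directly_activates, and cdc2 phosphorylating dis2 as directly_inhibits.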

ValWood commented 7 years ago

PomBase have already started to use "directly_inhibits" and "directly_activates". We can map these up to "has_substrate", but can these be approved for traditional GO annotation? (I thought they could be used in Noctua, but maybe not? Are the lists of available annotation extension relations for traditional GO and Noctua completely aligned?)

This is how we use them:

[Image]

ValWood commented 7 years ago

has_substrate clp1 involved in negative regulation of serine/threonine phosphatase activity,involved in negative regulation of mitosis (compound process)

would instead be directly_inhibits clp1 involved in negative regulation of mitosis (much clearer for users)

This is because "processes" like "negative regulation of serine/threonine phosphatase activity" are really about the effect on a gene product's activity, not the effect on the process being regulated.

You would only use this relation if you absolutely knew, from the experiment, whether the phosphorylation was activatory or inhibitory.
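The rewrite described above can be sketched as a small transformation over the parts of an extension. This is an illustration under stated assumptions: an extension is modelled as a list of (relation, target) pairs using names rather than IDs, and the "negative regulation of ... activity" test is a simple string heuristic. None of the function names come from Canto or any GO tool.

```python
from typing import List, Tuple

Part = Tuple[str, str]  # (relation, target), names used instead of IDs


def is_negative_activity_regulation(term: str) -> bool:
    """Heuristic: a process term that only restates an effect on an activity."""
    return term.startswith("negative regulation of ") and term.endswith(" activity")


def to_direct_form(parts: List[Part]) -> List[Part]:
    """Replace has_substrate + 'negative regulation of X activity' with directly_inhibits."""
    if not any(rel == "involved_in" and is_negative_activity_regulation(t) for rel, t in parts):
        return list(parts)  # nothing to collapse
    rewritten = []
    for rel, target in parts:
        if rel == "has_substrate":
            rewritten.append(("directly_inhibits", target))
        elif rel == "involved_in" and is_negative_activity_regulation(target):
            continue  # now expressed by the relation itself
        else:
            rewritten.append((rel, target))
    return rewritten


def render(parts: List[Part]) -> str:
    return ",".join(f"{rel}({target})" for rel, target in parts)


old = [
    ("has_substrate", "clp1"),
    ("involved_in", "negative regulation of serine/threonine phosphatase activity"),
    ("involved_in", "negative regulation of mitosis"),
]
print(render(to_direct_form(old)))
# directly_inhibits(clp1),involved_in(negative regulation of mitosis)
```

As noted above, such a rewrite is only valid when the experiment shows that the modification is inhibitory; the sketch does not check the evidence.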

ValWood commented 7 years ago

Above @thomaspd wrote: "My one reservation is that I’m not yet 100% convinced that we don’t want to use directly_activates when a gene product either increases or decreases the concentration of a small molecule that acts as a regulator of a downstream gene product activity. Examples include the transporter activity that regulates acetylcholine levels in the annotation exercise. This is direct in the sense that there are no intervening gene product activities. "

Is this still an issue? In our use case we would only use "directly_inhibits" as a relationship to a gene product with a specific activity, in the sense that there are no intervening gene products.

We would use a different relation if we wanted to capture that the activity being annotated acted on another gene product and the effect was indirect, or we did not know. I'm not sure yet which relation is appropriate for this (acts upstream of?), because for now we have only represented these causal relations on the process terms, like so:

[Image: dis2]

Ideally, I'd like to connect everything possible to MF terms rather than process terms (to make it LEGO compliant).

dosumis commented 7 years ago

Here're all the terms we have right now:

[Image]

Following discussion in LA: given the amount of confusion and discussion there was around distinction between directly_activates and directly_positively_regulates, I think we should collapse the distinction. As there are some circumstances where activation/inhibition just sounds wrong, I suggest we stick with the more neutral directly_positively_regulates/directly_negatively_regulates in all situations in GO (ontology, LEGO, AE). We can make directly_activates/inhibits into narrow synonyms.

These should already be legal (although I've just noticed some weirdness in protein2GO so will check again now.)

@ValWood - would this work for you? I would like to get final agreement from Eds + @vanaukenk first. I'll put it on the editor meeting agenda and try to get agreement ASAP.
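One way to picture the proposal above (keeping the neutral relations as primary and treating directly_activates/inhibits as narrow synonyms) is a lookup that normalises whatever name a curator types to the primary relation. This is a minimal sketch; the table and function name are assumptions for illustration and do not reflect how gorel.owl or any annotation tool stores synonyms.

```python
# Narrow synonym -> primary relation, following the proposal above.
NARROW_SYNONYMS = {
    "directly_activates": "directly_positively_regulates",
    "directly_inhibits": "directly_negatively_regulates",
}


def primary_relation(name: str) -> str:
    """Normalise a label or shorthand to the primary relation name."""
    key = name.strip().lower().replace(" ", "_")
    return NARROW_SYNONYMS.get(key, key)


print(primary_relation("directly activates"))
print(primary_relation("directly_inhibits"))
print(primary_relation("regulates"))
```

A tool could still display the narrow synonym to users while storing the primary relation.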

dosumis commented 7 years ago

@thomaspd wrote: "My one reservation is that I’m not yet 100% convinced that we don’t want to use directly_activates when a gene product either increases or decreases the concentration of a small molecule that acts as a regulator of a downstream gene product activity."

I'm strongly against this usage - see examples upthread. I think regulates is fine - or something that fills in the gap of what small molecule or change in quality is doing the regulating.

ValWood commented 7 years ago

Yes ;)

mah11 commented 7 years ago

given the amount of confusion and discussion there was around distinction between directly_activates and directly_positively_regulates, I think we should collapse the distinction.

That sounds fine to me. For our purposes, 'direct' is the important part, and we have far less use for any distinction between activates and any other sense of positively_regulates (even if a distinction could be clarified and documented).

"My one reservation is that I’m not yet 100% convinced that we don’t want to use directly_activates when a gene product either increases or decreases the concentration of a small molecule that acts as a regulator of a downstream gene product activity." - @thomaspd

I'm strongly against this usage - see examples upthread. I think regulates is fine - or something that fills in the gap of what small molecule or change in quality is doing the regulating. - @dosumis

I agree wholeheartedly with @dosumis (and with a slew of other annotators, iirc) - I am 100% convinced yada yada. Altering the amount of available small molecules is regulation, but it is not direct regulation of another gene product's activity.

I would reserve the directly_regulates trio for activities that alter the catalytic properties of the gene product that executes the downstream activity.

ValWood commented 7 years ago

given the amount of confusion and discussion there was around distinction between directly_activates and directly_positively_regulates, I think we should collapse the distinction.

I agree too. I think that was the decision at the GOC meeting. I did have a classical example: the cdc2-cyclin complex is active (at low levels) during S-phase. If we had both options, directly_activates and directly_inhibits would really be 'positively regulates' in this case, because they are really only 'directly positively regulating' an already active kinase. However, nobody who works on these would be confused by us calling this "directly activates" or "directly inhibits" in this scenario, so it is probably safe to lump. At this level we are really getting into enabling flux modelling and recording activity levels... that can probably wait for a different decade...

I'm more immediately interested in being able to automatically include this "arrowhead notation" in our automatically generated network diagrams:

[Image: automate]

dosumis commented 7 years ago

Apologies for the delay on this. Fixed some time ago, but apparently still held up in Protein2GO by pipeline issues. Please re-open ticket if you try to use these relations and hit problems.

ValWood commented 7 years ago

No hurry. Using them, but we won't be doing a submission until our next update, which might be a while. I'm not sure what internal checks we have on this. @kimrutherford, note if there are any problems....

kimrutherford commented 7 years ago

I'm not sure what internal checks we have on this. @kimrutherford, note if there are any problems....

Should be fine. We will need to add the relations to the Canto extensions configuration. And as a double check when loading Chado, we have a configuration file with a list of which extension relations are allowed for each ontology.
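The double check described above (a list of allowed extension relations per ontology, consulted at load time) might look roughly like the following sketch. The structure, the relation lists and the function name are all assumptions made for illustration; they are not the actual Canto or Chado loader configuration.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical configuration: which extension relations each ontology allows.
ALLOWED_EXTENSION_RELATIONS: Dict[str, Set[str]] = {
    "molecular_function": {
        "has_substrate",
        "involved_in",
        "directly_positively_regulates",
        "directly_negatively_regulates",
    },
    "biological_process": {
        "has_input",
        "happens_during",
    },
}


def disallowed_relations(ontology: str, relations: Iterable[str]) -> List[str]:
    """Return the relations that are not configured for this ontology."""
    allowed = ALLOWED_EXTENSION_RELATIONS.get(ontology, set())
    return sorted(set(relations) - allowed)


# An annotation whose extension uses one configured and one unknown relation.
problems = disallowed_relations(
    "molecular_function",
    ["directly_negatively_regulates", "made_up_relation"],
)
print(problems)
```

Under this sketch, adding the new relations would be a one-line change to the configuration for the relevant ontology, and anything not listed would be reported at load time.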

ValWood commented 7 years ago

Can "directly activates" and "directly inhibits" be the primary names? They will be much neater in this scenario:

[Image: swap]

we can switch the display labels but .....

dosumis commented 7 years ago

"directly activates" and "directly inhibits" be the primary names.

I'd rather not. Agree they look clearer here, but there are cases of more subtle regulation where using activates/inhibits is confusing.

ValWood commented 7 years ago

ok, we'll change display label...