Define patterns for defining receptor activity as a compound function

dosumis commented 8 years ago

We need to settle on a pattern for defining receptor activities. This may provide a good starting point for patterns for other compound functions. This ticket discusses possible patterns.

Like all MFs, compound MFs have a single agent: a gene product or a complex consisting of multiple gene products. This is both an opportunity and a challenge. Because the agents in question are a target for annotation we can safely infer annotations over has_part for compound MFs *. But it also means that truly safe patterns for defining receptor activities need to be careful about co-reference issues. In the case of receptors, the binding of a ligand activates some other component of the same MF (which the same GP enables). A truly safe pattern would prevent inferred classification when the regulated function is not part of the same compound function. We have limited tricks for achieving this in OWL, but this seems to work for specific cases:

glutamate-gated calcium channel activity EquivalentTo: molecular_function that has_part some ( (binding that has_input some glutamate) and ( directly_activates some ( 'calcium channel activity' that part_of some self)))

Test implementation

The major issue with this is that ELK does not currently support self restrictions, even though they are within the highly scalable EL profile of OWL. Another drawback is that, even with an inverse object property axiom linking part_of and has_part plus DL reasoning (via HermiT), this definition is not sufficient to infer a has_part relationship between the receptor activity and its effector function. So, to fulfil its role in inference, the definition needs to be extended:

molecular_function that has_part some (
  (binding that has_input some glutamate) and (
    directly_activates some (
      'calcium channel activity' that part_of some self)))
and (has_part some 'calcium channel activity')

(The final has_part clause could also be added as a hidden GCI.)

With this in place, it could certainly be easily handled by a pattern-based system.
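As a rough illustration of what a pattern-based system would do here, the sketch below fills the extended definition from a ligand and an effector label. The function name `ligand_gated_activity` and the quoting helper are hypothetical, and the output reproduces the notation used in this ticket (including `part_of some self`) rather than strictly valid Manchester syntax.

```python
def ligand_gated_activity(ligand: str, effector: str) -> str:
    """Build the extended class expression for a ligand-gated effector activity.

    Hypothetical helper: it only does string substitution into the pattern
    proposed in this ticket; it does not validate labels against an ontology.
    """
    def quote(label: str) -> str:
        # Multi-word labels are wrapped in single quotes, as in the ticket.
        return f"'{label}'" if " " in label else label

    return (
        "molecular_function that has_part some ("
        f"(binding that has_input some {quote(ligand)}) and ("
        f"directly_activates some ({quote(effector)} that part_of some self)))"
        # Extra clause needed so that has_part to the effector is asserted.
        f" and (has_part some {quote(effector)})"
    )


expr = ligand_gated_activity("glutamate", "calcium channel activity")
print(expr)
```

Only the two labels vary between instances, which is what makes the definition a good candidate for templating.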

The general case is more problematic.

receptor activity EquivalentTo: molecular_function that has_part some ( (binding that directly_activates some ( molecular_function that part_of some self)))

There are cases of direct regulation following this pattern where the compound function is not considered a receptor activity. This is not simply down to the chemical nature of what binds: calcium is considered a ligand in some cases following this pattern (calcium-gated calcium channel activity) but not in others, such as the many enzymes activated by calcium. This could probably be most easily fixed with a ligand role.

A much simpler approach would be to define a has_ligand relation. This embeds the role in the relation.

has_ligand
  domain: 'receptor activity'
  range: 'chemical entity'
  Expands to: has_part some (binding that has_input some (?y that directly_activates some (molecular_function that part_of some self)))

glutamate-gated calcium channel activity EquivalentTo: 'receptor activity' that (has_ligand some glutamate) and (has_part some 'calcium channel activity')

This has the advantage of being within EL and within OBO.

The disadvantages of this are that: (a) without the expansion, the has_part X binding is not captured (a GCI approach for capturing this would be more compact than expansion here). (b) It doesn't capture the regulatory relationship between binding and activation. This might be useful in grouping cases where a mechanism is shared between cases where the binding activator is considered a ligand and cases where it is not.

* (It also opens the possibility of defining using a LEGO-based exemplar, but I'll ignore this here)

dosumis commented 8 years ago

@cmungall @thomaspd - comments please.

cmungall commented 8 years ago

On 4 Jan 2016, at 6:48, David Osumi-Sutherland wrote:

glutamate-gated calcium channel activity EquivalentTo: molecular_function that has_part some ( (binding that has_input some glutamate) and ( directly_activates some ( 'calcium channel activity' that part_of some self)))

quick response for now: not sure I understand the intent of the self. The self here is the calcium channel activity (and not the GGCA). Indeed the clause is superfluous if part-of is declared (globally or locally) reflexive.

(it's possible I need to go back and look at the OWL spec again...)

cmungall commented 8 years ago

On 4 Jan 2016, at 6:48, David Osumi-Sutherland wrote:

glutamate-gated calcium channel activity EquivalentTo: molecular_function that has_part some ( (binding that has_input some glutamate) and ( directly_activates some ( 'calcium channel activity' that part_of some self)))

Don't think you'll like this approach:

what if we defined these by a prototype model. This gets around any co-reference issue. Of course it brings in other semantic issues, that without a theory of prototypes, the strict instance-level representation is much weaker inferentially.

For practical purposes, we could derive class axioms from the instance graph. For any point on a triangle such as this one, there are two class-level axioms that could be derived, depending on direction followed. The two could be combined; the results would still be weaker and suffer the co-reference issue.

dosumis commented 8 years ago

Reading my pattern suggestion above, I'm happy to ditch as too complicated (& you're probably right about self restriction).

I'm convinced now that we should at least explore the prototype approach - building prototypes in LEGO - with the aim that they can be used as templates. I can't see how else we're going to keep LEGO annotation sufficiently consistent with the ontology and with itself.

We still need some formalization sufficient for auto-classification where we can't rely on some external source (e.g. EC).

I'm curious how this could work:

For practical purposes, we could derive class axioms from the instance graph. For any point on a triangle such as this one, there are two class-level axioms that could be derived, depending on direction followed. The two could be combined; the results would still be weaker and suffer the co-reference issue.

But maybe we just need some form of combined design pattern & template specification.

Here's a basic graph template for a receptor:

?effector_activity part_of ?receptor_activity,
?ligand_binding_activity part_of ?receptor_activity,
?ligand_binding_activity directly_activates ?effector_activity,
?ligand_binding_activity has_input ?ligand
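To make the template concrete, here is a minimal sketch of instantiating it as a set of triples. The `instantiate` function and the instance identifiers (`ggca_1`, `glu_binding_1`, and so on) are made up for illustration; a real implementation would create individuals in a LEGO model instead of plain strings.

```python
# The four edges of the receptor template, with ?variables as placeholders.
TEMPLATE = [
    ("?effector_activity", "part_of", "?receptor_activity"),
    ("?ligand_binding_activity", "part_of", "?receptor_activity"),
    ("?ligand_binding_activity", "directly_activates", "?effector_activity"),
    ("?ligand_binding_activity", "has_input", "?ligand"),
]


def instantiate(template, bindings):
    """Replace every ?variable with its bound node.

    Raises KeyError if a variable in the template has no binding.
    """
    def resolve(term):
        return bindings[term[1:]] if term.startswith("?") else term

    return [(resolve(s), p, resolve(o)) for s, p, o in template]


# Hypothetical instance identifiers for a glutamate-gated calcium channel.
triples = instantiate(TEMPLATE, {
    "receptor_activity": "ggca_1",
    "effector_activity": "ca_channel_1",
    "ligand_binding_activity": "glu_binding_1",
    "ligand": "glutamate_1",
})
for triple in triples:
    print(triple)
```

Because each variable resolves to a single node, the binding and effector nodes are guaranteed to be parts of the same receptor activity instance, which is the co-reference that is hard to express at the class level.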

Here's a simple design pattern for classifying receptors:

name: receptor_effector
vars:
  ligand: 'chemical entity'
  effector_activity: molecular_function

relations:
  has_ligand
  has_template

classes:
  receptor activity

EquivalentTo:
  text: "'receptor activity' that (has_ligand some %s) and (has_effector some %s)"
  vars:
    - ligand
    - effector_activity

If we can fold the template into the design pattern, then the same variables could bind to both.
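A minimal sketch of how such a pattern could be filled, assuming printf-style substitution of the listed variables into the `text` field. The dictionary layout mirrors the pattern above but is not an actual DOS-DP schema, the `fill` function is hypothetical, and `effector_activity` is assumed to be the name of the second slot.

```python
# Hypothetical in-memory form of the receptor_effector pattern.
PATTERN = {
    "name": "receptor_effector",
    "vars": {
        "ligand": "'chemical entity'",
        "effector_activity": "molecular_function",
    },
    "equivalent_to": {
        "text": "'receptor activity' that (has_ligand some %s) "
                "and (has_effector some %s)",
        "vars": ["ligand", "effector_activity"],
    },
}


def fill(pattern, bindings):
    """Substitute bound labels into the pattern text, in the declared order."""
    spec = pattern["equivalent_to"]
    missing = [v for v in spec["vars"] if v not in bindings]
    if missing:
        raise ValueError(f"unbound variables: {missing}")
    return spec["text"] % tuple(bindings[v] for v in spec["vars"])


bindings = {
    "ligand": "glutamate",
    "effector_activity": "'calcium channel activity'",
}
definition = fill(PATTERN, bindings)
print(definition)
```

The same `bindings` mapping could also be used to fill a graph template, which is the sense in which one set of variables would bind to both.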

Maybe this binding can also serve to define the pair of relations. One potentially nice thing about this is that we could choose to hide the detailed model even in LEGO - but expand for reasoning.

This might work best in combination with treating some basic MF types as primitives, e.g. don't try to define a generic receptor activity. (Looking at more examples, I'm starting to think this is surprisingly hard: there are many things activated by binding that are not receptors; not everything that activates a receptor by binding it is a ligand.)

CC @ukemi @vanaukenk

cmungall commented 8 years ago

I think this makes sense

We already have a ticket open coordinating graphically edited lego template/prototype models with ODPs: https://github.com/geneontology/noctua/issues/254

dosumis commented 8 years ago

I suspect there are two cases here:

For most BPs, the template pattern will follow the ontology pattern.
For compound MFs (as above), the two patterns will be distinct and linked, sharing variable slots.

It would be nice to keep everything in one place. I can see two (not necessarily mutually exclusive) options:

Extend DOS-DPs to allow specification of instance level templates. In this case, we only have one set of variables - which set the range for slots in both types of pattern
Include a slot for a link to a template built in noctua.

dosumis commented 8 years ago

Aims:

Design pattern must group receptors by effector function and ligand type.

Template must => annotation to binding and effector functions. It should also include internal regulation links between binding and effector nodes to allow for regulation via regulation of ligand binding.

DOSDP:

name: receptor_activity

relations:
  has_ligand: RO_
  has_component: RO_
  regulates: RO_

classes:
  binding: GO:
  molecular_function:
  receptor activity:

vars:
  $effector: molecular_function
  $ligand: chemical entity

equivalent_to: $effector that has_ligand some $ligand

Notes:

In line with the current ontology, this keeps the effector activity as the genus. This differs from the pattern, where it is a component.
We need a has_ligand relation. has_input is just too broad to be safe for classification by ligand, covering effector substrates and other small molecules that bind.

Template

enables o has_component -> enables

Notes:

we can't use the same relation for activation of effector function as we use to define activator functions as this will => incorrect inference. e.g. all RTK activities will be classified as subclasses of kinase activator activity. It may just be sufficient to use the general relation 'regulates' here rather than adding a new relation. TBD
has_component + property chain w/ enables => annotation to component activities. But... not sufficient for classification... Maybe we need the design pattern to use has_component too... can't think of a way to chain this and GCIs (has_component X subclassof X) would be weird and hard to maintain.

dosumis commented 7 years ago

From discussion with Chris:

Use has_component in the ontology. Infer annotation to component functions at annotation time for both LEGO and classical GO annotation, except where contributes_to is used as a qualifier. Property chain: component_of o enabled_by -> enabled_by
Use positively regulates in place of internally activates. May revisit this later if it causes confusion.
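A rough sketch of the annotation-time inference described above, assuming annotations are simple (gene product, function, qualifier) tuples and has_component is available as a lookup table. The function `infer_annotations`, the gene product names, and the 'glutamate binding' component label are all hypothetical.

```python
# Hypothetical has_component assertions from the ontology.
HAS_COMPONENT = {
    "glutamate-gated calcium channel activity": [
        "glutamate binding",
        "calcium channel activity",
    ],
}


def infer_annotations(annotations, has_component):
    """Apply enables o has_component -> enables to a list of annotations.

    Annotations qualified with contributes_to are skipped. The chain is
    applied repeatedly, so components of components are reached as well.
    """
    inferred = set()
    for gene_product, function, qualifier in annotations:
        if qualifier == "contributes_to":
            continue
        stack = list(has_component.get(function, []))
        seen = set()
        while stack:
            component = stack.pop()
            if component in seen:
                continue
            seen.add(component)
            inferred.add((gene_product, component))
            stack.extend(has_component.get(component, []))
    return inferred


annotations = [
    ("GP1", "glutamate-gated calcium channel activity", None),
    ("GP2", "glutamate-gated calcium channel activity", "contributes_to"),
]
inferred = infer_annotations(annotations, HAS_COMPONENT)
print(sorted(inferred))
```

Here GP1 picks up annotations to both component functions, while GP2 picks up none because its annotation only asserts a contribution to the compound function.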

dosumis commented 7 years ago

From discussion with Paul:

We may be able to generalise this up to signal transducer activity, allowing activation by binding and possibly also by modification. Probably better to define the individual patterns first though.
