Closed ValWood closed 4 years ago
Also, for the term PHIPO:0001106 (pathogen host protein-protein interaction present), we need to be able to specify two assayed_using extensions (i.e. both interacting partners).
@ValWood do you have an example of the configuration for the has_substrate extension? Based on the FYPO extension ontology, I can only see an example of assayed_substrate:
[Typedef]
id: assayed_substrate
name: assayed_substrate
def: "Relation between a catalytic activity phenotype and a substrate, such as a gene product, with which the phenotype was assayed." [PomBase:mah]
comment: normal or abnormal protein kinase activity assayed_substrate PomBase:SPBC11B10.09
property_value: local_domain FYPO:0000654 ! catalytic activity phenotype
is_a: assayed_using
There seems to be an example of the has_substrate relation in the test configuration for Canto, but this hasn't been updated in years, so I'm not sure if it's accurate:
domain ID | subset relation | extension relation | range ID | Canto display text | Help text | cardinality | role |
---|---|---|---|---|---|---|---|
GO:0016023 | is_a | has_substrate | GENE | kinase substrate | | 0,1 | user |
Hi, the first one is GO. That should be has_input, I think ("has substrate" is our biologist-friendly conversion for the gene pages).
The second one is PHIPO. You are correct, this should be "assayed_using". At present, I can add one "assayed_using", but protein binding can have 2 interacting partners, so the user should be prompted for 2 gene products. (I worked around this by adding the second partner as free text.)
I will try to track down the syntax for this in the FYPO config....
It's
FYPO:0000702 is_a assayed_using ProteinID affected proteins (add TWO, i.e. both binding partners) 2 user
I already have examples of assayed_using, so I can work on that now, but I can't find any examples of the has_input relation.
We currently have assayed_using with two partners on PHIPO:0000132 (protein-protein interaction phenotype) and all its children (see below), so the fact it's not being applied to PHIPO:0001106 seems to be an oversight, resulting from the fact we always have to remember to apply extensions to both the single-species and pathogen-host branches.
With that in mind, would you want this new assayed_using relation pushed up in the hierarchy to 'pathogen host protein-protein interaction phenotype' (PHIPO:0000164), so it also encompasses the terms under 'pathogen host protein-protein interaction absent' (PHIPO:0001107)?
domain ID | subset relation | extension relation | range ID | Canto display text | Help text | cardinality | role |
---|---|---|---|---|---|---|---|
PHIPO:0000132 | is_a | assayed_using | GeneID | affected proteins (add TWO, i.e. both binding partners) | | 2 | user |
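To make the cardinality column concrete, here is a minimal sketch of how a value such as "2" or "0,1" could be checked against the number of extension values a curator supplies. It assumes the field is a comma-separated list of allowed counts; that reading, the function names, and the placeholder partner names are assumptions for illustration and not Canto's actual implementation. Any other cardinality syntax is not handled.

```python
def allowed_counts(cardinality):
    """Turn a cardinality field such as "2" or "0,1" into the set of allowed counts.

    Assumption: the field is a comma-separated list of integers.
    """
    return frozenset(int(part) for part in cardinality.split(","))


def count_ok(cardinality, values):
    """Return True if the number of supplied extension values is allowed."""
    return len(values) in allowed_counts(cardinality)


# Hypothetical placeholder names standing in for the two binding partners.
one_partner = ["partner_A"]
two_partners = ["partner_A", "partner_B"]

print(count_ok("2", one_partner))    # a single partner does not satisfy cardinality 2
print(count_ok("2", two_partners))   # both partners supplied
print(count_ok("0,1", []))           # an optional extension may be left empty
```

Under this reading, a row with cardinality 2 only validates once both interacting partners have been entered, which is the behaviour being asked for here.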
@ValWood decided to update this in the PomBase config instead (see here); PHI-base can then synchronise with these config files if needed.
Hi @mah11, could you add this to the pombe config file for GO MF: GO:0038023 (signaling receptor activity) needs has_input.
> GO:0038023 signaling receptor activity needs has_input
Hmmm. I had a quick look to check what the range should be, and I wonder if it's ideal to use GO:0038023 in the config. Is this meant to prompt for a gene (as proxy for its product)? It has a lot of descendants representing receptors for ligands that aren't gene products (simple compounds like GABA or glutamate; light; oddballs like "salty taste" ...).
Will it be a problem if the extension prompt appears for those, as it will if we just put GO:0038023 in the config?
Hmm, specifically here we are annotating GO:0038187 (pattern recognition receptor activity), so we could put it on this term (which would only apply to things recognising pathogens, so would reduce the scope hugely).
However, it would have the same issue, in that not all PRRs bind to proteins. These are host proteins that recognise pathogens, and they bind to some host proteins:
"In addition to the well-characterized PAMPs flagellin, EF-Tu, and chitin, many pathogen-secreted metabolites or virulence proteins can also activate the plant immune system; these include lipopolysaccharides, peptidoglycan volatiles, glycoproteins, cell wall degradation enzymes (CWDE), and other pathogen-secreted proteins (Liu et al., 2012; Ranf et al., 2015)."
So, in an ideal world, we would be able to select either a protein (which will be one of the gene products we are annotating) OR a CHEBI molecule.
I think we might already be able to do this? I recollect that we had a similar x or y option for a gene or a SO term for some extensions? Is this possible?
Yes, the syntax does support "x or y" - e.g. the lines with "TranscriptID|SO:0000673" in the FYPO extension config.
Are there any descendants of GO:0038187 where a CHEBI ID would be redundant with the descendant's term name & def (there are oodles for the parent GO:0038023)? Not that I want to get into the weeds too much .. just trying to stay away from the other extreme of "oops we didn't think that through".
Quite possibly, but I suspect there will be some GO changes here (I will recommend that specific substrates are captured with CHEBI).
PomBase is unlikely to use this term ever, but in PHI-base we will use it a lot. So I can monitor and refine once the GO churn is over. It should become obvious quite quickly if something is bonkers. I think it will be OK, just not as refined as it could be.
OK, I've put GO:0038187 in the PomBase MF extension config. For ease of copying over to PHI-etc, this is the line:
GO:0038187 is_a has_input ProteinID|CHEBI:24431 has ligand (protein or chemical substance) 0,1 user
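As a rough illustration of the "x or y" range in that line, the sketch below splits one config row into its columns and then splits the range field on "|". It assumes the file is tab-separated with the eight columns named in the table headers earlier in this thread, and that the help text column is empty for this row; the `ExtensionRule` type and `parse_rule` function are hypothetical names, not part of Canto.

```python
from typing import List, NamedTuple


class ExtensionRule(NamedTuple):
    """One row of an extension config (column names taken from the tables above)."""
    domain: str
    subset_relation: str
    extension_relation: str
    range_ids: List[str]
    display_text: str
    help_text: str
    cardinality: str
    role: str


def parse_rule(line: str) -> ExtensionRule:
    """Parse one row. Assumption: exactly eight tab-separated fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 8:
        raise ValueError(f"expected 8 tab-separated fields, got {len(fields)}")
    domain, subset_rel, ext_rel, range_field, display, help_text, card, role = fields
    # "ProteinID|CHEBI:24431" means the value may be a protein OR a ChEBI term.
    return ExtensionRule(domain, subset_rel, ext_rel, range_field.split("|"),
                         display, help_text, card, role)


# The row quoted above, written with explicit tabs and an empty help text field.
line = ("GO:0038187\tis_a\thas_input\tProteinID|CHEBI:24431\t"
        "has ligand (protein or chemical substance)\t\t0,1\tuser")
rule = parse_rule(line)
print(rule.range_ids)
```

Splitting the range this way gives a list of alternatives, so a consumer can offer either a gene product picker or a ChEBI term search for the same extension.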
@jseager7 is going to make the PHI ones point to ours for GO because they should be the same. James, is there a ticket for this?
> James is there a ticket for this?
Not yet, but I'll open one on our config repository because it belongs there. I'll link here when it's done.
I've got a question about the subset relations (listed below) in the GO extension config files: where do they come from? I'm presuming they need definitions in a corresponding ontology, so do they already exist in GO? Will we need to load any extra ontologies into PHI-Canto to enable these?
We're currently loading has_qualifier_range.obo and fypo_extension.obo from the PomBase file server, plus the standards like GO-basic, PSI-MOD, and RO (and all of the PHI-base-related ontologies that we need).
This is a very good question. This is the part that is 'in flux'. GO are trying to consolidate the relations and reduce the set used. Then they will be migrated to ~BFO~ RO.
We don't really need definitions for them in Canto, as the user is protected from needing to select the appropriate relation; we do it for them in the config.
Obviously good to have once the dust settles, but we don't absolutely need them to proceed.
> Will we need to load any extra ontologies into PHI-Canto to enable these?
These relations don't need an ontology. Canto just needs them to be in the config file.
> These relations don't need an ontology. Canto just needs them to be in the config file.
But is the extra information about them (definitions, comments, etc.) supplied from a companion ontology?
@ValWood I've added assayed_using on PHIPO:0000164 (pathogen host protein-protein interaction phenotype) and I've updated our GO extension config to match what's on pombase/pombase-config. The GO config isn't being synced automatically yet, but I'm planning to do that.
The extensions should update once the server reloads the ontologies overnight.
If that's all you need, feel free to close this issue.
Actually, I will close this. There is a note in the session to add the substrate to the GO receptor (this change is not through yet).
The PHIPO interactions seem to allow assayed using. I did not see the bit about 2 interactors, but we use this quite a lot so it will soon be obvious if it isn't working.
> But is the extra information about them (definitions, comments, etc.) supplied from a companion ontology?
We don't use the definitions and comments anywhere in Canto so they don't need to be loaded.
@jseager7 eventually all of the relations used by GO extensions will be in ~BFO~ RO. At present GO use an internal ontology. Once they decide which relations are "acceptable" in GO extensions, missing relations will be added to ~BFO~ RO.
More info is here, but this is also out of date: http://wiki.geneontology.org/index.php/Annotation_Extensions
> The PHIPO interactions seem to allow assayed using. I did not see the bit about 2 interactors, but we use this quite a lot so it will soon be obvious if it isn't working.
@ValWood The extension might not have been working correctly because I forgot to restart Canto after updating the config. Should be fine now, although note that I think we have overlapping configurations on some terms, which means you'll see both the cardinality 2 protein extension and the cardinality 1 extension:
I've opened an issue about this on the config repository.
> eventually all of the relations used by GO extensions will be in BFO.
corrected in comment above - relations are going into the Relations Ontology (RO)
> Should be fine now, although note that I think we have overlapping configurations on some terms, which means you'll see both the cardinality 2 protein extension and the cardinality 1 extension:
@mah is there a way to fix this in config?
> although note that I think we have overlapping configurations on some terms, which means you'll see both the cardinality 2 protein extension and the cardinality 1 extension:
I think the way to fix that is to change the domain to something like:
GO:0034613-is_a(GO:0006886)
and then have a separate configuration with the excluded subset (e.g. "GO:0006886") as the domain.
For example, we have this in the GO config:
GO:0006886 is_a has_input ProteinID transports 0,1 user
GO:0034613-is_a(GO:0006886) is_a has_input ProteinID localizes 0,1 user
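To spell out how the excluded-subset domain separates those two rows, here is a small sketch that parses a domain such as GO:0034613-is_a(GO:0006886) and tests terms against it. The regular expression, the function names, and the toy ancestor table (with made-up EX: term IDs, and GO:0006886 assumed to sit under GO:0034613) are assumptions for illustration and not Canto's own matching code.

```python
import re

# Matches "GO:0034613" or "GO:0034613-is_a(GO:0006886)".
DOMAIN_RE = re.compile(
    r"^(?P<include>[A-Za-z_]+:\d+)"
    r"(?:-(?P<rel>\w+)\((?P<exclude>[A-Za-z_]+:\d+)\))?$"
)


def parse_domain(domain):
    """Return (included term ID, excluded term ID or None)."""
    match = DOMAIN_RE.match(domain)
    if match is None:
        raise ValueError(f"unrecognised domain: {domain!r}")
    return match.group("include"), match.group("exclude")


def domain_matches(domain, term, ancestors):
    """True if term is under the included ID and not under the excluded ID."""
    include, exclude = parse_domain(domain)
    lineage = ancestors.get(term, set()) | {term}
    return include in lineage and (exclude is None or exclude not in lineage)


# Toy ancestor table (transitive). The EX: IDs are hypothetical terms.
ANCESTORS = {
    "GO:0006886": {"GO:0034613"},
    "EX:0000001": {"GO:0006886", "GO:0034613"},  # a transport term
    "EX:0000002": {"GO:0034613"},                # a localization-only term
}

for term in ("EX:0000001", "EX:0000002"):
    print(term,
          domain_matches("GO:0006886", term, ANCESTORS),
          domain_matches("GO:0034613-is_a(GO:0006886)", term, ANCESTORS))
```

With the exclusion in place, each toy term matches exactly one of the two rows, so only one has_input prompt would be shown for it.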
I hope that helps. I'm happy to chat on Skype about this.
Thanks @kimrutherford, but fortunately it wasn't necessary to add excluded subsets, because I was able to fix the problem by removing redundant domain IDs.
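For anyone wanting to spot this kind of overlap before it shows up in the interface, a check along the following lines could work. It reports, for a given term, any extension relation that is configured by more than one matching row. The rule tuples, the EX: term IDs, and the ancestor table are all hypothetical, and plain term IDs are assumed as domains (no excluded subsets).

```python
from collections import defaultdict


def overlapping_relations(term, rules, ancestors):
    """Map each relation to its matching rows when more than one row applies to term.

    rules: iterable of (domain ID, extension relation, cardinality) tuples.
    ancestors: dict mapping a term ID to the set of all its ancestor IDs.
    """
    lineage = ancestors.get(term, set()) | {term}
    by_relation = defaultdict(list)
    for domain, relation, cardinality in rules:
        if domain in lineage:
            by_relation[relation].append((domain, cardinality))
    return {rel: hits for rel, hits in by_relation.items() if len(hits) > 1}


# Hypothetical config: a parent term and one of its children both carry assayed_using.
RULES = [
    ("EX:0000010", "assayed_using", "2"),
    ("EX:0000001", "assayed_using", "1"),
]
ANCESTORS = {"EX:0000010": {"EX:0000001"}}

print(overlapping_relations("EX:0000010", RULES, ANCESTORS))
print(overlapping_relations("EX:0000001", RULES, ANCESTORS))
```

In this toy setup the child term inherits the parent's row as well as its own, which is the situation where both the cardinality 2 and cardinality 1 prompts would appear; removing the redundant domain ID leaves a single row per relation.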
GO:0038023 signaling receptor activity needs has_input
(Note to self: fix PMID:30610168)