Open pfey03 opened 6 years ago
@pfey03 Can you confirm which field and interface (Graph or Form) that you were using?
@cmungall This should be just as simple as adding to the NEO imports, right?
I tried it in the graph editor, but don't think the form would have SO? It didn't even take the protein entity (H2Bv3) I want to add the SO terms to in the form.
> This should be just as simple as adding to the NEO imports, right?
Yes, easy to add SO, but we need to first be sure that we have the usage of the ontology documented. AFAIK we don't have any curator documentation for use of SO here, @vanaukenk
The use of SO biological_region hierarchy to describe sites on proteins etc seems reasonable but want to check we are all doing things the same way.
@pfey03 - For this example, can you give us the full annotation you'd like to make, e.g. what the MF term is and the relations and extensions?
That will help us understand better what you're trying to state with this annotation. Thx!
@vanaukenk Oh wow, yes this issue! Here is the annotation line from GPAD: UniProtKB Q54XI2 enables GO:1990404 PMID:28252050 ECO:0000314 20180529 dictyBase occurs_at(SO:0100014),occurs_at(SO:0001454),has_input(UniProtKB:Q54LP8) goEvidence=IDA
Thanks!
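For reference, the extension column of the GPAD line above can be split into its relation/ID units with a few lines of Python. This is only a minimal sketch: the regex and the helper names `parse_extension` and `so_units` are made up for illustration and are not part of Noctua or any GO tooling.

```python
import re

# Extension column copied from the GPAD line quoted above.
extension = "occurs_at(SO:0100014),occurs_at(SO:0001454),has_input(UniProtKB:Q54LP8)"

# Assumed shape of one unit: relation(PREFIX:LOCAL_ID)
UNIT = re.compile(r"(\w+)\(([^():]+):([^()]+)\)")


def parse_extension(text):
    """Return a list of (relation, prefix, local_id) tuples (hypothetical helper)."""
    return [match.groups() for match in UNIT.finditer(text)]


def so_units(text):
    """Return only the units whose filler is an SO term (hypothetical helper)."""
    return [
        (relation, f"{prefix}:{local_id}")
        for relation, prefix, local_id in parse_extension(text)
        if prefix == "SO"
    ]


if __name__ == "__main__":
    for unit in parse_extension(extension):
        print(unit)
```

Here `so_units(extension)` picks out the two `occurs_at` units, which are the ones that cannot currently be entered in Noctua.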
GREEKC also wants to use SO as 'has input' for transcription factors.
There are some potential issues here. We have been treating SO classes as molecular entities. @mikebada and @msinclair2 have been making https://github.com/The-Sequence-Ontology/MSO, the idea being that when this is released SO will be "re-declared" as being information entities. This will render all such annotations invalid, since inputs/outputs must be physical. This is not an idle philosophical issue: the domain/range constraints in RO will cause these to show up as invalid.
See also: https://github.com/The-Sequence-Ontology/MSO/issues/5
There's been a lot of conflation for a while: The SO sequence entities in the last major SO journal article were described to be GDCs, though it was also said that the qualities don't make sense for the GDCs. Additionally, the current SO has some classes that really don't make sense as molecular entities (e.g., read, contig), but many/most of the natural-language definitions of the classes of the current SO are for the independent continuants. So, GDC/IC deconflation has been a primary motivation for this refactoring work.
In the refactored versions, the sequence entities of the MSO and SO are almost entirely parallel, with the exception that the relatively small number of classes that don't really make sense as molecular entities will only exist in the SO. Additionally, the SO classes are directly defined in terms of (generically dependent on) their corresponding MSO classes, e.g., SO:gene generically_dependent_on some MSO:gene. However, corresponding SO and MSO classes have the same ID, just different namespaces, e.g., for gene, SO:0000704 and MSO:0000704. So, for those resources that need to refer to molecular entities, hopefully much of the migration can be taken care of by simply replacing the SO namespace with MSO.
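As a rough illustration of the "replace the namespace" migration described above, here is a minimal Python sketch. It assumes IDs are written as CURIEs with seven-digit local IDs, as in the examples in this thread; `so_to_mso` is a hypothetical helper, and it does not check whether a given class actually exists in the MSO (the small number of SO-only classes would need separate handling).

```python
import re

# Matches SO CURIEs such as SO:0000704, but not the "SO:" inside "MSO:0000704",
# because \b requires a word boundary before the "S".
SO_CURIE = re.compile(r"\bSO:(\d{7})\b")


def so_to_mso(text):
    """Rewrite SO:NNNNNNN as MSO:NNNNNNN, leaving other IDs untouched (hypothetical helper)."""
    return SO_CURIE.sub(r"MSO:\1", text)


if __name__ == "__main__":
    print(so_to_mso("occurs_at(SO:0100014),has_input(UniProtKB:Q54LP8)"))
```

Because already-migrated IDs are not matched, running the rewrite twice gives the same result as running it once.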
@pgaudet I made a presentation directly to GREEKC in Hinxton this April describing the MSO and SO and that MSO should be used for molecular interactions. So I believe they should be already aware of the issue and the solution with MSO.
@pfey - thanks for the example.
In this case, it looks like you are trying to capture the general region of the protein that is modified (e.g. ADP-ribosylated) by capturing that the modification occurs_at an amino acid in the N-terminal region of the substrate (H2Bv3). Is that correct?
occurs_at is not currently in RO and so is not available to use in Noctua. In models where similar annotations have been created, curators have used PRO ids to capture inputs and outputs of enzymatic activities where they wanted to specifically indicate what the unmodified and modified inputs and outputs were. See for example: http://noctua.geneontology.org/editor/graph/gomodel:581e072c00000473
@thomaspd @cmungall @pgaudet I don't think we want to use occurs_at in this way in GO-CAM models. We should review other cases of occurs_at to see how that information would be modeled in Noctua. Paul, do you already have any examples for this?
Leaving this ticket open, as this is still a need for some curators.
Also, I could easily use has_input. These annotations are quite old, from when this was still possible in P2GO. I would not use occurs_at anymore for these.
Some of the P2GO annotations I want to model have SO extensions, such as SO:0100014 n-terminal region (of the has_input H2Bv3 Ddis).
I noticed I cannot add these in Noctua. Also for model gomodel:5ae3b0f600000395
Thanks, Petra