EBISPOT / ols4

Version 4 of the EMBL-EBI Ontology Lookup Service (OLS)
http://www.ebi.ac.uk/ols4/
Apache License 2.0
38 stars 17 forks source link

MS ontology - Unexpected behaviour/problems with the display in OLS4 #600

Closed javizca closed 7 months ago

javizca commented 9 months ago

I am reporting a bug is related to an expected behaviour in the MS ontology (https://www.ebi.ac.uk/ols4/ontologies/ms) in the OLS4 (did not happen before in OLS3).

The root level is kind of a mess with many terms apparently now sitting at the root level when they didn’t used to be. Looking a little closer, it seems that part_of relationships are not working. For example:

image001

Acquisition parameter now appears at the root level. The obo has it as:

[Term] id: MS:1001954 name: acquisition parameter def: "Parameters used in the mass spectrometry acquisition." [PSI:MS] relationship: part_of MS:1001458 ! spectrum generation information relationship: has_value_type xsd:string ! The allowed value-type for this CV term

The OLS interface even seems to acknowledge that it is a part_of:

image002

So then why does it put the term at the root level? And not display it with a “P” in the tree like OLS3 used it? This makes navigating the ontology very difficult.

Please let me know if you need any further information.

jamesamcl commented 8 months ago

This is because MS defines its own part_of relationship instead of using the standard one from RO.

OLS3 used to look for any properties with the label "part of" and display them as parthood in the tree, but we are trying to move away from assigning semantics based on labels.

There are two ways this could be fixed:

  1. Override the part of property in the config (I think this is still possible in OLS4, CC @henrietteharmse ?)

  2. Use RO's part_of property in MS

edeutsch commented 8 months ago

So in the following example then:

[Term]
id: MS:1003056
name: adduct ion property
def: "Physical measurable characteristic of an adduct ion." [PSI:PI]
relationship: part_of MS:1000353 ! adduct ion

what do you suggest the last line should be?

I just pulled down the latest Gene Ontology OBO file and they seem to be going the same thing?

[Term]
id: GO:0000080
name: mitotic G1 phase
namespace: biological_process
def: "The cell cycle 'gap' phase which is the interval between the completion of DNA segregation by mitosis and the beginning of DNA synthesis." [GOC:mtg_cell_cycle]
comment: Note that this term should not be used for direct annotation. If you are trying to make an annotation to x phase, it is likely that the correct annotation is 'regulation of x/y phase transition' or to a process which occurs during the reported phase (i.e mitotic DNA replication for mitotic S-phase). To capture the phase when a specific location or process is observed, the phase term can be used in an annotation extension (PMID:24885854) applied to a cellular component term (with the relation exists_during) or a biological process term (with the relation happens_during).
subset: gocheck_do_not_annotate
synonym: "G1 phase of mitotic cell cycle" EXACT []
is_a: GO:0051318 ! G1 phase
relationship: part_of GO:0051329 ! mitotic interphase

And I'll note that BioPortal and OntoBee seem to have no problem displaying these part_ofs.

And I just looked in EFO:

[Term]
id: EFO:0004167
name: RAPD
def: "RAPD is a type of PCR reaction in which the segments of DNA are amplified at random using randomly generated primers." []
synonym: "random amplification of polymorphic DNA" EXACT []
synonym: "random PCR" EXACT []
is_a: OBI:0000415 ! PCR
relationship: part_of EFO:0004184 ! nucleic acid library construction protocol
property_value: IAO:0000117 "James Malone" xsd:string

Same thing.

Or are you suggesting that if PSI-MS were to remove this thing:

[Typedef]
id: part_of
name: part_of
is_transitive: true

then it would work fine?

Or alternatively, can you explain when your option 2 means? I don't understand.

edeutsch commented 7 months ago

@jamesamcl @henrietteharmse Can you help with a solution here?

ypriverol commented 7 months ago

Dear @henrietteharmse the issue is still present. Take for example peptide attribute still is presented in the root of the ontology (figure 1), however when you click on it, it goes directly into peptide category (figure 2).

Figure 1:

Screenshot 2024-01-23 at 15 02 50

Figure 2:

Screenshot 2024-01-23 at 15 04 26

I guess, what we are expecting is that peptide attribute is not shown in the root but only as a child term of peptide

henrietteharmse commented 7 months ago

@ypriverol

For this to work the way you want it to work, OLS will need to delete relations - which it never does. If you view this in Protege, you will see that peptide attribute is indeed defined at the root. Then OLS adds additional hierarchical properties (like part_of), which cannot be seen in the same way in Protege.

This is exactly the same for other ontologies as well, like EFO. I.e., liver is in part_of relations with numerous concepts. Note the unexpanded hierarchy part of liver reflects exactly what is seen in Protege.
image. If you click on liver the detail is shown.

There are lots of issues with MS that should be resolved by the ontology creators. This has been raised some time ago already with MS. For reference see here is the original issue from OLS3: https://github.com/EBISPOT/OLS/issues/623. I have just again checked MS to ensure that it is reasoned (because if it unreasoned some of the structure of the tree can be lost), but then found the exact same unsatisfiable classes as was noted in 2022. image

MS needs to fix these issues.

ypriverol commented 7 months ago

I think, I found the issue. When we automatically convert the part_of in the OBO to OWL the conversion do not pick that the term is a SubClass of the parent term similar to what happen with is_a. In contrast, it annotates the term with a strange relation like the term is_a subclass of part_of which could be a subclass of many things.

You are completely right @henrietteharmse we have to fix this. I see three options @edeutsch :

Thanks, @henrietteharmse, for your time and support, we will try to take care from here as it has been previously suggetsed.

edeutsch commented 7 months ago

Hi, thanks, everyone. Regarding options:

(1) These seems like the right way to go, although I do not know how. We use robot for conversion, so I would have thought that it would do it correctly.

(2) Our current process is centered around GitHub PRs of simple text formats (OBO). Migrating to Protege and OWL seems like a hard lift for our group.

(3) This seems like fixing a bolt that won't quite fit through a nut with a hammer. In the above example, saying that a "liver is_a abdomen" instead of a "liver part_of abdomen" seems like a massive destruction of our ontology to make it display a little better in OLS.

edeutsch commented 7 months ago

MS needs to fix these issues.

A substantial difficulty is that MS does not know how to fix these issues.

henrietteharmse commented 7 months ago

As far as I can see, MS is an OBO ontology. You should be able to reach out to them and follow their guidance as to how to create the ontology. They are using the ODK, ROBOT and Protege to create their ontologies. They also have some quality control measure which are likely to benefit MS. The big change in the OBO community is that they have moved away from .obo files in favour of .owl files.

matentzn commented 7 months ago

I think this is all that is needed: https://github.com/HUPO-PSI/psi-ms-CV/pull/232/files

henrietteharmse commented 7 months ago

I would be weary of doing a blanket change of part_of to is_a. @edeutsch is spot on in this regard.

matentzn commented 7 months ago

I would be weary of doing a blanket change of part_of to is_a. @edeutsch is spot on in this regard.

Not just weary, this would be really wrong IMO..

henrietteharmse commented 7 months ago

I would be weary of doing a blanket change of part_of to is_a. @edeutsch is spot on in this regard.

Not just weary, this would be really wrong IMO..

It will be wrong, if the part_of relations are indeed correct. I am unsure whether they are all correct though.

lgatto commented 4 months ago

It looks like the MS ontology isn't available any more:

Is this related/a consequence of this issue? Is this temporary, or has it been removed from OLS?

henrietteharmse commented 4 months ago

No this is an unrelated issue. The format of MS got messed up in the MS github repo, which @ypriverol rolled back. MS should be back on OLS by tomorrow. For exact details please chat to @ypriverol.

ypriverol commented 1 month ago

Hi @henrietteharmse MS is again not available at OLS. I was looking into the commits in the PSI-MS in the last couple of weeks and nothing major has been done. Can you let me know what is the error? I also added an issue in the PSI-MS https://github.com/HUPO-PSI/psi-ms-CV/issues/292. Do you think could be possible to implement the following logic in OLS, if the latest update of an ontology fails, keep the previous version. I think at least for PRIDE would be better to have an old version that nothing.

haideriqbal commented 1 month ago

Had a discussion with @ypriverol over slack. Just to summarise it here, currently MS ontology is not available in OLS because one of the imports (http://purl.obolibrary.org/obo/stato.owl) of MS ontology is not in RDF format.

@ypriverol has created a workflow (https://github.com/HUPO-PSI/psi-ms-CV/pull/295) in MS ontology repo will let them know if the latest version of MS can be processed in OLS or not.