INCATools / ontology-access-kit

Ontology Access Kit: A python library and command line application for working with ontologies
https://incatools.github.io/ontology-access-kit/
Apache License 2.0

`relationships`: (i) subClassOf only default, (ii) `--predicates` failure? #613

Closed: joeflack4 closed this issue 1 year ago

joeflack4 commented 1 year ago

Overview

I'm trying to get a list of relationships for a term, but I find that (i) when I don't list any --predicates, it only shows me subClassOf relationships, and (ii) when I do list explicit --predicates, I get nothing back in response.

See here where I'm creating the ontology file and running this command: https://github.com/monarch-initiative/mondo/pull/6451#discussion_r1264775511

    <owl:Class rdf:about="http://purl.obolibrary.org/obo/MONDO_0000004">
        <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/MONDO_0002816"/>
        <rdfs:subClassOf>
            <owl:Restriction>
                <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/RO_0004024"/>
                <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/GO_0034651"/>
            </owl:Restriction>
        </rdfs:subClassOf>
        <rdfs:subClassOf>
            <owl:Restriction>
                <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/RO_0004026"/>
                <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/UBERON_0001235"/>
            </owl:Restriction>
        </rdfs:subClassOf>
        <obo:IAO_0000115>An endocrine or hormonal disorder that occurs when the adrenal cortex does not produce enough of the hormone cortisol and in some cases, the hormone aldosterone. It may be due to a disorder of the adrenal cortex (Addison&apos;s disease or primary adrenal insufficiency) or to inadequate secretion of ACTH by the pituitary gland (secondary adrenal insufficiency).</obo:IAO_0000115>
        <oboInOwl:hasDbXref>DOID:10493</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref>GARD:0006722</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref>ICD9:255.4</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref>ICD9:255.41</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref>MESH:D000309</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref>NCIT:C26691</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref>SCTID:154707007</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref>SCTID:190527008</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref>SCTID:267398003</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref>SCTID:267483004</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref>SCTID:386584007</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref>SCTID:68588005</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref>UMLS:C0001623</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref>UMLS:C0405580</oboInOwl:hasDbXref>
        <oboInOwl:hasExactSynonym>adrenal cortical hypofunction</oboInOwl:hasExactSynonym>
        <oboInOwl:hasExactSynonym>adrenal cortical insufficiency</oboInOwl:hasExactSynonym>
        <oboInOwl:hasExactSynonym>adrenal gland insufficiency</oboInOwl:hasExactSynonym>
        <oboInOwl:hasExactSynonym>adrenal insufficiency</oboInOwl:hasExactSynonym>
        <oboInOwl:hasExactSynonym>adrenocortical insufficiency</oboInOwl:hasExactSynonym>
        <oboInOwl:hasExactSynonym>corticoadrenal insufficiency</oboInOwl:hasExactSynonym>
        <oboInOwl:hasExactSynonym>hypocortisolemia</oboInOwl:hasExactSynonym>
        <oboInOwl:hasExactSynonym>hypocortisolism</oboInOwl:hasExactSynonym>
        <oboInOwl:hasRelatedSynonym>hypoadrenalism</oboInOwl:hasRelatedSynonym>
        <oboInOwl:id>MONDO:0000004</oboInOwl:id>
        <rdfs:label>adrenocortical insufficiency</rdfs:label>
    </owl:Class>

(i) subClassOf only default

    runoak -i sqlite:tmp/mondo-edit.db relationships -p i MONDO:0000004

Result:

    subject subject_label   predicate   predicate_label object  object_label
    MONDO:0000004   adrenocortical insufficiency    rdfs:subClassOf None    MONDO:0002816   adrenal cortex disorder

(ii) --predicates failure?

    runoak -i sqlite:tmp/mondo-edit.db relationships -p i MONDO:0000004 --predicates oboInOwl:hasDbXref skos:exactMatch skos:narrowMatch skos:broadMatch skos:relatedMatch

Result: None

cmungall commented 1 year ago

OAK distinguishes between Relationships and Mappings.

See also the more detailed explanation in the OAK guide.

The command is behaving as expected, as hasDbXref isn't a relationship.
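
To make the distinction concrete, here is a minimal sketch (standard-library Python, not OAK's implementation) that splits a trimmed copy of the class above into relationships and mappings. It assumes that named superclasses and existential restrictions count as relationships and that hasDbXref annotations count as mappings; the helper names `curie` and `split_axioms` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the MONDO:0000004 class from the issue (one restriction, two xrefs).
RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:oboInOwl="http://www.geneontology.org/formats/oboInOwl#">
  <owl:Class rdf:about="http://purl.obolibrary.org/obo/MONDO_0000004">
    <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/MONDO_0002816"/>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/RO_0004024"/>
        <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/GO_0034651"/>
      </owl:Restriction>
    </rdfs:subClassOf>
    <oboInOwl:hasDbXref>DOID:10493</oboInOwl:hasDbXref>
    <oboInOwl:hasDbXref>MESH:D000309</oboInOwl:hasDbXref>
  </owl:Class>
</rdf:RDF>"""

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
RDFS = "{http://www.w3.org/2000/01/rdf-schema#}"
OWL = "{http://www.w3.org/2002/07/owl#}"
OIO = "{http://www.geneontology.org/formats/oboInOwl#}"
OBO = "http://purl.obolibrary.org/obo/"


def curie(iri):
    """Contract an OBO PURL such as .../MONDO_0000004 to MONDO:0000004."""
    return iri[len(OBO):].replace("_", ":", 1) if iri.startswith(OBO) else iri


def split_axioms(xml_text):
    """Return (relationships, mappings) as lists of (subject, predicate, object)."""
    relationships, mappings = [], []
    for cls in ET.fromstring(xml_text).iter(OWL + "Class"):
        subject = curie(cls.get(RDF + "about"))
        for sub in cls.findall(RDFS + "subClassOf"):
            target = sub.get(RDF + "resource")
            if target is not None:
                # Named superclass: a plain subClassOf relationship.
                relationships.append((subject, "rdfs:subClassOf", curie(target)))
                continue
            restriction = sub.find(OWL + "Restriction")
            if restriction is None:
                continue
            prop = restriction.find(OWL + "onProperty")
            filler = restriction.find(OWL + "someValuesFrom")
            if prop is not None and filler is not None:
                # Existential restriction: predicate is the property, object is the filler.
                relationships.append(
                    (subject, curie(prop.get(RDF + "resource")), curie(filler.get(RDF + "resource")))
                )
        for xref in cls.findall(OIO + "hasDbXref"):
            # Database cross-references are treated as mappings, not relationships.
            mappings.append((subject, "oio:hasDbXref", xref.text))
    return relationships, mappings


relationships, mappings = split_axioms(RDF_XML)
```

Under these assumptions the xrefs never appear in `relationships`, which is consistent with the empty result in (ii).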

You can query for mappings with the mappings command. This follows the SSSOM data model:

    $ mondo mappings MONDO:0000004 -O sssom
    # curie_map:
    #   <http: http://w3id.org/sssom/unknown_prefix/<http/
    #   DOID: http://purl.obolibrary.org/obo/DOID_
    #   GARD: http://purl.obolibrary.org/obo/GARD_
    #   ICD9: http://w3id.org/sssom/unknown_prefix/icd9/
    #   MESH: http://id.nlm.nih.gov/mesh/
    #   MONDO: http://purl.obolibrary.org/obo/MONDO_
    #   NCIT: http://purl.obolibrary.org/obo/NCIT_
    #   SCTID: http://www.snomedbrowser.com/Codes/Details/
    #   UMLS: http://linkedlifedata.com/resource/umls/id/
    #   oio: http://www.geneontology.org/formats/oboInOwl#
    #   owl: http://www.w3.org/2002/07/owl#
    #   rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
    #   rdfs: http://www.w3.org/2000/01/rdf-schema#
    #   semapv: https://w3id.org/semapv/
    #   skos: http://www.w3.org/2004/02/skos/core#
    #   sssom: https://w3id.org/sssom/
    # license: https://w3id.org/sssom/license/unspecified
    # mapping_set_id: temp
    subject_id  subject_label   predicate_id    object_id   mapping_justification   subject_source  object_source
    MONDO:0000004   adrenocortical insufficiency    oio:hasDbXref   DOID:10493  semapv:UnspecifiedMatching  MONDO   DOID
    MONDO:0000004   adrenocortical insufficiency    oio:hasDbXref   GARD:0006722    semapv:UnspecifiedMatching  MONDO   GARD
    MONDO:0000004   adrenocortical insufficiency    oio:hasDbXref   ICD9:255.4  semapv:UnspecifiedMatching  MONDO   ICD9
    MONDO:0000004   adrenocortical insufficiency    oio:hasDbXref   ICD9:255.41 semapv:UnspecifiedMatching  MONDO   ICD9
    MONDO:0000004   adrenocortical insufficiency    oio:hasDbXref   MESH:D000309    semapv:UnspecifiedMatching  MONDO   MESH
    MONDO:0000004   adrenocortical insufficiency    oio:hasDbXref   NCIT:C26691 semapv:UnspecifiedMatching  MONDO   NCIT
    MONDO:0000004   adrenocortical insufficiency    oio:hasDbXref   SCTID:386584007 semapv:UnspecifiedMatching  MONDO   SCTID
    MONDO:0000004   adrenocortical insufficiency    oio:hasDbXref   UMLS:C0405580   semapv:UnspecifiedMatching  MONDO   UMLS
    MONDO:0000004   adrenocortical insufficiency    skos:exactMatch <http://identifiers.org/mesh/D000309>   semapv:UnspecifiedMatching  MONDO   <http
    MONDO:0000004   adrenocortical insufficiency    skos:exactMatch <http://identifiers.org/snomedct/386584007> semapv:UnspecifiedMatching  MONDO   <http
    MONDO:0000004   adrenocortical insufficiency    skos:exactMatch <http://linkedlifedata.com/resource/umls/id/C0405580>   semapv:UnspecifiedMatching  MONDO   <http
    MONDO:0000004   adrenocortical insufficiency    skos:exactMatch DOID:10493  semapv:UnspecifiedMatching  MONDO   DOID
    MONDO:0000004   adrenocortical insufficiency    skos:exactMatch NCIT:C26691 semapv:UnspecifiedMatching  MONDO   NCIT

(See #611 for comments on the apparent duplication here.)

Currently the mappings command doesn't allow you to filter by mapping predicate. Open an issue if you want this; it should be easy to add.
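
Until such a filter exists, one workaround is to post-filter the SSSOM output. The sketch below is only an illustration: it assumes the output was saved as tab-separated text with the metadata block on lines starting with `#`, and `filter_sssom` is a hypothetical helper, not part of OAK.

```python
import csv


def filter_sssom(tsv_text, predicates):
    """Yield SSSOM rows (as dicts) whose predicate_id is in `predicates`."""
    # Drop blank lines and the commented metadata header (curie_map, license, ...).
    data_lines = [
        line for line in tsv_text.splitlines() if line and not line.startswith("#")
    ]
    wanted = set(predicates)
    # The first remaining line is the column header; fields are assumed tab-separated.
    for row in csv.DictReader(data_lines, delimiter="\t"):
        if row["predicate_id"] in wanted:
            yield row


HEADER = ["subject_id", "subject_label", "predicate_id", "object_id",
          "mapping_justification", "subject_source", "object_source"]
ROWS = [
    ["MONDO:0000004", "adrenocortical insufficiency", "oio:hasDbXref", "DOID:10493",
     "semapv:UnspecifiedMatching", "MONDO", "DOID"],
    ["MONDO:0000004", "adrenocortical insufficiency", "skos:exactMatch", "DOID:10493",
     "semapv:UnspecifiedMatching", "MONDO", "DOID"],
    ["MONDO:0000004", "adrenocortical insufficiency", "skos:exactMatch", "NCIT:C26691",
     "semapv:UnspecifiedMatching", "MONDO", "NCIT"],
]
# A small sample in the same shape as the output above.
SAMPLE = "\n".join(
    ["# mapping_set_id: temp", "\t".join(HEADER)] + ["\t".join(r) for r in ROWS]
)

exact = list(filter_sssom(SAMPLE, ["skos:exactMatch"]))
```

Filtering on the `predicate_id` column keeps the rest of each row intact, so the result can be written back out with `csv.DictWriter` if needed.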

joeflack4 commented 1 year ago

Thanks @cmungall. Yesterday I tried out a bunch of different commands, and I also found that mappings is what I needed.

On a philosophical / design note, I can see why one would not consider mapping predicates, or oboInOwl:hasDbXref, to be relationships. But I want to mention a couple of things in case they have not been considered (1) or have been overlooked (2):

  1. A lot of people would consider mappings to be a subset of relationships. As one example, they're considered such in OMOP. There is a relationship table, with a relationship_type column, and all kinds of mapping predicates are there.
  2. If my literal reading of relationship in the OAK glossary is correct, it would seem that these mapping predicates should be included. The definition doesn't restrict relationships to a subset of predicates (though the distinction is clearer in the relationships guide docs):

    A Relationship is a type connection between two ontology elements. The first element is called the Subject, and the second one the Object, with the type of connection being the Predicate. Sometimes Relationships are equated with Triples in RDF but this can be confusing, because some relationships map to multiple triples when following the OWL RDF serialization. An example is the relationship “finger part-of hand”, which in OWL is represented using a Existential Restriction that maps to 4 triples.
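
The "4 triples" point in the glossary quote can be spelled out with a small sketch. This is an illustration of the OWL-to-RDF pattern for an existential restriction, not OAK code; `ex:finger` and `ex:hand` are placeholder identifiers, and the function names are hypothetical. BFO:0000050 is the usual part-of property.

```python
def existential_to_triples(subject, predicate, obj, bnode="_:r1"):
    """Expand one relationship into the four RDF triples of 'subject SubClassOf predicate some obj'."""
    return [
        (subject, "rdfs:subClassOf", bnode),     # the class is a subclass of an anonymous node
        (bnode, "rdf:type", "owl:Restriction"),  # that node is a restriction
        (bnode, "owl:onProperty", predicate),    # on the given property
        (bnode, "owl:someValuesFrom", obj),      # with the given filler class
    ]


def triples_to_relationships(triples):
    """Collapse restriction patterns back into (subject, predicate, object) relationships."""
    restrictions = {s for s, p, o in triples if p == "rdf:type" and o == "owl:Restriction"}
    properties = {s: o for s, p, o in triples if p == "owl:onProperty"}
    fillers = {s: o for s, p, o in triples if p == "owl:someValuesFrom"}
    relationships = []
    for s, p, o in triples:
        if p != "rdfs:subClassOf":
            continue
        if o in restrictions and o in properties and o in fillers:
            relationships.append((s, properties[o], fillers[o]))
        elif not o.startswith("_:"):
            # A named superclass is already a single-triple relationship.
            relationships.append((s, p, o))
    return relationships


triples = existential_to_triples("ex:finger", "BFO:0000050", "ex:hand")
roundtrip = triples_to_relationships(triples)
```

A mapping such as an oboInOwl:hasDbXref annotation, by contrast, is a single triple on the class, so this collapse step never touches it.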

joeflack4 commented 1 year ago

Another potential request: would it be too difficult to raise an error if the --predicates a user passes include something unexpected? It sounds like relationships knows precisely what subset of predicates it expects. Could we not check against that same subset to raise an error?

If the subset is not too large, it could be included in the docs as well.
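
A rough sketch of what such a check could look like, in plain Python. Everything here is an assumption for illustration: the thread doesn't list which predicates relationships accepts, so this version uses a deny-list of mapping predicates (the ones from my command, plus the `oio:` spelling seen in the SSSOM output) rather than an allow-list, and `check_relationship_predicates` is a hypothetical name, not an OAK function.

```python
# Assumed, illustrative set of mapping predicates; not taken from OAK's source.
MAPPING_PREDICATES = frozenset({
    "oboInOwl:hasDbXref",
    "oio:hasDbXref",
    "skos:exactMatch",
    "skos:narrowMatch",
    "skos:broadMatch",
    "skos:relatedMatch",
})


def check_relationship_predicates(predicates):
    """Return the predicates unchanged, or raise ValueError if any is a mapping predicate."""
    offending = sorted(p for p in predicates if p in MAPPING_PREDICATES)
    if offending:
        raise ValueError(
            "Not relationship predicates (try the mappings command instead): "
            + ", ".join(offending)
        )
    return list(predicates)
```

With a check like this, the command in (ii) would fail loudly instead of returning nothing, while `check_relationship_predicates(["rdfs:subClassOf"])` would pass through unchanged.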

cmungall commented 1 year ago

Great idea!

On Mon, Jul 17, 2023 at 4:53 PM Joe Flack @.***> wrote:

    Another potential request. Would it not be too difficult to raise an error if the --predicates a user passes includes something unexpected? It sounds like relationships knows precisely what subset of predicates it expects. Could we not check against that same subset to raise an error?

    If the subset is not too large, it could be included in the docs as well.


joeflack4 commented 1 year ago

Thanks @cmungall. I'd also like you to take a look at this other comment: https://github.com/INCATools/ontology-access-kit/issues/613#issuecomment-1638914583, particularly the point about how OMOP treats relationships.