althonos / pronto

A Python frontend to (Open Biomedical) Ontologies.
https://pronto.readthedocs.io
MIT License

How to extract all the terms from all the properties and the instances in an ontology? #184

Open ali3assi opened 1 year ago

ali3assi commented 1 year ago

I am a new user of pronto. I like its simplicity.

I am trying to extract all the terms in an ontology (.owl). Let's say we have the following ontology:

<?xml version="1.0"?>
<rdf:RDF xmlns="http://purl.obolibrary.org/obo/eco.owl#"
     xml:base="http://purl.obolibrary.org/obo/eco.owl"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:eco="http://purl.obolibrary.org/obo/eco#"
     xmlns:obo="http://purl.obolibrary.org/obo/"
     xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:xml="http://www.w3.org/XML/1998/namespace"
     xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
     xmlns:terms="http://purl.org/dc/terms/"
     xmlns:oboInOwl="http://www.geneontology.org/formats/oboInOwl#">
    <owl:Ontology rdf:about="http://purl.obolibrary.org/obo/eco.owl">
        <owl:versionIRI rdf:resource="http://purl.obolibrary.org/obo/eco/releases/2022-05-27/eco.owl"/>
        <dc:description rdf:datatype="http://www.w3.org/2001/XMLSchema#string">The Evidence &amp; Conclusion Ontology (ECO) describes types of scientific evidence within the biological research domain that arise from laboratory experiments, computational methods, literature curation, or other means.</dc:description>
        <dc:title rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Evidence &amp; Conclusion Ontology (ECO)</dc:title>
        <terms:license rdf:resource="https://creativecommons.org/publicdomain/zero/1.0/"/>
        <oboInOwl:date rdf:datatype="http://www.w3.org/2001/XMLSchema#string">27:05:2022 15:33</oboInOwl:date>
        <oboInOwl:default-namespace rdf:datatype="http://www.w3.org/2001/XMLSchema#string">eco</oboInOwl:default-namespace>
        <rdfs:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string">ECO (https://github.com/evidenceontology/evidenceontology) is released into the public domain under CC0 1.0 Universal (CC0 1.0). Anyone is free to copy, modify, or distribute the work, even for commercial purposes, without asking permission. Please see the Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/) for an easy-to-read description of CC0 1.0 or the full legal code (https://creativecommons.org/publicdomain/zero/1.0/legalcode) for more detailed information. To get a sense of why ECO is CC0 as opposed to licensed under CC-BY, please read this thoughtful discussion (https://github.com/OBOFoundry/OBOFoundry.github.io/issues/285) on the OBO Foundry GitHub site.</rdfs:comment>
    </owl:Ontology>

    <owl:Class rdf:about="http://purl.obolibrary.org/obo/GO_0061041">
        <oboInOwl:id>GO:0061041</oboInOwl:id>
        <rdfs:label>regulation of wound healing</rdfs:label>
        <rdfs:comment>Bill Gates </rdfs:comment>
    </owl:Class>    

</rdf:RDF>

In my main function, I write the following:

from pronto import Ontology

ontology_path = "PATH TO THE ONTOLOGY"
myOnto = Ontology(ontology_path)
# Iterate over every term pronto loaded from the file
for term in myOnto.terms():
    print(term)

This prints only the term whose name comes from the rdfs:label property: Term('GO:0061041', name='regulation of wound healing')

However, how can I also extract the text in the rdfs:comment property? It looks like only the label property is used.
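For comparison, here is a minimal sketch that reads the label and the comment of each owl:Class straight from the RDF/XML with Python's standard library. It does not use pronto at all. The function name `class_annotations` and the trimmed `SAMPLE` string are made up for illustration; the namespace URIs are the ones declared in the ontology header above. pronto entities also have a `comment` attribute, but whether it gets filled from rdfs:comment for a given file is something to check against the installed version.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the ontology header in the question.
NS = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "oboInOwl": "http://www.geneontology.org/formats/oboInOwl#",
}

# Trimmed copy of the example ontology (illustration only).
SAMPLE = """<rdf:RDF xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
     xmlns:oboInOwl="http://www.geneontology.org/formats/oboInOwl#">
    <owl:Class rdf:about="http://purl.obolibrary.org/obo/GO_0061041">
        <oboInOwl:id>GO:0061041</oboInOwl:id>
        <rdfs:label>regulation of wound healing</rdfs:label>
        <rdfs:comment>Bill Gates </rdfs:comment>
    </owl:Class>
</rdf:RDF>"""


def _text(element, path):
    """Return the stripped text of the first matching child, or None."""
    value = element.findtext(path, default=None, namespaces=NS)
    if value is None:
        return None
    return value.strip() or None


def class_annotations(xml_text):
    """Collect id, label and comment for every top-level owl:Class."""
    root = ET.fromstring(xml_text)
    rows = []
    for cls in root.findall("owl:Class", NS):
        rows.append(
            {
                "id": _text(cls, "oboInOwl:id"),
                "label": _text(cls, "rdfs:label"),
                "comment": _text(cls, "rdfs:comment"),
            }
        )
    return rows


if __name__ == "__main__":
    for row in class_annotations(SAMPLE):
        print(row)
```

To run it on a real file, the file contents can be read into a string and passed to `class_annotations`, or `ET.parse(path).getroot()` can replace `ET.fromstring`.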

Similarly, how can I extract terms from the instances in the ontology? Let's say:

  <owl:NamedIndividual rdf:about="http://purl.obolibrary.org/obo/ECO_0000263">
....    </owl:NamedIndividual>
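To make clear what I mean by instances, here is a rough sketch that lists every owl:NamedIndividual with its IRI and any literal child values, again using only the standard library rather than pronto. The function name `named_individuals` is hypothetical, and the rdfs:label inside `SAMPLE` is a placeholder I invented, since the body of the individual is elided above.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The rdf:about IRI comes from the question; the label is a placeholder.
SAMPLE = """<rdf:RDF xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
    <owl:NamedIndividual rdf:about="http://purl.obolibrary.org/obo/ECO_0000263">
        <rdfs:label>placeholder label</rdfs:label>
    </owl:NamedIndividual>
</rdf:RDF>"""


def named_individuals(xml_text):
    """Map each individual's IRI to a list of (tag, text) literal children."""
    root = ET.fromstring(xml_text)
    result = {}
    for individual in root.iter(f"{{{OWL}}}NamedIndividual"):
        iri = individual.get(f"{{{RDF}}}about")
        # Keep only children that carry non-empty text (literal annotations).
        literals = [
            (child.tag, child.text.strip())
            for child in individual
            if child.text and child.text.strip()
        ]
        result[iri] = literals
    return result


if __name__ == "__main__":
    for iri, literals in named_individuals(SAMPLE).items():
        print(iri, literals)
```

The tags come back in ElementTree's `{namespace}name` form, so a label shows up as `{http://www.w3.org/2000/01/rdf-schema#}label`.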