owlcs / owlapi

OWL API main repository

OWLAPI injects rdfs:label assertions for oboInOwl when converting from obo format #1134

Closed cmungall closed 2 months ago

cmungall commented 2 months ago

At some point in its recent history, the OWLAPI started injecting rdfs:label assertions when converting from OBO format.

E.g.

format-version: 1.4
ontology: comment

[Term]
id: X:1
comment: "This is a comment about term X:1."

generates

<?xml version="1.0"?>
<rdf:RDF xmlns="http://purl.obolibrary.org/obo/comment.owl#"
     xml:base="http://purl.obolibrary.org/obo/comment.owl"
     xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:xml="http://www.w3.org/XML/1998/namespace"
     xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
     xmlns:oboInOwl="http://www.geneontology.org/formats/oboInOwl#">
    <owl:Ontology rdf:about="http://purl.obolibrary.org/obo/comment.owl">
        <oboInOwl:hasOBOFormatVersion>1.4</oboInOwl:hasOBOFormatVersion>
    </owl:Ontology>

    <!-- 
    ///////////////////////////////////////////////////////////////////////////////////////
    //
    // Annotation properties
    //
    ///////////////////////////////////////////////////////////////////////////////////////
     -->

    <!-- http://www.geneontology.org/formats/oboInOwl#hasOBOFormatVersion -->

    <owl:AnnotationProperty rdf:about="http://www.geneontology.org/formats/oboInOwl#hasOBOFormatVersion">
        <rdfs:label>has_obo_format_version</rdfs:label>
    </owl:AnnotationProperty>

    <!-- http://www.geneontology.org/formats/oboInOwl#id -->

    <owl:AnnotationProperty rdf:about="http://www.geneontology.org/formats/oboInOwl#id">
        <rdfs:label>id</rdfs:label>
    </owl:AnnotationProperty>

    <!-- http://www.w3.org/2000/01/rdf-schema#comment -->

    <owl:AnnotationProperty rdf:about="http://www.w3.org/2000/01/rdf-schema#comment"/>

    <!-- 
    ///////////////////////////////////////////////////////////////////////////////////////
    //
    // Classes
    //
    ///////////////////////////////////////////////////////////////////////////////////////
     -->

    <!-- http://purl.obolibrary.org/obo/X_1 -->

    <owl:Class rdf:about="http://purl.obolibrary.org/obo/X_1">
        <oboInOwl:id>X:1</oboInOwl:id>
        <rdfs:comment>&quot;This is a comment about term X:1.&quot;</rdfs:comment>
    </owl:Class>
</rdf:RDF>

<!-- Generated by the OWL API (version 4.5.26) https://github.com/owlcs/owlapi -->

This isn't part of the specification. We could change the spec to reflect the current behavior, but I would rather the behavior be removed.

Originally posted by @cmungall in https://github.com/owlcs/owlapi/issues/1102#issuecomment-2089088246
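
To make the injected labels easy to spot in output like the RDF/XML above, here is a minimal sketch in Python that lists every rdfs:label nested inside an owl:AnnotationProperty element. It uses only the standard library and does not call the OWL API. The function name `annotation_property_labels` and the `SAMPLE` string (a trimmed copy of the output above) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def annotation_property_labels(rdfxml: str) -> dict:
    """Map each owl:AnnotationProperty IRI to the rdfs:label values nested in it.

    Hypothetical helper: it only inspects the RDF/XML layout shown in this
    thread, where labels are direct children of the property element.
    """
    root = ET.fromstring(rdfxml)
    labels = {}
    for prop in root.iter(f"{{{OWL}}}AnnotationProperty"):
        iri = prop.get(f"{{{RDF}}}about")
        found = [el.text for el in prop.findall(f"{{{RDFS}}}label")]
        if iri and found:
            labels[iri] = found
    return labels


# Trimmed copy of the annotation property section from the output above.
SAMPLE = """<rdf:RDF xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
    <owl:AnnotationProperty rdf:about="http://www.geneontology.org/formats/oboInOwl#hasOBOFormatVersion">
        <rdfs:label>has_obo_format_version</rdfs:label>
    </owl:AnnotationProperty>
    <owl:AnnotationProperty rdf:about="http://www.geneontology.org/formats/oboInOwl#id">
        <rdfs:label>id</rdfs:label>
    </owl:AnnotationProperty>
    <owl:AnnotationProperty rdf:about="http://www.w3.org/2000/01/rdf-schema#comment"/>
</rdf:RDF>
"""

if __name__ == "__main__":
    for iri, values in annotation_property_labels(SAMPLE).items():
        print(iri, values)
```

Properties without a nested label, such as rdfs:comment here, are left out of the result, so whatever remains is a candidate for an injected label.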

ignazio1977 commented 2 months ago

Just for clarity, the extra labels are these two?

    <rdfs:label>has_obo_format_version</rdfs:label>

    <rdfs:label>id</rdfs:label>

balhoff commented 2 months ago

I added id, created_by, and creation_date to the OBO constants enum in this PR: https://github.com/owlcs/owlapi/pull/1099/files#diff-624125d6c0fdd02198cff3e52ece7ed9a49652cc11b4b68bf8a7970a9e35930d

But I didn't make any changes to has_obo_format_version, so that's a little confusing.

cmungall commented 2 months ago

@ignazio1977 - correct

OK, so it looks like this sort of behavior has been in the OWLAPI for some properties for quite some time. Using 4.5.6, this input:

format-version: 1.4
ontology: comment

[Term]
id: X:1
name: x1
def: "x" [foo:1]
comment: "This is a comment about term X:1."
xref: Y:1

yields three injected labels:

# Annotation Property: <http://purl.obolibrary.org/obo/IAO_0000115> (definition)

AnnotationAssertion(rdfs:label <http://purl.obolibrary.org/obo/IAO_0000115> "definition"^^xsd:string)

# Annotation Property: <http://www.geneontology.org/formats/oboInOwl#hasDbXref> (database_cross_reference)

AnnotationAssertion(rdfs:label <http://www.geneontology.org/formats/oboInOwl#hasDbXref> "database_cross_reference"^^xsd:string)

# Annotation Property: <http://www.geneontology.org/formats/oboInOwl#hasOBOFormatVersion> (has_obo_format_version)

AnnotationAssertion(rdfs:label <http://www.geneontology.org/formats/oboInOwl#hasOBOFormatVersion> "has_obo_format_version"^^xsd:string)

The latest version yields four:

# Annotation Property: <http://purl.obolibrary.org/obo/IAO_0000115> (definition)

AnnotationAssertion(rdfs:label <http://purl.obolibrary.org/obo/IAO_0000115> "definition")

# Annotation Property: <http://www.geneontology.org/formats/oboInOwl#hasDbXref> (database_cross_reference)

AnnotationAssertion(rdfs:label <http://www.geneontology.org/formats/oboInOwl#hasDbXref> "database_cross_reference")

# Annotation Property: <http://www.geneontology.org/formats/oboInOwl#hasOBOFormatVersion> (has_obo_format_version)

AnnotationAssertion(rdfs:label <http://www.geneontology.org/formats/oboInOwl#hasOBOFormatVersion> "has_obo_format_version")

# Annotation Property: <http://www.geneontology.org/formats/oboInOwl#id> (id)

AnnotationAssertion(rdfs:label <http://www.geneontology.org/formats/oboInOwl#id> "id")

So @balhoff's changes just make things more consistent, and I'm less troubled than I was. I would still like the id label gone, but that is more of an aesthetic concern.
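
The comparison between the two versions can also be done mechanically. The sketch below, in Python with only the standard library, extracts the rdfs:label assertions from the two functional-syntax listings above and reports which properties are labelled only in the newer output. The names `LABEL_RE`, `labelled_properties`, `OLD` and `NEW` are hypothetical, and the regular expression assumes the exact one-line assertion layout shown in this thread rather than the full functional-syntax grammar.

```python
import re

# Matches label assertions as printed above, with or without ^^xsd:string.
LABEL_RE = re.compile(
    r'AnnotationAssertion\(rdfs:label <([^>]+)> "([^"]*)"(?:\^\^xsd:string)?\)'
)


def labelled_properties(functional_syntax: str) -> dict:
    """Return {property IRI: label} for every rdfs:label assertion found."""
    return {iri: label for iri, label in LABEL_RE.findall(functional_syntax)}


# Label assertions copied from the 4.5.6 listing above.
OLD = '''
AnnotationAssertion(rdfs:label <http://purl.obolibrary.org/obo/IAO_0000115> "definition"^^xsd:string)
AnnotationAssertion(rdfs:label <http://www.geneontology.org/formats/oboInOwl#hasDbXref> "database_cross_reference"^^xsd:string)
AnnotationAssertion(rdfs:label <http://www.geneontology.org/formats/oboInOwl#hasOBOFormatVersion> "has_obo_format_version"^^xsd:string)
'''

# Label assertions copied from the listing for the latest version above.
NEW = '''
AnnotationAssertion(rdfs:label <http://purl.obolibrary.org/obo/IAO_0000115> "definition")
AnnotationAssertion(rdfs:label <http://www.geneontology.org/formats/oboInOwl#hasDbXref> "database_cross_reference")
AnnotationAssertion(rdfs:label <http://www.geneontology.org/formats/oboInOwl#hasOBOFormatVersion> "has_obo_format_version")
AnnotationAssertion(rdfs:label <http://www.geneontology.org/formats/oboInOwl#id> "id")
'''

# Properties that carry a label only in the newer output.
added = sorted(set(labelled_properties(NEW)) - set(labelled_properties(OLD)))

if __name__ == "__main__":
    print(added)
```

On these two listings the only newly labelled property is oboInOwl#id; the other three labels differ only in the explicit ^^xsd:string suffix.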