protegeproject / protege

Protege Desktop
http://protege.stanford.edu

DCTerms Load Loop Error #1226

Open ohkio opened 1 month ago

ohkio commented 1 month ago

Hello Protege team!

I wanted to tinker a little with Protege by trying it out with the DC Terms ontology (RDF). When I did, I noticed an odd loop between loading a file and saving it. I am using the Linux distribution, version 5.6.4, in case that has any bearing.

Here are the steps I follow to recreate the error from scratch. First, I start Protege by running ./run.sh in the terminal, as the install documentation suggests. A window opens with a new ontology, which I rename 'test' for this recreation.

[Screenshot: 2024-07-25_14-33]

Then I click on the plus next to 'Direct Imports' and follow the wizard:

The first odd thing I notice is that the imported ontology is given a strange name. The name sometimes changes, but the format is: "OntologyID(Anonymous-#)"

[Screenshot: 2024-07-25_14-40]

The correct classes, properties, individuals, and datatypes are all there, but they are all shown as belonging to this "OntologyID(Anonymous-#)". In a different test, when I imported an ontology such as SKOS, Protege was able to use the shorthand 'skos'.

[Screenshot: 2024-07-25_14-43]

I recognize this naming might be a problem on the dcterms side, but I bring it up just in case because of the loop I'm in the middle of describing.

Next step is to simply save this file as is. It will be a very small and short ontology, but it will still show my point.

To save I am doing the following:

Now I close protege.

When I look at the XML file I made, the file is unsurprisingly very short:

<?xml version="1.0"?>
<rdf:RDF xmlns="http://www.semanticweb.org/*****/ontologies/2024/6/test/"
     xml:base="http://www.semanticweb.org/*****/ontologies/2024/6/test/"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:xml="http://www.w3.org/XML/1998/namespace"
     xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
     xmlns:dcam="http://purl.org/dc/dcam/"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
     xmlns:terms="http://purl.org/dc/terms/">
    <owl:Ontology rdf:about="http://www.semanticweb.org/*****/ontologies/2024/6/test">
        <owl:imports rdf:resource="http://purl.org/dc/terms/"/>
    </owl:Ontology>
</rdf:RDF>

<!-- Generated by the OWL API (version 4.5.29.2024-05-13T12:11:03Z) https://github.com/owlcs/owlapi -->

For privacy I am censoring the name associated with the account, but that shouldn't be relevant. The line I am focusing on is:

<owl:imports rdf:resource="http://purl.org/dc/terms/"/>
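
A quick way to confirm, outside Protege, whether a saved RDF/XML file still carries this declaration is to parse it and list the owl:imports targets. The following is a minimal sketch using only Python's standard library; the function name `list_owl_imports` is made up for illustration and is not part of Protege or the OWL API.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def list_owl_imports(rdfxml_text):
    """Return the IRIs named by owl:imports in an RDF/XML document.

    Hypothetical helper: it only understands the RDF/XML serialization.
    """
    root = ET.fromstring(rdfxml_text)
    imports = []
    # owl:imports normally sits inside the owl:Ontology header element,
    # but iter() finds it wherever it appears in the tree.
    for element in root.iter(f"{{{OWL}}}imports"):
        resource = element.get(f"{{{RDF}}}resource")
        if resource is not None:
            imports.append(resource)
    return imports
```

Run against the short file above, this should list http://purl.org/dc/terms/; run against a file whose import has been lost, the list should come back empty.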

Now I start Protege again using the same method as before and open this file:

I immediately notice some differences. First, 'Direct Imports' no longer lists dcterms:

[Screenshot: 2024-07-25_14-53]

And now some of the elements that were originally from dcterms are in bold text in the class list:

[Screenshot: 2024-07-25_14-54]

Now I create a single individual:

Then I save the file the normal way:

And I close Protege.

Now when I look at the RDF file, all of dcterms is recorded in my file and the owl:imports declaration is gone. Here is the first part of the file to give an idea; the full file is huge now:

<?xml version="1.0"?>
<rdf:RDF xmlns="http://www.semanticweb.org/*****/ontologies/2024/6/test/"
     xml:base="http://www.semanticweb.org/*****/ontologies/2024/6/test/"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:xml="http://www.w3.org/XML/1998/namespace"
     xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
     xmlns:dcam="http://purl.org/dc/dcam/"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
     xmlns:terms="http://purl.org/dc/terms/">
    <owl:Ontology rdf:about="http://www.semanticweb.org/*****/ontologies/2024/6/test"/>

    <!-- 
    ///////////////////////////////////////////////////////////////////////////////////////
    //
    // Annotation properties
    //
    ///////////////////////////////////////////////////////////////////////////////////////
     -->

    <!-- http://purl.org/dc/dcam/domainIncludes -->

    <owl:AnnotationProperty rdf:about="http://purl.org/dc/dcam/domainIncludes"/>

    <!-- http://purl.org/dc/dcam/rangeIncludes -->

    <owl:AnnotationProperty rdf:about="http://purl.org/dc/dcam/rangeIncludes"/>

    <!-- http://purl.org/dc/elements/1.1/contributor -->

    <owl:AnnotationProperty rdf:about="http://purl.org/dc/elements/1.1/contributor"/>

    <!-- http://purl.org/dc/elements/1.1/coverage -->

    <owl:AnnotationProperty rdf:about="http://purl.org/dc/elements/1.1/coverage"/>

    <!-- http://purl.org/dc/elements/1.1/creator -->

    <owl:AnnotationProperty rdf:about="http://purl.org/dc/elements/1.1/creator"/>

...

To get back to the way it was, I would have to delete all of the dcterms entries from the RDF file, then open the file in Protege and re-import dcterms. Something seems to be going wrong somewhere, and it may be two different issues. First, dcterms isn't recognized as an ontology when it is imported and is therefore given a generic name (the 'Anonymous-#' name). Second, when an RDF file containing an owl:imports of dcterms is loaded, that import is lost in Protege. Either way, something is not working properly. I would love to be able to use dcterms regularly, loading it from a file without having to re-import it every other time I open Protege.
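
The manual clean-up described above (delete the inlined dcterms entries, put the import back) could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper name `restore_import` is made up, it removes only top-level elements whose rdf:about starts with one of the three DC namespaces visible in the file header, and it has not been checked against a real flattened file, which may contain other inlined elements this filter misses. ElementTree also drops XML comments and unused namespace declarations when it rewrites the document.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ABOUT = f"{{{RDF}}}about"
RESOURCE = f"{{{RDF}}}resource"

# Namespaces of the terms that were inlined in the report above.
DC_NAMESPACES = (
    "http://purl.org/dc/terms/",
    "http://purl.org/dc/elements/1.1/",
    "http://purl.org/dc/dcam/",
)


def restore_import(rdfxml_text, import_iri="http://purl.org/dc/terms/"):
    """Drop inlined DC elements and re-add owl:imports (hypothetical helper)."""
    # Keep the familiar prefixes instead of ns0/ns1 in the output.
    ET.register_namespace("owl", OWL)
    ET.register_namespace("rdf", RDF)
    root = ET.fromstring(rdfxml_text)

    # Remove only top-level elements that describe a DC term.
    for child in list(root):
        if child.get(ABOUT, "").startswith(DC_NAMESPACES):
            root.remove(child)

    header = root.find(f"{{{OWL}}}Ontology")
    if header is None:
        raise ValueError("no owl:Ontology header found")

    # Add the import back unless it is already declared.
    declared = {e.get(RESOURCE) for e in header.findall(f"{{{OWL}}}imports")}
    if import_iri not in declared:
        ET.SubElement(header, f"{{{OWL}}}imports", {RESOURCE: import_iri})

    return ET.tostring(root, encoding="unicode")
```

It is worth working on a copy of the file, since anything that references a removed DC term is left as-is and relies on the import resolving again.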

Thank you!

ykazakov commented 1 month ago

Looks like there might be a problem with the RDF/XML parser. Try saving the ontology with the import in functional-style syntax:

Prefix(:=<http://www.semanticweb.org/demo/ontologies/2024/6/untitled-ontology-136/>)
Prefix(owl:=<http://www.w3.org/2002/07/owl#>)
Prefix(rdf:=<http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
Prefix(xml:=<http://www.w3.org/XML/1998/namespace>)
Prefix(xsd:=<http://www.w3.org/2001/XMLSchema#>)
Prefix(rdfs:=<http://www.w3.org/2000/01/rdf-schema#>)

Ontology(<http://www.semanticweb.org/demo/ontologies/2024/6/untitled-ontology-136>
Import(<http://purl.org/dc/terms/>)

)

For me, Protege 5.6.4 loads this file without losing any import declarations. When saving in RDF/XML syntax, the imports are indeed lost and all axioms appear directly. This is probably an OWL API issue.

The strange OntologyID is probably because the DC Terms ontology is not an OWL ontology and thus lacks the OWL-specific entries. Did you try downloading this ontology manually and looking at its content? The URL http://purl.org/dc/terms/ does not lead to any file for me.
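
If a copy of the document can be obtained as RDF/XML, the "OWL-specific entries" point can be checked by looking for an ontology header. The sketch below uses only Python's standard library; `has_ontology_header` is a made-up name, it handles RDF/XML only (not Turtle or other serializations), and it makes no claim about what the DC Terms document actually contains.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def has_ontology_header(rdfxml_text):
    """Return True if the RDF/XML document declares an owl:Ontology."""
    root = ET.fromstring(rdfxml_text)

    # Typed-node form: <owl:Ontology rdf:about="..."/>
    if root.find(f"{{{OWL}}}Ontology") is not None:
        return True

    # Long form: <rdf:Description> carrying rdf:type owl:Ontology
    for description in root.findall(f"{{{RDF}}}Description"):
        for rdf_type in description.findall(f"{{{RDF}}}type"):
            if rdf_type.get(f"{{{RDF}}}resource") == OWL + "Ontology":
                return True
    return False
```

A document for which this returns False has no ontology IRI to report, which would be consistent with an anonymous OntologyID being shown.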

ignazio1977 commented 1 month ago

Attempting to load the ontology at http://purl.org/dc/terms/ currently fails with a 403.

org.semanticweb.owlapi.model.OWLRuntimeException: 
org.semanticweb.owlapi.io.OWLOntologyCreationIOException: Server returned HTTP response code: 403 for URL: http://dublincore.org/specifications/dublin-core/dcmi-terms/dublin_core_terms#
at org.semanticweb.owlapi.api.test.baseclasses.TestBase.loadOntology(TestBase.java:605)
at org.semanticweb.owlapi.api.test.imports.ImportsTestCase.shouldSaveAndLoadImport(ImportsTestCase.java:328)

Depending on the Protege settings for handling unloadable imports, this might throw an error or be handled silently (with the effect that the import is ignored). That 403 looks like it depends on something having changed on the ontology side, rather than on OWLAPI code.
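
To see what the server returns for the import IRI independently of Protege and the OWL API, the request can be reproduced with Python's standard library. This is a sketch only: `probe` is a made-up helper, and the `Accept` value is an assumption rather than the header the OWL API actually sends.

```python
import urllib.error
import urllib.request


def probe(url, accept="application/rdf+xml", timeout=10):
    """Return (status_code, final_url) for a GET on url, following redirects."""
    request = urllib.request.Request(url, headers={"Accept": accept})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.url
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses are raised as exceptions; the URL that
        # produced the error (after any redirects) is still available.
        return error.code, error.url
```

Calling `probe("http://purl.org/dc/terms/")` shows the status and the URL the redirect chain ends at, which makes it easier to tell whether a failure comes from the server or from the loading code.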

Regarding losing imports and having axioms added directly to the ontology: this is also controlled by options. One option when loading RDF files that are not ontologies is to create an anonymous ontology to hold the axioms; but, being anonymous, it can't be imported explicitly. The other option is to add all axioms to the importing ontology. From the report above, it looks like both things have been tried?