mmisw / mmiorr

Unmaintained old MMI ORR system (v2) -- New development at https://github.com/mmisw/orr

issues on upload of existing CSDMS ontology #338

Closed graybeal closed 9 years ago

graybeal commented 9 years ago

As a test, I tried to upload the ontology found at http://www.isi.edu/ikcap/geosoft/ontology/CSDMS.owl (after discussion with Scott Peckham). Several unexpected results occurred, some for obvious reasons. I've put all the issues here, since some may have been created by issue 1, the bad base.

I'd like to settle on the best way to produce a better load.

For easy reference, the beginning of the OWL file looks like

<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [
    <!ENTITY owl "http://www.w3.org/2002/07/owl#" >
    <!ENTITY xsd "http://www.w3.org/2001/XMLSchema#" >
    <!ENTITY rdfs "http://www.w3.org/2000/01/rdf-schema#" >
    <!ENTITY rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#" >
    <!ENTITY ts "http://www.isi.edu/ikcap/geosoft/ontology/software.owl#" >
]>

<rdf:RDF xmlns="http://www.isi.edu/ikcap/geosoft/ontology/CSDMS.owl#"
    xml:base="http://www.isi.edu/ikcap/geosoft/ontology/CSDMS.owl"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:ts="http://www.isi.edu/ikcap/geosoft/ontology/software.owl#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

    <owl:Ontology rdf:about="http://www.isi.edu/ikcap/geosoft/ontology/CSDMS.owl">
        <owl:imports rdf:resource="http://www.isi.edu/ikcap/geosoft/ontology/software.owl"/>
    </owl:Ontology>

    <!-- Objects -->
    <ts:Object rdf:ID="object_air">
        <rdfs:label>air</rdfs:label>
    </ts:Object>

1) The ontology got loaded with the full path above, including CSDMS.owl. This is perhaps reasonable given the file, and the problem can be assessed as poor ontology publication practice. (Specifically, I'm guessing that making the base end in '.owl' is bad practice.) Note that the OWL file has been produced by a tool.

2) I can't see a page for any term (that I tried) by appending '/' and the term ID. All the terms do show up in the ontology's page.

3) There are a very large number of terms (4892 Individuals, with many attributes), which makes the UI choppy/slow in displaying that dropdown list.

4) I thought an ontology was marked as a test ontology by having Testing in the name, but apparently it must be in the path? That is awkward when the ontology already exists.

5) Using the instructions in our technical details page, I tried to set the ontology to be a test ontology. The results appeared correct in the terminal window:

mysql> update ncbo_ontology_version set version_status='testing' where ontology_id='1690';
Query OK, 1 row affected (0.02 sec)
Rows matched: 1  Changed: 1  Warnings: 0

but the overview page doesn't show it as a testing ontology. Interestingly, the Versions pop-up does show it as a testing ontology.

graybeal commented 9 years ago

This test is being performed in preparation for an initial discussion/review with Scott Peckham and his team; we don't have to find the perfect answer right away. So I can change the OWL file in whatever ways make the most sense for the demonstration, but we can expect multiple updates of the content over time. So we'll need to integrate our solution back into the production process.

carueda commented 9 years ago

1) I don't see any issue here, in particular if we are under the re-hosted mode of registration. This ontology's URI is http://www.isi.edu/ikcap/geosoft/ontology/CSDMS.owl:

<rdf:RDF xmlns="http://www.isi.edu/ikcap/geosoft/ontology/CSDMS.owl#"
    xml:base="http://www.isi.edu/ikcap/geosoft/ontology/CSDMS.owl"
    ...
    <owl:Ontology rdf:about="http://www.isi.edu/ikcap/geosoft/ontology/CSDMS.owl">

so it should be hosted as such. (Again, unless they want a fully-hosted entry in the ORR, in which case all entities under the http://www.isi.edu/ikcap/geosoft/ontology/CSDMS.owl# namespace would get "transferred" to, say, http://mmisw.org/ont/geosoft/csdms, depending on which authority and acronym are used.)

2) This ontology uses # as the fragment separator

<rdf:RDF xmlns="http://www.isi.edu/ikcap/geosoft/ontology/CSDMS.owl#"

so use # instead of /. For example, the URI of the object_air entity is http://www.isi.edu/ikcap/geosoft/ontology/CSDMS.owl#object_air. With our access mechanisms:

http://mmisw.org/orr#http://www.isi.edu/ikcap/geosoft/ontology/CSDMS.owl#object_air

http://mmisw.org/ont?uri=http://www.isi.edu/ikcap/geosoft/ontology/CSDMS.owl%23object_air

Note, in this case # needs to be encoded, %23; otherwise it won't be seen by Ont.

3) yes, known issue. #156

4) It's always been in the path (specifically, the authority abbreviation part).

5) Perhaps it just took a little time to be reflected; not sure. But I can see it as a testing entry.
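
As a minimal sketch of point 2 above, the two access URLs can be built from a term URI as follows. This is illustrative only: the helper names `portal_url` and `ont_service_url` are hypothetical, and the code assumes nothing about the services beyond the two URL patterns quoted in point 2. The key detail is that `#` must be percent-encoded as `%23` when the term URI is passed as a query parameter.

```python
from urllib.parse import quote, urldefrag

# Base URLs taken from the examples in point 2.
ORR_PORTAL = "http://mmisw.org/orr"
ONT_SERVICE = "http://mmisw.org/ont"


def portal_url(term_uri: str) -> str:
    """Portal form: the whole term URI goes after the portal's own '#'."""
    return f"{ORR_PORTAL}#{term_uri}"


def ont_service_url(term_uri: str) -> str:
    """Query form: the term URI is a parameter, so its '#' must become %23.

    safe=':/' keeps the scheme and path readable while still encoding '#'.
    """
    return f"{ONT_SERVICE}?uri={quote(term_uri, safe=':/')}"


term = "http://www.isi.edu/ikcap/geosoft/ontology/CSDMS.owl#object_air"

# Split the term URI into the ontology URI and the fragment (the term ID).
namespace, fragment = urldefrag(term)

print(portal_url(term))
print(ont_service_url(term))
```

Without the encoding, everything after the `#` is treated as a fragment by the client and is never sent to the server, which is why the unencoded form is not seen by Ont.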

graybeal commented 9 years ago

(1) Yes, we are re-hosting, and nothing is broken with our ingest. What I'm noticing is that we are not generating URIs that match our own description of how URIs are generated (with a / separator, and just using the authority). I recognize the motivation to match the namespace pattern; that is a clever catch. It just breaks expectations in our UI. (Because in at least one sense, our URI is not the URI of the namespace. And how will a user know which to use? But I see that's an almost unresolvable dilemma. I'll write a new issue with an idea, probably #339.)

(2) The other bit that was causing this problem is that not every token displayed in our drop-downs was being included in the final set of resolvable terms. It turns out (I'm guessing, based on a few tests) that the drop-downs include all the terms from the included ontologies ('software', in this case), but we only resolve the terms that are actually in the main ontology that's loaded. Again, a problem is that one can't tell the difference by looking at the UI. Will make issue #340.

(3) cool

(4) Ah. I will add #341 to add that information to the pop-up help for authority. (And also for acronym, that the acronym helps form the path of the term for local hosted ontologies.)

(5) Yes, likewise. Probably a cache issue, now that I think about it.

graybeal commented 9 years ago

All the open concerns have been allocated to other issues.

carueda commented 9 years ago

I'm not following everything you are saying, but in short: if this ontology is for re-hosting, then the ORR should respect the URIs (that of the ontology itself, those of all entities, and actually everything within the ontology) exactly as given. A different aspect is best practices for naming things, etc., which is, strictly speaking, outside of the tool itself.