owlcs / owlapi

OWL API main repository

OWLAPI parser ignoring catalog-v001.xml and failing to follow redirect when loading imported ontologies #1011

Closed: rpgoldman closed this issue 2 years ago

rpgoldman commented 3 years ago

I have had this issue when using OWLAPI indirectly through both Protege and HermiT. I have an ontology whose IRI is https://www.dropbox.com/s/s1e2dzw64m01f9n/container-ontology.ttl. When OWLAPI tries to load it as an imported ontology from another ontology, I get this error message (among others):

Parser: TurtleOntologyParser
uk.ac.manchester.cs.owl.owlapi.turtle.parser.ParseException: Encountered "" at line 1, column 1.
Was expecting one of:

--------------------------------------------------------------------------------

Looking at the other error messages, it is clear that OWLAPI is trying to parse the HTML response page that accompanies a 301 (Moved Permanently) status code instead of the ontology itself. For example:

Parser: ManchesterOWLSyntaxOntologyParser
Encountered '<!DOCTYPE html><html class="maestro dig-Theme--VIS" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml"><head><script nonce="OXfqsdi8Bj21N993tQ/3">' at line 1 column 1.  Expected either 'Ontology:' or 'Prefix:' (Line 1)

I think there are two issues here:

  1. OWLAPI is not following the relocation link and loading the actual file.
  2. OWLAPI is ignoring the catalog-v001.xml file in the current working directory:

    $ fgrep https://www.dropbox.com/s/s1e2dzw64m01f9n/container-ontology.ttl catalog-v001.xml
    <uri id="Imports Wizard Entry" name="https://www.dropbox.com/s/s1e2dzw64m01f9n/container-ontology.ttl" uri="container-ontology.ttl"/>

The catalog file is as follows:

catalog-v001.xml.txt

I am not a Java programmer, so I am not able to diagnose this much further myself. But if you could point me at some action I could take to test this and report back, I would be delighted to try. For example, if there is some direct way to invoke OWLAPI instead of going through Protege or HermiT, that would obviously be preferable.

Replication

To replicate, you can try loading the following ontology file:

strateos-catalog-individuals.ttl.txt

I got the failure doing the following:

java -jar ~/src/HermiT/HermiT.jar strateos-catalog-individuals.ttl

Adding the --verbose=3 flag did not get me any more information because the system tries multiple parsers and only errors out when all have failed.

Remarks on the 301

Connection #0 to host www.dropbox.com left intact
Closing connection 0

ignazio1977 commented 2 years ago

The support for catalog files is not automatic - a mapper needs to be specified via code. HermiT doesn't do that, so this won't work from the command line tool for HermiT.
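To make concrete what such a mapper has to know, here is a minimal sketch using only the JDK. It reads the `<uri name="..." uri="..."/>` entries of a catalog like the one quoted in the report and returns a map from ontology IRI to local path. The names `CatalogSketch` and `parseCatalog` are hypothetical, this is not how OWLAPI or Protege implement catalog support, and other catalog features (groups, rewrite rules, `xml:base`) are ignored.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of OWLAPI: extracts IRI -> local path pairs
// from the <uri> entries of an XML catalog.
public class CatalogSketch {

    public static Map<String, String> parseCatalog(String catalogXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Namespace awareness lets us match <uri> whatever prefix the catalog uses.
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(catalogXml.getBytes(StandardCharsets.UTF_8)));

            NodeList entries = doc.getElementsByTagNameNS("*", "uri");
            Map<String, String> mappings = new LinkedHashMap<>();
            for (int i = 0; i < entries.getLength(); i++) {
                Element entry = (Element) entries.item(i);
                // "name" is the ontology IRI; "uri" is the replacement location,
                // usually a path relative to the catalog file's directory.
                mappings.put(entry.getAttribute("name"), entry.getAttribute("uri"));
            }
            return mappings;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse catalog", e);
        }
    }
}
```

Each pair would still have to be turned into an IRI mapper and registered with the ontology manager before loading; resolving the relative path against the catalog's directory is left out of this sketch.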

Protege supports catalog files and should be supplying the right mapping to OWLAPI (class OntologyLoader in Protege seems to be doing just that).

Can you mention the Protege and HermiT versions?

Following redirects has been supported for a while but there have been bugs fixed in relatively recent OWLAPI versions (4.5.17 and 4.5.18). Most likely Protege and HermiT are not using a recent enough version.
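To check whether a redirect is involved independently of Protege or HermiT, the redirect chain can be followed by hand. The following is a minimal sketch with the JDK's `HttpURLConnection`; it is not OWLAPI's own redirect handling, and `RedirectSketch`, `resolveFinalUrl`, and the limit of five hops are illustrative assumptions.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical diagnostic helper, not part of OWLAPI.
public class RedirectSketch {

    private static final int MAX_HOPS = 5; // arbitrary limit chosen for this sketch

    // True for the HTTP status codes that carry a Location header to follow.
    public static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
            || status == 307 || status == 308;
    }

    // A Location header may be relative, so resolve it against the request URL.
    public static String resolveLocation(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location).toString();
    }

    // Follows redirects manually and returns the first URL that does not redirect.
    public static String resolveFinalUrl(String startUrl) throws IOException {
        String current = startUrl;
        for (int hop = 0; hop < MAX_HOPS; hop++) {
            HttpURLConnection conn =
                (HttpURLConnection) URI.create(current).toURL().openConnection();
            conn.setInstanceFollowRedirects(false); // inspect each hop ourselves
            conn.setRequestMethod("HEAD");          // some servers answer HEAD differently from GET
            int status = conn.getResponseCode();
            String location = conn.getHeaderField("Location");
            conn.disconnect();
            if (!isRedirect(status) || location == null) {
                return current;
            }
            current = resolveLocation(current, location);
        }
        throw new IOException("More than " + MAX_HOPS + " redirects from " + startUrl);
    }
}
```

If the URL returned for the ontology IRI still serves an HTML page rather than Turtle, the parser errors quoted in the report would be expected regardless of the OWLAPI version.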

For catalog support outside Protege, OWLAPI supports catalog files if they are included in a zip file, i.e., the ontology files and the catalog file are zipped together. Then the following can be used:

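    // Zip archive containing the ontology files together with the catalog file
    File file = new File("owlzipwithcatalog.zip");
    // The mapper takes its IRI-to-file mappings from the catalog in the zip
    OWLZipClosureIRIMapper source = new OWLZipClosureIRIMapper(file);
    // Register the mapper before loading so that imports resolve through it
    ontologyManager.getIRIMappers().add(source);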

ontologyManager can then be used to load any ontology whose IRI the catalog file in the zip file remaps to a file name.
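Producing such an archive needs nothing beyond the JDK. The sketch below writes a set of files into one zip with flat entry names. Placing the catalog at the top level of the archive is an assumption, since nothing above says where OWLAPI expects it; `ZipWithCatalogSketch` and `zipFiles` are hypothetical names.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of OWLAPI: bundles ontology files and a
// catalog file into a single zip archive.
public class ZipWithCatalogSketch {

    public static void zipFiles(Path zip, List<Path> files) {
        try (OutputStream out = Files.newOutputStream(zip);
             ZipOutputStream zos = new ZipOutputStream(out)) {
            for (Path file : files) {
                // Assumption: entries are stored flat, under their file name only.
                zos.putNextEntry(new ZipEntry(file.getFileName().toString()));
                Files.copy(file, zos);
                zos.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With flat entry names, the `uri` attributes in the catalog (for example `container-ontology.ttl`) line up with the entry names in the archive.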