edmcouncil / ontology-publisher

The owl-builder "builds" publishable / deployable versions (and derived products) of a given set of ontologies (such as FIBO)
MIT License

Exception occurred while getting imported ontology No plugin registered for (text/plain, <class 'rdflib.parser.Parser'>) #156

Closed przemekgradzki closed 4 months ago

przemekgradzki commented 4 months ago

When the "Merging all dev ontologies into one RDF file" and "Merging all prod ontologies into one RDF file" steps run, the following code is used to import ontologies (owl:imports):

https://github.com/edmcouncil/ontology-publisher/blob/c459b99b72f2831effde781f40e993836d622b26/publisher/lib/ontology_collector.py#L41

When a remote file is loaded from a server (e.g. GitHub) that returns a Content-Type: header whose MIME type does not match the actual contents of the file, this step fails with the following message:

Exception occurred while getting imported ontology No plugin registered for (text/plain, <class 'rdflib.parser.Parser'>)


The problem can be seen with the Friend of a Friend (FOAF) vocabulary ontology (rdf:about=http://xmlns.com/foaf/0.1/), copies of which are hosted in several locations:

  1. Executing the following command:

    curl -L -v http://xmlns.com/foaf/spec/index.rdf 2>&1 >/dev/null | grep -i 'Content-Type:' | tail -n 1

    returns the correct value of the Content-Type: field for location http://xmlns.com/foaf/spec/index.rdf (see application/rdf+xml Media Type Registration):

    < Content-Type: application/rdf+xml

  2. Executing the same command for a copy of the same ontology hosted on the GitHub server:

    curl -L -v https://github.com/LodLive/LodView/raw/master/src/main/webapp/WEB-INF/ontologies/foaf.rdf 2>&1 >/dev/null | grep -i 'Content-Type:' | tail -n 1

    returns an incorrect value:

    < Content-Type: text/plain; charset=utf-8
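The same check can be done from Python, which may be handy when testing many import URLs. This is a minimal sketch using only the standard library; the function name `final_content_type` and the default timeout are illustrative assumptions and are not part of ontology-publisher.

```python
import urllib.request


def final_content_type(url: str, timeout: float = 30.0) -> str:
    """Return the Content-Type header of the final response for `url`.

    Hypothetical helper, not part of ontology-publisher.
    """
    # urlopen follows HTTP redirects by default, similar to `curl -L`,
    # so the header comes from the last response in the redirect chain.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.headers.get("Content-Type", "")


if __name__ == "__main__":
    # The two FOAF locations discussed above; the values printed depend on
    # what the servers return at the time of the request.
    for location in (
        "http://xmlns.com/foaf/spec/index.rdf",
        "https://github.com/LodLive/LodView/raw/master/src/main/webapp/WEB-INF/ontologies/foaf.rdf",
    ):
        print(location, "->", final_content_type(location))
```

The returned string still includes any parameters such as `charset=utf-8`, so callers that only need the MIME type should split on `;` first.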


A similar situation occurs for the ontology file http://purl.obolibrary.org/obo/bfo/2020/bfo.owl, which redirects (HTTP/1.1 302 Found) to the following location:

https://raw.githubusercontent.com/BFO-ontology/BFO-2020/release-2024-01-29/src/owl/profiles/temporal%20extensions/temporalized%20relations/owl/bfo-temporalized-relations.owl
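One possible workaround, sketched here as an assumption rather than as the fix adopted in the repository, is to trust the Content-Type header only when it names a known RDF media type and otherwise fall back to the file extension in the URL. The helper name `pick_rdf_format` and both lookup tables are hypothetical; the returned strings are meant to match rdflib parser names ("xml", "turtle", "nt", "n3", "json-ld"). Treating `.owl` as RDF/XML is a convention and not a guarantee.

```python
import posixpath
from typing import Optional
from urllib.parse import urlparse

# Hypothetical lookup tables; extend as needed.
MIME_TO_FORMAT = {
    "application/rdf+xml": "xml",
    "text/turtle": "turtle",
    "application/n-triples": "nt",
    "text/n3": "n3",
    "application/ld+json": "json-ld",
}
EXTENSION_TO_FORMAT = {
    ".rdf": "xml",
    ".owl": "xml",  # assumption: .owl files are usually RDF/XML
    ".ttl": "turtle",
    ".nt": "nt",
    ".n3": "n3",
    ".jsonld": "json-ld",
}


def pick_rdf_format(content_type: str, url: str) -> Optional[str]:
    """Choose a parser name from the header, else from the URL extension."""
    # Drop parameters such as "; charset=utf-8" before the lookup.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in MIME_TO_FORMAT:
        return MIME_TO_FORMAT[media_type]
    # The header is missing or generic (e.g. text/plain): use the extension.
    extension = posixpath.splitext(urlparse(url).path)[1].lower()
    return EXTENSION_TO_FORMAT.get(extension)
```

The result could then be passed explicitly as the `format` argument of rdflib's `Graph.parse`, so that the parser is not selected from the misleading `text/plain` header. When the function returns `None`, the caller still has to decide how to handle the file.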

przemekgradzki commented 4 months ago

Steps to reproduce (assuming python3 is installed):

# install "rdflib" library
pip install rdflib
# clone repository
git clone https://github.com/edmcouncil/ontology-publisher

# ---
# the following command
python3 ontology-publisher/publisher/lib/ontology_collector.py \
  --input_ontology https://github.com/LodLive/LodView/raw/master/src/main/webapp/WEB-INF/ontologies/foaf.rdf \
  --ontology-mapping <(echo '') \
  --output_ontology foaf.ttl

# produces an empty "foaf.ttl" file and the error:
#     Exception occurred while getting imported ontology No plugin registered for (text/plain, <class 'rdflib.parser.Parser'>)

# ---
# the following command
python3 ontology-publisher/publisher/lib/ontology_collector.py \
  --input_ontology http://xmlns.com/foaf/spec/index.rdf \
  --ontology-mapping <(echo '') \
  --output_ontology foaf.ttl

# returns "success" and a non-empty "foaf.ttl" file