ncbo / bioportal-project

Serves to consolidate (in Zenhub) all public issues in BioPortal

MODSCI: fails to process #138

Open jvendetti opened 4 years ago

jvendetti commented 4 years ago

The system is unable to process the MODSCI ontology. It passes the OWL API parsing step, but then fails with the following stack trace:

E, [2019-10-08T14:30:54.945513 #6681] ERROR -- : ["Exception: Rapper cannot parse turtle file at /tmp/data_triple_store20191008-6681-1bkbecz: rapper: Parsing URI file:///tmp/data_triple_store20191008-6681-1bkbecz with parser turtle
rapper: Serializing with serializer ntriples
rapper: Error - URI file:///tmp/data_triple_store20191008-6681-1bkbecz:7 - syntax error at '<'
rapper: Parsing returned 6 triples

/srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.5.0/bundler/gems/goo-813f5c5c4e47/lib/goo/sparql/client.rb:59:in `bnodes_filter_file'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.5.0/bundler/gems/goo-813f5c5c4e47/lib/goo/sparql/client.rb:80:in `append_triples_no_bnodes'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.5.0/bundler/gems/goo-813f5c5c4e47/lib/goo/sparql/client.rb:121:in `append_data_triples'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.5.0/bundler/gems/goo-813f5c5c4e47/lib/goo/sparql/client.rb:147:in `append_triples'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.5.0/bundler/gems/ontologies_linked_data-57e648194224/lib/ontologies_linked_data/models/ontology_submission.rb:632:in `generate_missing_labels_pre'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.5.0/bundler/gems/ontologies_linked_data-57e648194224/lib/ontologies_linked_data/models/ontology_submission.rb:550:in `call'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.5.0/bundler/gems/ontologies_linked_data-57e648194224/lib/ontologies_linked_data/models/ontology_submission.rb:550:in `block (2 levels) in loop_classes'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.5.0/bundler/gems/ontologies_linked_data-57e648194224/lib/ontologies_linked_data/models/ontology_submission.rb:504:in `block in process_callbacks'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.5.0/bundler/gems/ontologies_linked_data-57e648194224/lib/ontologies_linked_data/models/ontology_submission.rb:500:in `delete_if'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.5.0/bundler/gems/ontologies_linked_data-57e648194224/lib/ontologies_linked_data/models/ontology_submission.rb:500:in `process_callbacks'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.5.0/bundler/gems/ontologies_linked_data-57e648194224/lib/ontologies_linked_data/models/ontology_submission.rb:549:in `block in loop_classes'
    /usr/local/rbenv/versions/2.5.6/lib/ruby/2.5.0/benchmark.rb:308:in `realtime'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.5.0/bundler/gems/ontologies_linked_data-57e648194224/lib/ontologies_linked_data/models/ontology_submission.rb:531:in `loop_classes'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.5.0/bundler/gems/ontologies_linked_data-57e648194224/lib/ontologies_linked_data/models/ontology_submission.rb:1002:in `process_submission'
    /srv/ncbo/ncbo_cron/lib/ncbo_cron/ontology_submission_parser.rb:177:in `process_submission'
    /srv/ncbo/ncbo_cron/lib/ncbo_cron/ontology_submission_parser.rb:47:in `block in process_queue_submissions'
    /srv/ncbo/ncbo_cron/lib/ncbo_cron/ontology_submission_parser.rb:41:in `each'
    /srv/ncbo/ncbo_cron/lib/ncbo_cron/ontology_submission_parser.rb:41:in `process_queue_submissions'
    /srv/ncbo/ncbo_cron/bin/ncbo_cron:240:in `block (3 levels) in <main>'
    /srv/ncbo/ncbo_cron/lib/ncbo_cron/scheduler.rb:65:in `block (3 levels) in scheduled_locking_job'
    /srv/ncbo/ncbo_cron/lib/ncbo_cron/scheduler.rb:51:in `fork'
    /srv/ncbo/ncbo_cron/lib/ncbo_cron/scheduler.rb:51:in `block (2 levels) in scheduled_locking_job'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.5.0/gems/mlanett-redis-lock-0.2.7/lib/redis-lock.rb:43:in `lock'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.5.0/gems/mlanett-redis-lock-0.2.7/lib/redis-lock.rb:234:in `lock'
    /srv/ncbo/ncbo_cron/lib/ncbo_cron/scheduler.rb:50:in `block in scheduled_locking_job'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.5.0/gems/rufus-scheduler-2.0.24/lib/rufus/sc/jobs.rb:230:in `trigger_block'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.5.0/gems/rufus-scheduler-2.0.24/lib/rufus/sc/jobs.rb:204:in `block in trigger'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.5.0/gems/rufus-scheduler-2.0.24/lib/rufus/sc/scheduler.rb:430:in `block in trigger_job'"]
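
The rapper error above points at line 7 of the temporary file. One way to see what is actually on that line is to print it with a little surrounding context. The following is a minimal sketch, not part of the BioPortal code base: the class name `LineContext` and its arguments are made up for illustration, and it assumes the temporary file still exists on disk (it may be cleaned up once the job finishes).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: prints the lines around a line number reported by a parser.
public class LineContext {

    /**
     * Returns the lines within `radius` of the 1-based `lineNumber`,
     * each prefixed with its own line number. The reported line is marked with ">>".
     */
    public static List<String> context(List<String> lines, int lineNumber, int radius) {
        List<String> out = new ArrayList<>();
        int first = Math.max(1, lineNumber - radius);
        int last = Math.min(lines.size(), lineNumber + radius);
        for (int n = first; n <= last; n++) {
            String marker = (n == lineNumber) ? ">> " : "   ";
            out.add(marker + n + ": " + lines.get(n - 1));
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java LineContext <file> <line> [radius]");
            return;
        }
        Path file = Paths.get(args[0]);
        int lineNumber = Integer.parseInt(args[1]);
        int radius = args.length > 2 ? Integer.parseInt(args[2]) : 2;
        List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
        context(lines, lineNumber, radius).forEach(System.out::println);
    }
}
```

For the failure above, this would be run with the temporary file path from the log and `7` as the line number.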

jvendetti commented 4 years ago

As an initial troubleshooting step, I attempted to open the ontology in Protege (v 5.5.0). The Protege log file displayed many axiom transform errors:

ERROR  16:29:04  Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {<http://www.w3.org/2008/05/skos#prefLabel>=<http://www.w3.org/2008/05/skos#prefLabel>, <http://www.w3.org/2008/05/skos#hiddenLabel>=<http://www.w3.org/2008/05/skos#hiddenLabel>, <http://www.w3.org/2008/05/skos#definition>=<http://www.w3.org/2008/05/skos#definition>, <http://www.w3.org/2008/05/skos#altLabel>=<http://www.w3.org/2008/05/skos#altLabel>}, axiom: AnnotationAssertion(<http://www.w3.org/2008/05/skos#definition> <http://www.w3.org/2008/05/skos#historyNote> "A note about the past state/use/meaning of a concept."@en), error: uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplPlain cannot be cast to org.semanticweb.owlapi.model.IRI

[Image: Screenshot 2019-10-08 16 30 44]

I contacted the author and asked if he could resolve the errors and resubmit. The author replied that he had fixed the errors and resubmitted the ontology. The resubmission still failed to process, so I downloaded the new version and found that the errors were still present in Protege.

I contacted the author a second time and asked him to send me a copy of the "fixed" ontology privately, in case something had gone wrong with the resubmission. I received a response with a private copy of the ontology, but again, the errors were still present in Protege.

I then corresponded with Rafael from the Protege team to see if he could tell me the source of the errors in Protege. His response:

I think the problem is that the ontology imports the skos and foaf vocabularies explicitly. Generally, there is no need to do that. One can just use whatever property from skos (e.g., prefLabel) or from foaf. By removing those 2 imported ontologies, the ModSci ontology loads without errors/warnings in Protege. Same way one doesn’t have to import RDFS vocabulary to use rdfs:label.

I removed the skos and foaf vocabulary imports as Rafael recommended and uploaded the modified version. However, the ontology still fails to process with the same exception as above.
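
To confirm which imports a given copy of the ontology still declares, the RDF/XML can be scanned for `owl:imports` elements. The sketch below is illustrative only: the class name `ListOwlImports` is invented, it uses the JDK's DOM parser rather than the OWL API, and it assumes each import is written as an `owl:imports` element with an `rdf:resource` attribute (other RDF/XML spellings of the same triple would be missed).

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: lists the targets of owl:imports declarations in an RDF/XML file.
public class ListOwlImports {
    static final String OWL_NS = "http://www.w3.org/2002/07/owl#";
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    /** Returns the rdf:resource value of every owl:imports element, in document order. */
    public static List<String> imports(InputStream rdfXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the namespace-based lookups below
            Document doc = factory.newDocumentBuilder().parse(rdfXml);
            NodeList nodes = doc.getElementsByTagNameNS(OWL_NS, "imports");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element element = (Element) nodes.item(i);
                String target = element.getAttributeNS(RDF_NS, "resource");
                if (!target.isEmpty()) {
                    result.add(target);
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read RDF/XML input", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // The file name is an assumption; pass the path of the ontology to inspect.
        String path = args.length > 0 ? args[0] : "owlapi.xrdf";
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            imports(in).forEach(System.out::println);
        }
    }
}
```

An empty result for the modified file would suggest the skos and foaf imports really are gone, which would point away from the imports as the cause of the remaining failure.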

jvendetti commented 4 years ago

More troubleshooting: the rapper RDF parsing utility reports no errors converting from RDF/XML format to N-Triples:

[ncbo-deployer@ncbo-prd-app-31 3]$ pwd
/srv/ncbo/repository/MODSCI/3
[ncbo-deployer@ncbo-prd-app-31 3]$ rapper -i rdfxml -o ntriples owlapi.xrdf > data.triples
rapper: Parsing URI file:///srv/ncbo/share/env/production/repository/MODSCI/3/owlapi.xrdf with parser rdfxml
rapper: Serializing with serializer ntriples
rapper: Parsing returned 1132 triples
[ncbo-deployer@ncbo-prd-app-31 3]$ rapper -i ntriples -c data.triples
rapper: Parsing URI file:///srv/ncbo/share/env/production/repository/MODSCI/3/data.triples with parser ntriples
rapper: Parsing returned 1132 triples
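
Note that the failing step in the stack trace ran rapper with the turtle parser, while the check above used the ntriples parser. A possible follow-up is to repeat the count with `-i turtle` on the same file. The sketch below only wraps that command line; the class name `RapperTurtleCheck` is invented, it assumes `rapper` is on the PATH, and whether this reproduces the production failure is untested.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper: re-runs rapper in count mode with a chosen input parser.
public class RapperTurtleCheck {

    /** Builds the rapper command line: -i selects the parser, -c counts triples without serializing. */
    public static List<String> command(String parser, String file) {
        return Arrays.asList("rapper", "-i", parser, "-c", file);
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Default file name taken from the shell session above.
        String file = args.length > 0 ? args[0] : "data.triples";
        Process process = new ProcessBuilder(command("turtle", file))
                .inheritIO() // show rapper's own messages directly
                .start();
        System.exit(process.waitFor());
    }
}
```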

jvendetti commented 4 years ago

More troubleshooting: wrote a snippet of OWL API code to convert the RDF/XML to N-Triples format. Just like the rapper utility, the OWL API throws no errors during the conversion.

// Load the RDF/XML file (owlapi.xrdf) with the OWL API.
File xrdfFile = new File("/Users/jvendetti/Development/Examples/ontologies/modsci/owlapi.xrdf");
OWLOntologyManager manager = OWLManager.createOWLOntologyManager();
OWLOntology ontology = manager.loadOntologyFromOntologyDocument(xrdfFile);
// Re-serialize the loaded ontology as N-Triples next to the input file.
File nTriplesFile = new File("/Users/jvendetti/Development/Examples/ontologies/modsci/owlapi.ntriples");
manager.saveOntology(ontology, new NTriplesDocumentFormat(), IRI.create(nTriplesFile));