ncbo / bioportal-project

Serves to consolidate (in Zenhub) all public issues in BioPortal
BSD 2-Clause "Simplified" License

HUSAT ontology didn't complete processing #247

graybeal closed 2 years ago

graybeal commented 2 years ago

HUSAT didn't process properly last night. It looks like the ontology parsed OK but something else went south with process or file (b)locking. Details from log below.

# Logfile created on 2022-06-21 11:19:06 -0700 by logger.rb/v1.5.1
I, [2022-06-21T11:19:06.329631 #14611]  INFO -- : ["Starting to process http://data.bioontology.org/ontologies/HUSAT/submissions/1"]
I, [2022-06-21T11:19:06.356073 #14611]  INFO -- : ["Starting to process HUSAT/submissions/1"]
I, [2022-06-21T11:19:06.502218 #14611]  INFO -- : ["Java call [java -DentityExpansionLimit=2500000 -Xmx10240M -jar /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.7.0/bundler/gems/ontologies_linked_data-4b4211cc7c61/bin/owlapi-wrapper-1.3.8.jar -m /srv/ncbo/repository/HUSAT/1/samples-voc-additions.ttl -o /srv/ncbo/repository/HUSAT/1 -r true]"]
I, [2022-06-21T11:19:07.800500 #14611]  INFO -- : ["2022-06-21T11:19:06 [main] INFO  o.s.n.o.OntologyParserCommand - Parsing invocation with values: ParserInvocation [inputRepositoryFolder=null, outputRepositoryFolder=/srv/ncbo/repository/HUSAT/1, masterFileName=/srv/ncbo/repository/HUSAT/1/samples-voc-additions.ttl, invocationId=0, parserLog=, userReasoner= true]

2022-06-21T11:19:06 [main] INFO  o.s.ncbo.oapiwrapper.OntologyParser - executor ...

2022-06-21T11:19:07 [main] INFO  o.s.ncbo.oapiwrapper.OntologyParser - Input repository folder is null. Unique file being parsed.

2022-06-21T11:19:07 [main] DEBUG o.e.rdf4j.rio.RDFParserRegistry - Registered service class org.eclipse.rdf4j.rio.binary.BinaryRDFParserFactory

2022-06-21T11:19:07 [main] DEBUG o.e.rdf4j.rio.RDFParserRegistry - Registered service class org.eclipse.rdf4j.rio.n3.N3ParserFactory

2022-06-21T11:19:07 [main] DEBUG o.e.rdf4j.rio.RDFParserRegistry - Registered service class org.eclipse.rdf4j.rio.nquads.NQuadsParserFactory

2022-06-21T11:19:07 [main] DEBUG o.e.rdf4j.rio.RDFParserRegistry - Registered service class org.eclipse.rdf4j.rio.ntriples.NTriplesParserFactory

2022-06-21T11:19:07 [main] DEBUG o.e.rdf4j.rio.RDFParserRegistry - Registered service class org.eclipse.rdf4j.rio.rdfjson.RDFJSONParserFactory

2022-06-21T11:19:07 [main] DEBUG o.e.rdf4j.rio.RDFParserRegistry - Registered service class org.eclipse.rdf4j.rio.jsonld.JSONLDParserFactory

2022-06-21T11:19:07 [main] DEBUG o.e.rdf4j.rio.RDFParserRegistry - Registered service class org.eclipse.rdf4j.rio.rdfxml.RDFXMLParserFactory

2022-06-21T11:19:07 [main] DEBUG o.e.rdf4j.rio.RDFParserRegistry - Registered service class org.eclipse.rdf4j.rio.trix.TriXParserFactory

2022-06-21T11:19:07 [main] DEBUG o.e.rdf4j.rio.RDFParserRegistry - Registered service class org.eclipse.rdf4j.rio.turtle.TurtleParserFactory

2022-06-21T11:19:07 [main] DEBUG o.e.rdf4j.rio.RDFParserRegistry - Registered service class org.eclipse.rdf4j.rio.trig.TriGParserFactory

2022-06-21T11:19:07 [main] DEBUG o.e.r.rio.DatatypeHandlerRegistry - Registered service class org.eclipse.rdf4j.rio.datatypes.XMLSchemaDatatypeHandler

2022-06-21T11:19:07 [main] DEBUG o.e.r.rio.DatatypeHandlerRegistry - Registered service class org.eclipse.rdf4j.rio.datatypes.RDFDatatypeHandler

2022-06-21T11:19:07 [main] DEBUG o.e.r.rio.DatatypeHandlerRegistry - Registered service class org.eclipse.rdf4j.rio.datatypes.DBPediaDatatypeHandler

2022-06-21T11:19:07 [main] DEBUG o.e.r.rio.DatatypeHandlerRegistry - Registered service class org.eclipse.rdf4j.rio.datatypes.VirtuosoGeometryDatatypeHandler

2022-06-21T11:19:07 [main] DEBUG o.e.r.rio.DatatypeHandlerRegistry - Registered service class org.eclipse.rdf4j.rio.datatypes.GeoSPARQLDatatypeHandler

2022-06-21T11:19:07 [main] DEBUG o.e.r.rio.LanguageHandlerRegistry - Registered service class org.eclipse.rdf4j.rio.languages.RFC3066LanguageHandler

2022-06-21T11:19:07 [main] DEBUG o.e.r.rio.LanguageHandlerRegistry - Registered service class org.eclipse.rdf4j.rio.languages.BCP47LanguageHandler

2022-06-21T11:19:07 [main] INFO  o.s.ncbo.oapiwrapper.OntologyMetrics - Calculating metrics for /srv/ncbo/repository/HUSAT/1/samples-voc-additions.ttl

2022-06-21T11:19:07 [main] INFO  o.s.ncbo.oapiwrapper.OntologyMetrics - Finished metrics calculation for /srv/ncbo/repository/HUSAT/1/samples-voc-additions.ttl in 1 milliseconds

2022-06-21T11:19:07 [main] INFO  o.s.ncbo.oapiwrapper.OntologyMetrics - Generated metrics CSV file for /srv/ncbo/repository/HUSAT/1/samples-voc-additions.ttl

2022-06-21T11:19:07 [main] INFO  o.s.ncbo.oapiwrapper.OntologyParser - Ontology document format: org.semanticweb.owlapi.formats.OBODocumentFormat

2022-06-21T11:19:07 [main] INFO  o.s.ncbo.oapiwrapper.OntologyParser - isPrefixOWLOntologyFormat: false

2022-06-21T11:19:07 [main] INFO  o.s.ncbo.oapiwrapper.OntologyParser - isOBO: true

2022-06-21T11:19:07 [main] INFO  o.s.ncbo.oapiwrapper.OntologyParser - Serializing ontology in RDF ...

2022-06-21T11:19:07 [main] INFO  o.s.ncbo.oapiwrapper.OntologyParser - Serialization done!

2022-06-21T11:19:07 [main] INFO  o.s.n.o.OntologyParserCommand - Parse result: true

2022-06-21T11:19:07 [main] INFO  o.s.n.o.OntologyParserCommand - Output triples in: {}/srv/ncbo/repository/HUSAT/1/owlapi.xrdf

2022-06-21T11:19:07 [main] INFO  o.s.n.o.OntologyParserCommand - Finished parsing!
"]
I, [2022-06-21T11:19:07.800701 #14611]  INFO -- : ["OWLAPI Java command: parsing finished successfully."]
I, [2022-06-21T11:19:07.804790 #14611]  INFO -- : ["Output size 51556 in `/srv/ncbo/repository/HUSAT/1/owlapi.xrdf`"]
E, [2022-06-21T11:19:08.063118 #14611] ERROR -- : ["RestClient::BadRequest: 400 Bad Request
/srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.7.0/gems/rest-client-2.1.0/lib/restclient/abstract_response.rb:249:in `exception_with_response'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.7.0/gems/rest-client-2.1.0/lib/restclient/abstract_response.rb:129:in `return!'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.7.0/gems/rest-client-2.1.0/lib/restclient/request.rb:836:in `process_result'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.7.0/gems/rest-client-2.1.0/lib/restclient/request.rb:743:in `block in transmit'
    /usr/local/rbenv/versions/2.7.6/lib/ruby/2.7.0/net/http.rb:933:in `start'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.7.0/gems/rest-client-2.1.0/lib/restclient/request.rb:727:in `transmit'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.7.0/gems/rest-client-2.1.0/lib/restclient/request.rb:163:in `execute'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.7.0/gems/rest-client-2.1.0/lib/restclient/request.rb:63:in `execute'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.7.0/bundler/gems/goo-b7874b1fd4a3/lib/goo/sparql/client.rb:116:in `append_triples_no_bnodes'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.7.0/bundler/gems/goo-b7874b1fd4a3/lib/goo/sparql/client.rb:141:in `put_triples'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.7.0/bundler/gems/ontologies_linked_data-4b4211cc7c61/lib/ontologies_linked_data/models/ontology_submission.rb:1536:in `delete_and_append'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.7.0/bundler/gems/ontologies_linked_data-4b4211cc7c61/lib/ontologies_linked_data/models/ontology_submission.rb:475:in `generate_rdf'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.7.0/bundler/gems/ontologies_linked_data-4b4211cc7c61/lib/ontologies_linked_data/models/ontology_submission.rb:973:in `process_submission'
    /srv/ncbo/ncbo_cron/lib/ncbo_cron/ontology_submission_parser.rb:177:in `process_submission'
    /srv/ncbo/ncbo_cron/lib/ncbo_cron/ontology_submission_parser.rb:47:in `block in process_queue_submissions'
    /srv/ncbo/ncbo_cron/lib/ncbo_cron/ontology_submission_parser.rb:41:in `each'
    /srv/ncbo/ncbo_cron/lib/ncbo_cron/ontology_submission_parser.rb:41:in `process_queue_submissions'
    /srv/ncbo/ncbo_cron/bin/ncbo_cron:270:in `block (3 levels) in <main>'
    /srv/ncbo/ncbo_cron/lib/ncbo_cron/scheduler.rb:65:in `block (3 levels) in scheduled_locking_job'
    /srv/ncbo/ncbo_cron/lib/ncbo_cron/scheduler.rb:51:in `fork'
    /srv/ncbo/ncbo_cron/lib/ncbo_cron/scheduler.rb:51:in `block (2 levels) in scheduled_locking_job'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.7.0/gems/mlanett-redis-lock-0.2.7/lib/redis-lock.rb:43:in `lock'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.7.0/gems/mlanett-redis-lock-0.2.7/lib/redis-lock.rb:234:in `lock'
    /srv/ncbo/ncbo_cron/lib/ncbo_cron/scheduler.rb:50:in `block in scheduled_locking_job'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.7.0/gems/rufus-scheduler-2.0.24/lib/rufus/sc/jobs.rb:230:in `trigger_block'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.7.0/gems/rufus-scheduler-2.0.24/lib/rufus/sc/jobs.rb:204:in `block in trigger'
    /srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.7.0/gems/rufus-scheduler-2.0.24/lib/rufus/sc/scheduler.rb:430:in `block in trigger_job'"]

I've asked the submitter to resubmit with a slightly improved version, as a possible fast way to resolve the problem (or get more data on it). Will notify if that doesn't work…

jvendetti commented 2 years ago

This error looks similar to what I see when the OWL API successfully loads an ontology, but the triplestore won't accept the triples. I'm not a SKOS expert, but it's not evident to me that you can declare concepts using this syntax:

<https://purl.org/hubmapvoc/samples-voc-additions/-80Celsius-Cryotube> a skos:Concept;
...

Using the rapper command line utility shows:

$ pwd
/srv/ncbo/repository/HUSAT/1

$ rapper -i rdfxml -o ntriples owlapi.xrdf > data.triples
rapper: Parsing URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/owlapi.xrdf with parser rdfxml
rapper: Serializing with serializer ntriples
rapper: Parsing returned 331 triples

$ rapper -i ntriples -c data.triples
rapper: Parsing URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples with parser ntriples
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:128 column 61 - URI error - illegal Unicode escape \u003C in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:129 column 61 - URI error - illegal Unicode escape \u003C in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:130 column 61 - URI error - illegal Unicode escape \u003C in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:131 column 61 - URI error - illegal Unicode escape \u003C in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:132 column 61 - URI error - illegal Unicode escape \u003C in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:133 column 61 - URI error - illegal Unicode escape \u003C in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:134 column 61 - URI error - illegal Unicode escape \u003C in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:135 column 61 - URI error - illegal Unicode escape \u003C in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:159 column 52 - URI error - illegal Unicode escape \u003C in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:160 column 59 - URI error - illegal Unicode escape \u0020 in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:161 column 59 - URI error - illegal Unicode escape \u0020 in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:162 column 59 - URI error - illegal Unicode escape \u0020 in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:163 column 59 - URI error - illegal Unicode escape \u0020 in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:164 column 59 - URI error - illegal Unicode escape \u0020 in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:165 column 59 - URI error - illegal Unicode escape \u0020 in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:166 column 59 - URI error - illegal Unicode escape \u0020 in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:167 column 59 - URI error - illegal Unicode escape \u0020 in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:168 column 59 - URI error - illegal Unicode escape \u0020 in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:169 column 59 - URI error - illegal Unicode escape \u0020 in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:170 column 59 - URI error - illegal Unicode escape \u0020 in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:171 column 59 - URI error - illegal Unicode escape \u0020 in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:172 column 59 - URI error - illegal Unicode escape \u0020 in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:173 column 59 - URI error - illegal Unicode escape \u0020 in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:174 column 59 - URI error - illegal Unicode escape \u0020 in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:175 column 59 - URI error - illegal Unicode escape \u0020 in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:176 column 59 - URI error - illegal Unicode escape \u0020 in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:177 column 59 - URI error - illegal Unicode escape \u0020 in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:178 column 59 - URI error - illegal Unicode escape \u0020 in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:305 column 90 - URI error - illegal Unicode escape \u003C in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:306 column 90 - URI error - illegal Unicode escape \u003C in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:307 column 90 - URI error - illegal Unicode escape \u003C in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:308 column 90 - URI error - illegal Unicode escape \u003C in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:309 column 90 - URI error - illegal Unicode escape \u003C in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:310 column 90 - URI error - illegal Unicode escape \u003C in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:311 column 90 - URI error - illegal Unicode escape \u003C in URI.
rapper: Error - URI file:///srv/ncbo/share/env/production/repository/HUSAT/1/data.triples:312 column 90 - URI error - illegal Unicode escape \u003C in URI.
rapper: Parsing returned 331 triples
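
The same kind of check can be approximated without rapper. Below is a minimal sketch, assuming the N-Triples file escapes illegal IRI characters as \uXXXX (as the rapper output above suggests). The function names are hypothetical, and the forbidden-character set follows the IRIREF production of the N-Triples grammar rather than anything specific to BioPortal or the triplestore.

```python
import re

# Characters excluded by the N-Triples IRIREF production:
# control characters and space (U+0000 to U+0020) plus < > " { } | ^ ` \
FORBIDDEN = set('<>"{}|^`\\') | {chr(c) for c in range(0x21)}

# Simplification: treats every <...> span as an IRI, so a literal that
# happens to contain angle brackets could produce a false positive.
IRI_RE = re.compile(r'<([^>]*)>')
UCHAR_RE = re.compile(r'\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})')


def decode_uchar(text):
    """Turn \\uXXXX and \\UXXXXXXXX escapes back into the characters they encode."""
    return UCHAR_RE.sub(lambda m: chr(int(m.group(1) or m.group(2), 16)), text)


def find_bad_iris(lines):
    """Return (line number, raw IRI, offending characters) for each suspect IRI."""
    problems = []
    for lineno, line in enumerate(lines, start=1):
        for match in IRI_RE.finditer(line):
            decoded = decode_uchar(match.group(1))
            bad = sorted({ch for ch in decoded if ch in FORBIDDEN})
            if bad:
                problems.append((lineno, match.group(1), bad))
    return problems
```

Calling find_bad_iris(open("data.triples")) would list the lines whose IRIs decode to a space, "<", "^" and similar characters, which is roughly what the rapper errors point at.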

graybeal commented 2 years ago

Oh hey, you beat me to it! Yes, I think the problem wasn't the format of that line but the % (and/or ^) characters in the IRIs. I forgot to send her a modification to get rid of those before I left, and when we resubmitted with those eliminated it submitted just fine.
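
For anyone hitting the same thing: the fix here was simply removing those characters from the source file. An alternative, sketched below under the assumption that keeping the original label recoverable matters, is to percent-encode the local part of each IRI before submitting. The function names and the example namespace are hypothetical. Note that this changes the identifiers, so it only makes sense before anything links to them.

```python
from urllib.parse import quote


def clean_local_name(local_name):
    """Percent-encode everything except unreserved characters (letters, digits, - . _ ~).

    A literal '%' is encoded too (as %25), so do not run this on names
    that are already percent-encoded.
    """
    return quote(local_name, safe="")


def build_iri(namespace, local_name):
    """Join a namespace IRI and a cleaned local name."""
    return namespace + clean_local_name(local_name)


# Hypothetical example: a label containing '<', a space, and '^'
print(build_iri("https://example.org/voc/", "<-80 Celsius^"))
```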

Protege is so forgiving it didn't care, but there was one error that I think was unrelated. Protege log for the original file is copied below. If

Cannot generate ontology catalog for ontology at http://www.semanticweb.org/jgraybeal/ontologies/2022/5/27/untitled-ontology-151. URI scheme is not "file"

is a 'happens all the time' error, then Protege didn't see any issues with this file at all.

INFO  14:16:45  ------------------------------------ Protege -----------------------------------
   INFO  14:16:45  Protege Desktop
   INFO  14:16:45  Version 5.5.0, Build 
   INFO  14:16:45  
   INFO  14:16:45  
   INFO  14:16:45  ----------------------------------- Platform -----------------------------------
   INFO  14:16:45  Java: JVM 1.8.0_121-b13  Memory: 2796M
   INFO  14:16:45  Language: en, Country: US
   INFO  14:16:45  Framework: Apache Software Foundation (1.8) 
   INFO  14:16:45  OS: macosx (10.16)
   INFO  14:16:45  Processor: x86-64

   INFO  14:16:45  
   INFO  14:16:45  ------------------------------------ Plugins -----------------------------------
   INFO  14:16:45  Plugin: OWLAPI RDF Library (3.0.0)
   INFO  14:16:45  Plugin: SPARQL Query Plugin (3.0.0)
   INFO  14:16:45  Plugin: Existential Query (2.0.0)
   INFO  14:16:45  Plugin: Explanation Workbench (3.0.0)
   INFO  14:16:45  Plugin: OntoGraf (2.0.3)
   INFO  14:16:45  Plugin: Browser View (OWLDoc) (3.0.3)
   INFO  14:16:45  Plugin: OWLViz (5.0.3)
   INFO  14:16:45  Plugin: HermiT Reasoner (1.4.3.456)
   INFO  14:16:45  Plugin: Cellfie Protege 5.0+ Plugin (2.1.0)
   INFO  14:16:45  Plugin: DL Query (4.0.1)
   INFO  14:16:45  Plugin: SWRLTab Protege 5.0+ Plugin (2.0.6)
   INFO  14:16:45  Plugin: OWL Code Generation Plug-in (2.0.0)
   INFO  14:16:45  
   INFO  14:16:45  Creating and setting up empty (default) editor kit
   INFO  14:16:45  Received request to edit document at file:/Users/jgraybeal/Downloads/samples-voc-additions.ttl
   INFO  14:16:45  Application is initialized.  Opening URI.
   INFO  14:16:45  Creating and setting up (default) editor kit for file:/Users/jgraybeal/Downloads/samples-voc-additions.ttl
   INFO  14:16:46  OWL API Version: 4.5.9.2019-02-01T07:24:44Z
   INFO  14:16:46  Cannot generate ontology catalog for ontology at http://www.semanticweb.org/jgraybeal/ontologies/2022/5/27/untitled-ontology-151. URI scheme is not "file"
   INFO  14:16:47  ------------------------------- Auto-update Check ------------------------------
   INFO  14:16:47  OWL API Version: 4.5.9.2019-02-01T07:24:44Z
   INFO  14:16:47  Auto-update is disabled
   INFO  14:16:47  
   INFO  14:16:47  ------------------------------- Loading Ontology -------------------------------
   INFO  14:16:47  Loading ontology from file:/Users/jgraybeal/Downloads/samples-voc-additions.ttl
   INFO  14:16:47  Finished loading file:/Users/jgraybeal/Downloads/samples-voc-additions.ttl
   INFO  14:16:47  Loading for ontology and imports closure successfully completed in 370 ms
   INFO  14:16:47  Updated document format class from: org.semanticweb.owlapi.formats.RioTurtleDocumentFormat to: org.semanticweb.owlapi.formats.TurtleDocumentFormat
   INFO  14:16:47  
   INFO  14:16:47  ---------------------------- Disposing of Workspace ----------------------------
   INFO  14:16:47  Saved tab state for 'DL Query' tab
   INFO  14:16:47  Saved tab state for 'Active ontology' tab
   INFO  14:16:47  Saved tab state for 'Entities' tab
   INFO  14:16:47  Saved tab state for 'Individuals by class' tab
   INFO  14:16:47  Saved workspace
   INFO  14:16:47  Disposed of 'DL Query' tab
   INFO  14:16:47  Disposed of 'Active ontology' tab
   INFO  14:16:47  Disposed of 'Entities' tab
   INFO  14:16:47  Disposed of 'Individuals by class' tab
   INFO  14:16:47  Disposed of workspace
   INFO  14:16:47  

graybeal commented 2 years ago

There were some really interesting caching issues I was seeing while working on this submission, but I'll see if I can find a caching ticket to report them to.

graybeal commented 2 years ago

Closed because second submission of a better-formatted file fixed the issue.