INCATools / semantic-sql

SQL and SQLite builds of OWL ontologies
https://incatools.github.io/semantic-sql/
BSD 3-Clause "New" or "Revised" License

Sometimes creates `X.db.tmp` but not `X.db` #67

Closed: joeflack4 closed this issue 1 year ago

joeflack4 commented 1 year ago

Overview

I'm trying to get mondo-ingest to handle creation of semsql databases, but I get an error towards the end of the conversion.

Errors

Here's an example of what happened when I tried to do the conversion for ORDO.

Short err

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: RdfXmlError { kind: Xml(EscapeError(UnrecognizedSymbol(1..4, Ok("xsd")))) }', src/main.rs:102:8

Long err

Below is the relevant part of the overall stacktrace, with RUST_BACKTRACE=1. Here is the full stacktrace of the make goal with RUST_BACKTRACE=full: stacktrace.txt

cp .template.db mirror/ordo.db.tmp && \
rdftab mirror/ordo.db.tmp < mirror/ordo.owl && \
sqlite3 mirror/ordo.db.tmp -cmd '.separator "\t"' ".import mirror/ordo-relation-graph.tsv entailed_edge" && \
gzip -f mirror/ordo-relation-graph.tsv && \
cat /usr/local/lib/python3.10/dist-packages/semsql/builder//indexes/*.sql | sqlite3 mirror/ordo.db.tmp && \
mv mirror/ordo.db.tmp mirror/ordo.db
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: RdfXmlError { kind: Xml(EscapeError(UnrecognizedSymbol(1..4, Ok("xsd")))) }', src/main.rs:102:8

stack backtrace:
   0: rust_begin_unwind
             at /build/rustc-ZOqcvC/rustc-1.61.0+dfsg1/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at /build/rustc-ZOqcvC/rustc-1.61.0+dfsg1/library/core/src/panicking.rs:143:14
   2: core::result::unwrap_failed
             at /build/rustc-ZOqcvC/rustc-1.61.0+dfsg1/library/core/src/result.rs:1785:5
   3: rdftab::main

Reproducibility

ORDO chosen here arbitrarily.

  1. Have Docker running and an up-to-date obolibrary/odkfull:dev image.
  2. On an up-to-date main branch of a mondo-ingest clone, from src/ontology/, run: docker run -w /work -v PATH/TO/mondo-ingest/src/ontology:/work --rm obolibrary/odkfull:dev semsql make mirror/ordo.db

Additional information

Related: https://github.com/monarch-initiative/mondo-ingest/issues/136

Looks like this error is occurring in rdftab, but I'm not sure why.

In mondo-ingest, this problem happens for the following ontologies:

Doesn't happen for these:

cmungall commented 1 year ago

This isn't an issue with semsql. Your RDF/XML is invalid. Unfortunately, the error reporting is not helpful. Try validating it with Jena. See Jim's advice here: https://github.com/INCATools/ontology-development-kit/issues/691
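
As a quick first look before running a full validator, the sketch below scans a file's text for named entity references other than the five predefined XML ones. It assumes that `UnrecognizedSymbol(1..4, Ok("xsd"))` in the panic message refers to an `&xsd;` entity reference the parser could not resolve; that reading is a guess from the error text, not something confirmed in this thread. The function name is hypothetical, and this is a rough regex scan, not a substitute for a real RDF/XML validator.

```python
import re

# The five entities every XML parser must recognise.
PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Named references such as &xsd; (numeric ones like &#10; are not matched).
ENTITY_REF = re.compile(r"&([A-Za-z_][\w.-]*);")

# Declarations such as <!ENTITY xsd "http://..."> in a DOCTYPE block.
ENTITY_DECL = re.compile(r"<!ENTITY\s+([A-Za-z_][\w.-]*)\s")


def custom_entity_refs(text):
    """Split non-predefined entity references by whether the file declares them."""
    declared = set(ENTITY_DECL.findall(text))
    used = set(ENTITY_REF.findall(text)) - PREDEFINED
    return {"declared": used & declared, "undeclared": used - declared}
```

For example, `custom_entity_refs(open("mirror/ordo.owl", encoding="utf-8").read())` lists the entity names used in the file. Any name under `"undeclared"` is a likely culprit. Names under `"declared"` may still matter if the parser in use does not expand DOCTYPE entity declarations.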

joeflack4 commented 1 year ago

@cmungall Thanks Chris, I will give it a look.

There was something near the start of the trace regarding something else erroneous/invalid; I forgot to highlight that. I'm guessing that is probably what's tripping it up.