Define system capabilities

bkuczenski commented 5 years ago

The triple-store repo is supposed to generate (from scratch?) an operable triple store. From a reproducibility perspective, we want to know how to tell whether the outcome was successful.

What should the triple store contain when it's finished?
What queries should it support?
What should we expect the results to be?

In what other ways could we validate the work?

tmillross commented 5 years ago

Here's my attempt at answering those 3...

What should the triple store contain when it's finished?

By the end of the hackathon, the Bonsai triple store should contain all RDF-style data that should be persisted after the event.

That includes the following datasets, each following the Bonsai schema(s):

Exiobase hybrid tables
ENTSO data
...other?

Each of these datasets should have been validated against the ontology. This validation should be automated, and should occur before the data is loaded into the database instance.
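To make the "validate before load" idea concrete, here is a minimal sketch in plain Python. It is only an illustration of the gate, not the actual Bonsai validation: the predicate whitelist, the `ex:` names, and the `loader` callable are all hypothetical stand-ins, and a real pipeline would more likely check the data with proper RDF tooling against the ontology itself.

```python
from typing import Callable, Iterable, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical whitelist standing in for the predicates the ontology defines.
ALLOWED_PREDICATES = {"rdf:type", "ex:hasInput", "ex:hasOutput"}


def validate_triples(triples: Iterable[Triple]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for i, (subject, predicate, obj) in enumerate(triples):
        if not subject or not obj:
            errors.append(f"triple {i}: empty subject or object")
        if predicate not in ALLOWED_PREDICATES:
            errors.append(f"triple {i}: unknown predicate {predicate!r}")
    return errors


def load_if_valid(triples: Iterable[Triple],
                  loader: Callable[[List[Triple]], None]) -> bool:
    """Call loader only when validation passes; report success as a bool."""
    batch = list(triples)
    if validate_triples(batch):
        return False  # nothing is loaded when any check fails
    loader(batch)
    return True
```

The point of the design is that the loader is never called on a batch that fails any check, so a bad dataset cannot partially land in the database instance.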

What queries should it support?

By default, the Jena triple store supports any valid SPARQL query.

Specific queries should be written and tested. These are based on the competency questions discussed in this thread and gathered/summarised here. The queries which have been written are here.

These queries will form a part of our testing pipeline for the database.
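As a rough sketch of how one such query could be wired into a test, the snippet below builds a SPARQL Protocol GET request and flattens a SPARQL JSON result, using only the Python standard library. The endpoint URL and the example query are placeholders I made up for illustration (they are not taken from the repo), and nothing is sent over the network here.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, List

# Placeholder endpoint; the real dataset name and port are assumptions.
ENDPOINT = "http://localhost:3030/bonsai/sparql"

# Placeholder query, not one of the competency-question queries.
EXAMPLE_QUERY = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5"


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_bindings(json_text: str) -> List[Dict[str, str]]:
    """Flatten a SPARQL JSON result into one {variable: value} dict per row."""
    doc = json.loads(json_text)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in doc["results"]["bindings"]
    ]
```

A test could pass the request to `urllib.request.urlopen`, feed the response body to `parse_bindings`, and assert on the rows that come back.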

What should we expect the results to be?

The tests should be designed to give the same answers as those you would receive from the original datasource (e.g. Exiobase).
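One way this comparison might look, as a hedged sketch: collect the values returned by the triple store and the values read from the original source into two dictionaries, then report every disagreement. The function name, the keys, and the tolerance are hypothetical, and any numbers used with it would need to come from the real datasource.

```python
import math
from typing import Dict, List


def compare_to_source(store: Dict[str, float],
                      source: Dict[str, float],
                      rel_tol: float = 1e-9) -> List[str]:
    """List every source value that the triple store fails to reproduce."""
    mismatches = []
    for key, expected in sorted(source.items()):
        if key not in store:
            mismatches.append(f"{key}: missing from triple store result")
        elif not math.isclose(store[key], expected, rel_tol=rel_tol):
            mismatches.append(f"{key}: expected {expected}, got {store[key]}")
    return mismatches
```

An empty list means the store reproduced every source value within the tolerance; a relative tolerance is used instead of strict equality because floating-point values may change slightly during conversion.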

mmr2187 commented 5 years ago

I can start drafting the SPARQL queries and then revise them as the ontology evolves. I'll start by looking back through the thread.

tmillross commented 5 years ago

@mmr2187 awesome work on the competency questions! I took the Word doc you shared and converted it into a Markdown file in the repo root, plus a sub-folder containing the SPARQL queries you wrote.

BONSAMURAIS / triple-store