casework / CASE-Utility-RDFDiff

Diff tool for comparing RDF schemas/ontologies; can be applied to both CASE and UCO.
Apache License 2.0

Cyber-investigation Analysis Standard Expression (CASE)

Read the CASE Wiki tab to learn everything you need to know about the Cyber-investigation Analysis Standard Expression (CASE) ontology. For learning about the Unified Cyber Ontology, CASE's parent, see UCO.

RDFDiff

An RDF and ontology troubleshooter for CASE and UCO.

What it does

RDFDiff takes the output of a tool (JSON/JSON-LD/XML/etc.) and attempts to validate it against an RDF-based ontology (OWL/N3/TTL). Any entry in the tool's output that is not in the ontology (specified via the CLI) is reported as an error. CLI arguments can be added to start a debugger so you can explore the graph and see where things went wrong.

How it works

rdfdiff.py reads an RDF vocabulary given via -g (the glossary) and checks a custom tool's output, given via -i, against it. The Python library rdflib turns the RDF schema into triples, which are then broken into three lists: subject, predicate, and object. Each subject and predicate in the tool's output is then checked against the glossary's subjects and predicates to confirm that these RDF elements exist. Each element is reported to the user as found or not found. BNodes (blank nodes) are skipped because they have no appropriate label in RDF and should not be used to verify an ontology.
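The checking step described above can be sketched in plain Python. This is a minimal illustration and not the code in rdfdiff.py. It makes three assumptions: triples are hand-written string tuples rather than rdflib terms, blank nodes are recognized by a `_:` prefix, and a term counts as found if it appears among either the glossary's subjects or its predicates. The function names and the `ex:` terms are hypothetical.

```python
def is_blank(term):
    """Assumption: blank nodes are written with an N-Triples style '_:' prefix."""
    return term.startswith("_:")


def split_triples(triples):
    """Break (subject, predicate, object) triples into three lists, skipping blank nodes."""
    subjects, predicates, objects = [], [], []
    for s, p, o in triples:
        if not is_blank(s):
            subjects.append(s)
        predicates.append(p)
        if not is_blank(o):
            objects.append(o)
    return subjects, predicates, objects


def check_against_glossary(glossary_triples, tool_triples):
    """Return (found, missing) for the subjects and predicates of the tool output."""
    g_subjects, g_predicates, _ = split_triples(glossary_triples)
    t_subjects, t_predicates, _ = split_triples(tool_triples)
    # Assumption: a term is known if the glossary uses it as a subject or a predicate.
    known = set(g_subjects) | set(g_predicates)
    # De-duplicate the tool's terms while keeping first-seen order.
    terms = list(dict.fromkeys(t_subjects + t_predicates))
    found = [t for t in terms if t in known]
    missing = [t for t in terms if t not in known]
    return found, missing


# Hypothetical glossary (ontology) and tool output.
glossary = [
    ("ex:File", "rdf:type", "owl:Class"),
    ("ex:fileName", "rdf:type", "owl:DatatypeProperty"),
    ("_:b0", "rdf:type", "owl:Restriction"),
]
tool_output = [
    ("ex:File", "ex:fileName", "report.pdf"),
    ("ex:File", "ex:fileSize", "1024"),
    ("_:b1", "ex:fileName", "notes.txt"),
]

found, missing = check_against_glossary(glossary, tool_output)
for term in found:
    print("found:    ", term)
for term in missing:
    print("NOT found:", term)
```

With rdflib, the hand-written tuples would be replaced by the triples of a parsed graph, and the blank-node test would be an `isinstance` check against `rdflib.BNode`.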

Why not SPARQL?

To support a broad range of ontologies and custom tool outputs, SPARQL queries are not used for verification. CASE and UCO allow a great deal of flexibility, and this tool aims to complement that approach.

Installation

sudo pip install -r requirements.txt 

Unit Tests

The ontology validator relies heavily on third-party libraries, primarily RDFLib, which is under heavy development. To ensure compatibility with new releases, unit tests have been written to check for consistency.

Run unit tests:

cd tests;
python test_verifier.py;

CLI Usage

CLI Example

rdfdiff.py -g case.ttl -gf turtle -i output.json-ld -if json-ld --verify=1

* Print all graphs for the tool's schema.

rdfdiff.py -g case.ttl -gf turtle -i output.json-ld -if json-ld -tg=1


* Print all graphs for the ontology's schema.

rdfdiff.py -g case.ttl -gf turtle -i output.json-ld -if json-ld -gg=1
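For readers wondering how the flags in the examples above fit together, here is a hedged sketch of an `argparse` parser that accepts the same command lines. It is not the parser from rdfdiff.py. The `dest` names and help strings are hypothetical, and reading `-gf` and `-if` as format selectors for the glossary and input files is an assumption based on the example values `turtle` and `json-ld`.

```python
import argparse


def build_parser():
    """Hypothetical parser that accepts the flags used in the examples above."""
    parser = argparse.ArgumentParser(prog="rdfdiff.py")
    parser.add_argument("-g", dest="glossary",
                        help="RDF vocabulary (glossary) to check against")
    parser.add_argument("-gf", dest="glossary_format",
                        help="assumed: serialization of the glossary, e.g. turtle")
    parser.add_argument("-i", dest="tool_input",
                        help="tool output to verify")
    # 'if' is a Python keyword, so an explicit dest is needed here.
    parser.add_argument("-if", dest="input_format",
                        help="assumed: serialization of the tool output, e.g. json-ld")
    parser.add_argument("--verify", type=int, default=0,
                        help="assumed: 1 runs the verification")
    parser.add_argument("-tg", dest="tool_graphs", type=int, default=0,
                        help="1 prints all graphs for the tool's schema")
    parser.add_argument("-gg", dest="glossary_graphs", type=int, default=0,
                        help="1 prints all graphs for the ontology's schema")
    return parser


# Parse the first example command line from this section.
example = "-g case.ttl -gf turtle -i output.json-ld -if json-ld --verify=1"
args = build_parser().parse_args(example.split())
print(args.glossary, args.glossary_format, args.tool_input, args.input_format, args.verify)
```

`argparse` accepts the `-tg=1` and `-gg=1` spellings because it splits an unrecognized option string on `=` and then looks up the part before it.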



# I have a question!

Before you post a GitHub issue or send an email, make sure you have gone through this checklist:

1. Determine the [scope](https://caseontology.org/ontology/start.html#scope) of your task. It is not necessary for most parties to understand all aspects of the ontology, mapping methods, and supporting tools.

2. Familiarize yourself with the [labels](https://github.com/casework/RDFDiff/labels) and search the [Issues tab](https://github.com/casework/RDFDiff/issues). Typically, only light-blue and red labels should be used by non-admin GitHub users, while the others should be used by CASE GitHub admins.
*All but the red `Project` labels are found in every [`casework`](https://github.com/casework) repository.*

3. If/when you run into an issue with a given RDF schema format or the verifier.py script, please open an issue with the error and as much technical detail as you can provide.