IPTC Sport Schema

The next generation of sports data, based on IPTC's SportsML and semantic web principles.

About the IPTC Sport Schema

The IPTC is proud to present the result of years of work: an RDF-based data model covering sports schedules, event results and statistics.

Our development process is based on a set of use cases documented at https://github.com/iptc/sport-model/wiki/Use-Cases

We have created an RDFS/OWL ontology that represents schedules, statistics and results for all levels of all sports, for both human and machine consumption.

Project goals and principles

We want the resulting data model and vocabulary to be:

Repository layout

docs:

HTML documentation published to sportschema.org using GitHub Pages. Includes ontology documentation under docs/ontologies/.

The docs are built with Jekyll. To run a local preview server, run:

bundle exec jekyll serve

queries:

Sample SPARQL queries exercising some of the use cases.

queries/output:

Expected output from each of the sample SPARQL queries.

samples/{n3|ttl|jsonld}/:

Example data files in RDF N3, Turtle (.ttl) and JSON-LD formats.

samples/xml/sportsml:

Examples in SportsML, which the conversion script in the tools folder converts to N3.

tools:

We have created a Bash script that converts the SportsML example files in two steps. First, the Saxon XSLT processor converts the SportsML XML to N3 triples. Then Apache Jena converts the N3 to the more readable Turtle (TTL) and JSON-LD formats.

This repository contains the converted files, but if you need to convert them again, simply run:

tools/convert-examples-to-rdf.sh

If you want to try converting an individual N3 file yourself, you can use Jena's riot tool directly:

riot --formatted=TURTLE tools/prefixes.ttl samples/n3/soccer-match-01.n3
riot --formatted=JSON-LD samples/ttl/soccer-match-01.ttl

Note that the JSON-LD files include the @context section at the bottom of the file, whereas most JSON-LD examples include @context at the top. This is an artefact of the Jena JSON-LD generator and doesn't affect the usefulness of the data files.
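To regenerate every Turtle file rather than a single one, the first riot command above can be wrapped in a loop. This is a minimal sketch and not the repository's own conversion script. It assumes that riot is on the PATH, that it is run from the repository root, and that each output file belongs in samples/ttl under the same base name as its N3 source. The helper names ttl_path_for and convert_all_n3 are hypothetical.

```shell
#!/bin/sh
# Sketch only: a batch version of the single-file riot command shown above.
# Assumptions (not taken from the repository's scripts): run from the
# repository root, riot is on PATH, outputs are named after their N3 source.

# Hypothetical helper: map samples/n3/NAME.n3 to samples/ttl/NAME.ttl.
ttl_path_for() {
    printf 'samples/ttl/%s.ttl\n' "$(basename "$1" .n3)"
}

# Hypothetical helper: convert every N3 sample to Turtle.
convert_all_n3() {
    for n3 in samples/n3/*.n3; do
        # If the glob matched nothing, the literal pattern is left over; skip it.
        [ -e "$n3" ] || continue
        riot --formatted=TURTLE tools/prefixes.ttl "$n3" > "$(ttl_path_for "$n3")"
    done
}

# Only attempt the conversion when riot is actually installed.
if command -v riot >/dev/null 2>&1; then
    convert_all_n3
else
    echo "riot not found; install Apache Jena to run the conversion" >&2
fi
```

Because riot writes to standard output, the sketch redirects each result into a file. A failed conversion would leave an empty or partial file behind, so check riot's exit status before relying on the output.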

To run the test queries and compare their results against the expected output, run:

tools/run-test-queries.sh

This will run each of the example SPARQL queries in the queries folder against all the data files in the samples/ttl folder. It compares the output of the SPARQL queries against the corresponding file in the queries/output folder. If there are any discrepancies, they will be displayed inline.
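The comparison step described above can be sketched as follows. This illustrates the idea only and is not the contents of tools/run-test-queries.sh. The function name compare_output is hypothetical, the demonstration files contain made-up placeholder text, and the sketch assumes the query result has already been written to a file.

```shell
#!/bin/sh
# Sketch only: compare one query's actual output with its expected output.
# compare_output is a hypothetical helper, not part of the repository.

compare_output() {
    expected=$1
    actual=$2
    # diff prints any discrepancies inline and exits non-zero if the files differ.
    if diff "$expected" "$actual"; then
        echo "PASS: $actual"
    else
        echo "FAIL: $actual differs from $expected"
        return 1
    fi
}

# Self-contained demonstration with temporary files holding placeholder text.
tmpdir=$(mktemp -d)
printf 'row one\nrow two\n' > "$tmpdir/expected.txt"
printf 'row one\nrow two\n' > "$tmpdir/same.txt"
printf 'row one\nrow 2\n' > "$tmpdir/changed.txt"

compare_output "$tmpdir/expected.txt" "$tmpdir/same.txt"
compare_output "$tmpdir/expected.txt" "$tmpdir/changed.txt" || echo "discrepancy reported"

rm -rf "$tmpdir"
```

In a real run, the actual file would hold the output of one sample query and the expected file would be its counterpart in the queries/output folder.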

We have also created a test that runs the queries against a local instance of the Fuseki server and compares the results against the files in the queries/fuseki-output folder:

tools/run-test-queries-fuseki.sh

More detailed documentation

More documentation is available at https://www.sportschema.org/