Access the web app at index.semanticscience.org
Query our knowledge graph using the OpenAPI at grlc.io/api-git/vemonet/shapes-of-you/subdir/api (powered by grlc.io and SPARQL)
Directly query the SPARQL endpoint on YASGUI at https://graphdb.dumontierlab.com/repositories/shapes-registry.
The SPARQL endpoint is also accessible in the web app's Active endpoints tab, since Shapes of You indexes its own SPARQL query files and computes metadata for its own SPARQL endpoint.
Shapes of You is a global index for semantically descriptive files published to public Git repositories (GitHub, GitLab, and Gitee). It enables semantic web enthusiasts to connect those standard knowledge definitions to active Linked Open Data access points (SPARQL endpoints).
To be found by our indexer, make sure your repository description or topics on GitHub, GitLab, or Gitee include one of the resources mentioned below. We automatically index files from public repositories every week on Saturday at 1:00 GMT+1.
SHACL shapes: RDF files (.ttl, .rdf, .jsonld, etc), with all sh:NodeShape they contain
ShEx schemas: .shex files, and ShEx shapes defined in RDF files
SPARQL queries: .rq and .sparql files, plus metadata parsed from grlc.io APIs
OWL ontologies: the owl:Class they contain
SKOS vocabularies: the skos:Concept they contain
RML mappings: the r2rml:SubjectMap and rml:LogicalSource they contain
R2RML mappings: the r2rml:SubjectMap they contain
CSVW: the csvw:Column they contain
Nanopublication templates: the nt:AssertionTemplate and inputs they contain
OBO ontologies: .obo files with all terms they contain
OpenAPI: .yml, .yaml and .json files, whose spec is parsed to retrieve API metadata
DCAT datasets: the dcat:Dataset they contain

If your repository or endpoint is missed by our indexer:
Additional GitHub repositories in the file EXTRAS_GITHUB_REPOSITORIES.txt
Additional SPARQL endpoints in the file EXTRAS_SPARQL_ENDPOINTS.txt
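To illustrate the file types listed above, here is a minimal sketch of how a file path could be mapped to a candidate category from its extension alone. The function name `guess_candidate_type` and the category labels are hypothetical and not taken from the project. The actual indexer in the etl folder may work differently, and it also inspects file contents (for example to find sh:NodeShape or owl:Class), which this sketch does not attempt.

```python
from pathlib import PurePosixPath
from typing import Optional

# Extensions come from the list above; the category labels are illustrative only.
EXTENSION_CATEGORIES = {
    ".ttl": "rdf",
    ".rdf": "rdf",
    ".jsonld": "rdf",
    ".shex": "shex",
    ".rq": "sparql",
    ".sparql": "sparql",
    ".obo": "obo",
    ".yml": "openapi-candidate",
    ".yaml": "openapi-candidate",
    ".json": "openapi-candidate",
}


def guess_candidate_type(file_path: str) -> Optional[str]:
    """Return a candidate category for a path, or None if its extension is not listed."""
    suffix = PurePosixPath(file_path).suffix.lower()
    return EXTENSION_CATEGORIES.get(suffix)


if __name__ == "__main__":
    for path in ["shapes/person.ttl", "queries/get-all.rq", "README.md"]:
        print(path, "->", guess_candidate_type(path))
```

An extension match is only a first filter: a .json file is merely a candidate for an OpenAPI spec until it has been parsed.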
This web service is composed of 4 main parts, described in more detail below:
main branch.

We defined and published a simple schema for our data as an OWL ontology, mainly reusing schema.org concepts.
Check out the OWL ontology in website/assets/shapes-of-you-ontology.ttl
Here is an overview of the ontology (generated by gra.fo):
Just copy/paste this if you are missing some prefixes to query the Shapes of You knowledge graph:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX sio: <http://semanticscience.org/resource/SIO_>
PREFIX schema: <https://schema.org/>
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX shex: <http://www.w3.org/ns/shex#>
PREFIX void: <http://rdfs.org/ns/void#>
PREFIX void-ext: <http://ldf.fi/void-ext#>
PREFIX sdm: <https://w3id.org/vocab/sdm#>
PREFIX r2rml: <http://www.w3.org/ns/r2rml#>
PREFIX rml: <http://semweb.mmlab.be/ns/rml#>
PREFIX nt: <https://w3id.org/np/o/ntemplate/>
PREFIX csvw: <http://www.w3.org/ns/csvw#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
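As a usage example for these prefixes, the sketch below builds a SPARQL Protocol GET request against the endpoint mentioned at the top of this README. It assumes that indexed files are typed schema:SoftwareSourceCode and linked to their repository with schema:codeRepository, as the ontology suggests. The helper names `build_sparql_request` and `run_query` are made up for this example, and the query has not been checked against the live data.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

ENDPOINT = "https://graphdb.dumontierlab.com/repositories/shapes-registry"

# Assumed data pattern: files are schema:SoftwareSourceCode with a schema:codeRepository.
QUERY = """
PREFIX schema: <https://schema.org/>
SELECT ?file ?repository WHERE {
  ?file a schema:SoftwareSourceCode ;
        schema:codeRepository ?repository .
} LIMIT 10
"""


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def run_query(endpoint: str, query: str) -> list:
    """Send the query and return the result bindings (needs network access)."""
    with urlopen(build_sparql_request(endpoint, query), timeout=30) as response:
        return json.load(response)["results"]["bindings"]


if __name__ == "__main__":
    # Only print the request URL here; call run_query(ENDPOINT, QUERY) to execute it.
    print(build_sparql_request(ENDPOINT, QUERY).full_url)
```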
schema:SoftwareSourceCode
dct:hasPart
rdfs:comment
schema:codeRepository > schema:DataCatalog
sh:Shape (SHACL shape)
shex:Schema (ShEx schema)
sh:SPARQLFunction (SPARQL query) - additional properties: void:sparqlEndpoint, schema:query
owl:Ontology (OWL ontology)
skos:ConceptScheme (SKOS vocabulary)
sio:000623 (OBO ontology)
schema:APIReference (OpenAPI)
rml:LogicalSource (RML and YARRRML mappings)
r2rml:TriplesMap (R2RML mappings)
nt:AssertionTemplate (Nanopublication templates)
dcat:Dataset (DCAT datasets)
schema:DataCatalog
rdfs:comment
schema:EntryPoint
Requirements: npm and yarn installed.
Clone the repository:
git clone https://github.com/vemonet/shapes-of-you
cd shapes-of-you
Install dependencies :inbox_tray:
yarn
Run the web app on http://localhost:19006. It should reload automatically on each change to the code :arrows_clockwise:
yarn dev
Upgrade the package versions in yarn.lock
yarn upgrade
This website is automatically deployed by a GitHub Actions workflow to GitHub Pages which is accessed from http://index.semanticscience.org
You can also build locally in the /web-build folder and serve on http://localhost:5000 (check out the Dockerfile)
yarn build
yarn serve
Deploy the Oxigraph triplestore and ElasticSearch index using Docker :whale: (requires docker installed)
mkdir -p /data/shapes-of-you/elasticsearch
sudo chown -R 1000:0 /data/shapes-of-you/elasticsearch
docker-compose up -d
Check out the docker-compose.yml file to see how we run the Docker image.
Requirements: Python 3.6+, git
This script is run every day by the mighty .github/workflows/index-shapes.yml workflow
The Python script retrieves shapes files from the APIs of popular Git services (GitHub GraphQL API, GitLab API, Gitee API) and generates RDF data. The RDF data is then automatically published to the publicly available triplestore by the GitHub workflow.
You can find the Python scripts and requirements in the etl folder.
Use these commands to locally define the API_GITHUB_TOKEN, GITLAB_TOKEN and GITEE_TOKEN environment variables required to run the script (you might need to adapt on Windows, but you should know better than me):
export API_GITHUB_TOKEN=MYGITHUBTOKEN000
export GITLAB_TOKEN=MYGITLABTOKEN000
export GITEE_TOKEN=MYGITEETOKEN000
Add those commands to your .zshrc or .bashrc to make them permanent
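A script that depends on these three variables can check them up front and fail with a clear message. The sketch below shows one way to do that. The helper `load_tokens` is hypothetical, and whether main.py validates its environment this way is not stated here.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_TOKENS = ("API_GITHUB_TOKEN", "GITLAB_TOKEN", "GITEE_TOKEN")


def load_tokens(environ=os.environ) -> dict:
    """Return the three tokens, raising a clear error if any is unset or empty."""
    missing = [name for name in REQUIRED_TOKENS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_TOKENS}


if __name__ == "__main__":
    # Demonstration with placeholder values instead of the real environment.
    demo = {"API_GITHUB_TOKEN": "x", "GITLAB_TOKEN": "y", "GITEE_TOKEN": "z"}
    print(sorted(load_tokens(demo)))
```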
For GitHub you can create a new GitHub API key (aka. personal access token) at https://github.com/settings/tokens
Go to the etl folder:
cd etl
Install the requirements:
pip install -e .
Retrieve shapes files by searching the GitHub GraphQL API (you can also use a topic to search, e.g. topic:sparql):
python3 main.py github vemonet/shapes-of-you
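For readers unfamiliar with the GitHub GraphQL API, the sketch below shows roughly what a repository search request looks like. It is not the query used by main.py: the selected fields, the page size of 20 and the helper name `build_search_request` are assumptions made for illustration. The request is only built, not sent.

```python
import json
from urllib.request import Request

GITHUB_GRAPHQL_URL = "https://api.github.com/graphql"

# Illustrative query: search repositories and page through results with a cursor.
SEARCH_QUERY = """
query ($searchQuery: String!, $cursor: String) {
  search(query: $searchQuery, type: REPOSITORY, first: 20, after: $cursor) {
    repositoryCount
    pageInfo { endCursor hasNextPage }
    edges {
      node {
        ... on Repository { nameWithOwner url description }
      }
    }
  }
}
"""


def build_search_request(search_query: str, token: str, cursor=None) -> Request:
    """Build (without sending) a POST request for a GitHub repository search."""
    payload = json.dumps(
        {"query": SEARCH_QUERY, "variables": {"searchQuery": search_query, "cursor": cursor}}
    )
    return Request(
        GITHUB_GRAPHQL_URL,
        data=payload.encode("utf-8"),
        headers={"Authorization": "bearer " + token, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_search_request("topic:sparql", "MYGITHUBTOKEN000")
    print(request.get_method(), request.full_url)
```

To fetch further pages, pass the `endCursor` from the previous response as `cursor` for as long as `hasNextPage` is true.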
Retrieve shapes files from the GitLab API using the python-gitlab package:
python3 main.py gitlab sparql
Retrieve shapes files from Gitee API:
python3 main.py gitee ontology
This task is performed every day by the swifty .github/workflows/analyze-endpoints.yml workflow
We use the d2s tool (aka. data2services) to generate HCLS metadata for a SPARQL endpoint:
pip install d2s
d2s metadata analyze https://graphdb.dumontierlab.com/repositories/shapes-registry -o metadata.ttl
We commit the generated metadata file to the metadata branch, as an experiment in using git to version the metadata generated for the SPARQL endpoints and track how it changes over time.
Enable WebDAV LDP on Virtuoso 7 (from the official Virtuoso documentation)
Start the virtuoso-opensource-7 docker image
docker-compose up -d
The first time you start Virtuoso, or after you reset the database, you will need to run this script to prepare the Linked Data Platform:
./prepare_virtuoso.sh
To prepare for shapes-of-you, create the folders github, gitlab, gitee, apis and endpoints using the same owner and permissions as for the ldp folder.
Test by uploading a turtle file to the LDP (change the password before):
curl -u ldp:$ENDPOINT_PASSWORD --data-binary @shapes-rdf.ttl -H "Accept: text/turtle" -H "Content-type: text/turtle" -H "Slug: test-shapes-rdf" https://data.index.semanticscience.org/DAV/home/ldp/github
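The same upload can be prepared from Python. The sketch below mirrors the curl command above: a POST with HTTP Basic authentication, the Turtle content types and a Slug header. The helper name `build_upload_request` is hypothetical, and the request is only built here, not sent.

```python
import base64
from urllib.request import Request

# URL taken from the curl command above.
LDP_URL = "https://data.index.semanticscience.org/DAV/home/ldp/github"


def build_upload_request(
    turtle_bytes: bytes, slug: str, user: str, password: str, url: str = LDP_URL
) -> Request:
    """Build (without sending) a POST request uploading Turtle data to the LDP."""
    credentials = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return Request(
        url,
        data=turtle_bytes,
        headers={
            "Authorization": "Basic " + credentials,
            "Accept": "text/turtle",
            "Content-Type": "text/turtle",
            "Slug": slug,
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_upload_request(b"<urn:a> <urn:b> <urn:c> .", "test-shapes-rdf", "ldp", "changeme")
    print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the upload, so use a test slug and the real password only when you mean to write to the server.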
Enable CORS to query the Virtuoso SPARQL endpoint from JavaScript. See the Virtuoso CORS documentation.
/sparql Logical Path > click Edit
Enter * in the Cross-Origin Resource Sharing input field.

Contributions are welcome! See the guidelines to contribute.
RDF data hosted in an Oxigraph triplestore (open source)
OpenAPI powered by grlc.io
SPARQL query UI powered by Triply's YASGUI
Ontology built with gra.fo
Data processing workflows run for free using GitHub Actions open source plan
Files parsed using Python libraries: rdflib, obonet, prance