✨️ Execute SPARQL queries from string, URL or multiple files using the RDF4J framework.
vemonet commented 5 years ago

Will be implemented as a standalone tool in data2services-sparql-operations as "expand" operation using PrefixCommons registry. It is included in split at the moment.

Like split, it will take a class and a property (or a list of them?) used to resolve the value (e.g. bl:Drug bl:id).

This operation will insert all the statements of the class, using the preferred URI as subject. A parameter will allow deleting the previous statements.
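The behaviour described above can be sketched in plain Python over tuples, independently of RDF4J. This is only an illustration under assumptions: the `expand` function, the `delete_previous` flag name, the `preferred` lookup table and the example.org URI are all hypothetical and are not taken from data2services-sparql-operations.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
RDF_TYPE = "rdf:type"


def expand(
    triples: List[Triple],
    cls: str,
    id_property: str,
    preferred: Dict[str, str],
    delete_previous: bool = False,
) -> List[Triple]:
    """Copy the statements of every instance of `cls` onto its preferred URI.

    `preferred` is a hypothetical lookup (identifier value -> preferred URI)
    standing in for a registry such as PrefixCommons.
    """
    # Subjects typed with the requested class.
    subjects = {s for s, p, o in triples if p == RDF_TYPE and o == cls}

    # Map each of those subjects to its preferred URI, when one is known.
    rename: Dict[str, str] = {}
    for s, p, o in triples:
        if s in subjects and p == id_property and o in preferred:
            rename[s] = preferred[o]

    result: List[Triple] = []
    for s, p, o in triples:
        if s in rename:
            if not delete_previous:
                result.append((s, p, o))  # keep the original statement
            result.append((rename[s], p, o))  # statement on the preferred URI
        else:
            result.append((s, p, o))
    return result
```

With `delete_previous=False` the original statements are kept next to the new ones; with `delete_previous=True` only the statements on the preferred URI remain.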

Bonus: add a wrapper on top of BridgeDB to integrate BridgeDB identifiers resolution in data2services-sparql-operations the same way as PrefixCommons.


vemonet commented 5 years ago

Concept for a SPARQL service like WikiData or from the Life Science Registry spreadsheet

Identifiers resolution SPARQL Service specifications

Implementation details

We will build a SPARQL service that can be used to resolve identifiers and URIs to a canonical URI. This Identifiers SPARQL Service will provide a framework and an ontology that let users resolve URIs efficiently through federated SPARQL querying.


The Identifiers SPARQL Service will enable various identifier resolver services (e.g. BridgeDB) to be queried through a public SPARQL service. Each resolver can be accessed through its own graph in the same SPARQL service, so users can choose which resolver to use: one, a subset, or all of them.
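The "one graph per resolver" idea can be illustrated with a small in-memory stand-in. This is a sketch under assumptions, not the planned service: the graph names, the `resolve` function and the example.org URIs are hypothetical, and a dictionary replaces the actual named graphs.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical stand-in for the named graphs: graph name -> {identifier: canonical URI}
RESOLVER_GRAPHS: Dict[str, Dict[str, str]] = {
    "graph:resolver-a": {
        "uniprot:P00734": "https://example.org/uniprot/P00734",
    },
    "graph:resolver-b": {
        "uniprot:P00734": "https://example.org/uniprot/P00734",
        "chembl:CHEMBL25": "https://example.org/chembl/CHEMBL25",
    },
}


def resolve(
    identifier: str,
    graphs: Optional[Iterable[str]] = None,
    store: Dict[str, Dict[str, str]] = RESOLVER_GRAPHS,
) -> List[Tuple[str, str]]:
    """Return (graph, canonical URI) pairs for `identifier`.

    `graphs=None` queries every resolver graph; otherwise only the listed ones,
    which mirrors choosing between `GRAPH ?source` and a fixed graph name.
    """
    selected = list(store) if graphs is None else list(graphs)
    return [
        (graph, store[graph][identifier])
        for graph in selected
        if graph in store and identifier in store[graph]
    ]
```

Returning the graph name alongside each result keeps track of which resolver produced the answer, in the same way the queries further down bind `?source`.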

The resolvers can be connected to the SPARQL Service through various methods

Resolvers to implement

The following identifiers resolvers will be implemented to start:


We will build an ontology to define standard relations between identifiers and URIs, but new properties can be used to define new relations.

PREFIX idot: <>
?ref idot:preferredPrefix "chembl" ;
  idot:alternatePrefix "chembldb" ;
  idot:identifierPattern "CHEMBL\\d+"^^xsd:string ;
  idot:exampleIdentifier "CHEMBL25"^^xsd:string ;
  idot:accessPattern "", 
    "" .
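To show how such a registry entry could be used, here is a minimal Python sketch that normalizes a prefix:id string against the fields above. It rests on several assumptions that are not stated in the thread: the `REGISTRY` structure and the `normalize_curie` function are hypothetical, prefixes are compared case-insensitively, and the identifier pattern must match the whole local identifier.

```python
import re
from typing import Optional

# Hypothetical in-memory copy of the registry entry shown above.
REGISTRY = [
    {
        "preferredPrefix": "chembl",
        "alternatePrefixes": ["chembldb"],
        "identifierPattern": r"CHEMBL\d+",
    },
]


def normalize_curie(curie: str) -> Optional[str]:
    """Rewrite `prefix:id` to use the preferred prefix, or return None.

    None is returned when there is no prefix, the prefix is unknown,
    or the local identifier does not match the entry's pattern.
    """
    prefix, sep, local_id = curie.partition(":")
    if not sep:
        return None
    for entry in REGISTRY:
        known = [entry["preferredPrefix"], *entry["alternatePrefixes"]]
        # Assumption: prefix comparison ignores case, the pattern must match fully.
        if prefix.lower() in known and re.fullmatch(entry["identifierPattern"], local_id):
            return f'{entry["preferredPrefix"]}:{local_id}'
    return None
```

For example, `normalize_curie("chembldb:CHEMBL25")` maps the alternate prefix onto the preferred one, while an identifier that does not match `identifierPattern` is rejected.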

SPARQL query examples for the Life Science Registry

Resolve common URI syntax variants for the same entity using the Bio2RDF Life Science Registry spreadsheet

Get reference URI

From any prefix:id or URI, get the canonical reference (URI).

We use dct:alternative: the LifeScienceRegistry graph resolves all URI variants of the same identifier, e.g. "uniprot:P00734".

PREFIX dct: <>
PREFIX bl: <>
SELECT ?s ?ref ?source WHERE {
  ?s a bl:Drug ;
    dct:identifier ?id .
  SERVICE <> {
    GRAPH ?source {
      ?ref dct:alternative ?id .
    }
  }
}

# Get identifier only from the LifeScienceRegistry service (uses dct:alternative)
SELECT ?s ?ref ?source WHERE {
  ?s a bl:Drug ;
    dct:identifier ?id .
  SERVICE <> {
    GRAPH <> {
      ?ref dct:alternative ?id .
    }
  }
}

Get alternative URIs

From a canonical reference, get all the commonly accepted URIs (we use the dct:alternative property).

PREFIX dct: <>
PREFIX bl: <>
SELECT ?ref ?ids ?source WHERE {
  ?ref a bl:Drug .
  SERVICE <> {
    GRAPH ?source {
      ?ref dct:alternative ?ids .
    }
  }
}

Get alternative IDs

From a canonical reference, get all the available variant IDs of the entity in other databases (with data sources, the relation used, metadata, ...?).

PREFIX dct: <>
PREFIX bl: <>
SELECT ?ref ?p ?ids ?source WHERE {
  ?ref a bl:Drug .
  SERVICE <> {
    GRAPH ?source {
      ?ref ?p ?ids .
      # We could add a filter on ?p to take only predicates about alternative IDs
    }
  }
}

Combination: get reference URI, then all possible alternatives

Support subqueries

PREFIX dct: <>
PREFIX bl: <>
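SELECT ?s ?ref ?alternatives ?sourceRef ?sourceAlt WHERE {
  ?s a bl:Drug ;
    dct:identifier ?id .
  SERVICE <> {
    # ?id and ?sourceRef must be projected, otherwise they are not visible outside the subquery
    { SELECT ?ref ?id ?sourceRef WHERE {
        GRAPH ?sourceRef {
          ?ref dct:alternative ?id .
        }
      }
    }
    GRAPH ?sourceAlt {
      ?ref dct:alternative ?alternatives .
    }
  }
}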

OPTIONAL: get reference URIs from id without prefix

From an id, get all the possible canonical references (URIs) using SPARQL filter.

PREFIX dct: <>
PREFIX bl: <>
# Exact match
SELECT ?s ?ref ?source WHERE {
  ?s a bl:Drug ;
    dct:identifier ?id .
  SERVICE <> {
    GRAPH ?source {
      ?ref dct:identifier ?id .
    }
  }
}
# Regex FILTER to get all URIs starting with https or
SELECT ?s ?ref ?source WHERE {
  SERVICE <> {
    GRAPH ?source {
      ?ref dct:identifier ?id .
      FILTER regex( str(?id), "http[s]?:\/\/\/" )
    }
  }
}

See for label service

SPARQL service:

EBI SPARQL service code:

Example of use

PREFIX owl: <>

    SERVICE <> {
      <> owl:sameAs ?go .
    }

It looks like he is generating the store on the fly from an IdentifiersOrg store:

vemonet commented 5 years ago

Will be implemented as a standalone tool in data2services-sparql-operations as "expand" operation using PrefixCommons registry.

