sib-swiss / sparql-examples

A set of SPARQL examples that are used in different SIB resources
https://sib-swiss.github.io/sparql-examples

Export all federated queries to create a real-world benchmark for federated queries #40

Open vemonet opened 1 month ago

vemonet commented 1 month ago

This repository contains a lot of complex federated queries to large endpoints.

It would be useful to provide instructions for exporting all federated queries, so that they can serve as a benchmark for federated query systems.

A comparable benchmark is: https://github.com/dice-group/LargeRDFBench

But a benchmark built from this repository would provide queries that are actually used in the real world.

constraintAutomaton commented 3 weeks ago

@vemonet, I made this script to extract the queries:

https://github.com/constraintAutomaton/sib-swiss-federated-query-extractor

I changed the queries provided in the repo because they do not seem to work with the data model. I used this one instead:

PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX spex: <https://purl.expasy.org/sparql-examples/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?queryID ?federatedEndpoint ?comment ?query ?target  WHERE {
  ?queryID sh:select ?query .
  ?queryID spex:federatesWith ?federatedEndpoint .
  ?queryID rdfs:comment ?comment .
  ?queryID <https://schema.org/target> ?target
}

At least on my side, no query had more than one <https://schema.org/target>, and the number of spex:federatesWith values seems to match the number of endpoints in the federation.
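
For anyone who wants to repeat that check, here is a rough Python sketch. It assumes the result rows of the query above have already been loaded as a list of dicts keyed by the SELECT variables; the function names, the sample rows, and the example.org endpoints are all made up for illustration. The SERVICE regex is only a heuristic: it finds endpoints written as plain IRIs and ignores variables or prefixed names.

```python
import re
from collections import defaultdict

# Heuristic: matches `SERVICE <iri>` and `SERVICE SILENT <iri>` only.
SERVICE_RE = re.compile(r"\bSERVICE\s+(?:SILENT\s+)?<([^>]+)>", re.IGNORECASE)


def summarize(rows):
    """Collect, per queryID, the distinct targets, federatesWith values,
    and SERVICE IRIs found in the query text."""
    summary = defaultdict(
        lambda: {"targets": set(), "federatesWith": set(), "services": set()}
    )
    for row in rows:
        entry = summary[row["queryID"]]
        entry["targets"].add(row["target"])
        entry["federatesWith"].add(row["federatedEndpoint"])
        entry["services"].update(SERVICE_RE.findall(row["query"]))
    return dict(summary)


def multi_target_queries(summary):
    """Return the queryIDs that have more than one distinct target."""
    return sorted(qid for qid, e in summary.items() if len(e["targets"]) > 1)


# Made-up rows (hypothetical IDs and endpoints), one row per
# (query, federated endpoint, target) combination.
Q1 = (
    "SELECT * WHERE { SERVICE <https://example.org/b/sparql> { ?s ?p ?o } "
    "SERVICE SILENT <https://example.org/c/sparql> { ?s ?q ?r } }"
)
Q2 = "SELECT * WHERE { SERVICE <https://example.org/b/sparql> { ?s ?p ?o } }"
SAMPLE_ROWS = [
    {"queryID": "ex:q1", "federatedEndpoint": "https://example.org/b/sparql",
     "query": Q1, "target": "https://example.org/a/sparql"},
    {"queryID": "ex:q1", "federatedEndpoint": "https://example.org/c/sparql",
     "query": Q1, "target": "https://example.org/a/sparql"},
    {"queryID": "ex:q2", "federatedEndpoint": "https://example.org/b/sparql",
     "query": Q2, "target": "https://example.org/a/sparql"},
    {"queryID": "ex:q2", "federatedEndpoint": "https://example.org/b/sparql",
     "query": Q2, "target": "https://example.org/d/sparql"},
]
```

Calling `multi_target_queries(summarize(rows))` on the real rows should return an empty list if the observation above holds. Comparing `federatesWith` with `services` per query gives a quick, if imperfect, view of whether the annotations match the SERVICE clauses.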

constraintAutomaton commented 3 weeks ago

Maybe I can document how I've done it and provide my repo as an example, after some cleanup. Unless I made a mistake somewhere.

vemonet commented 3 weeks ago

Thanks @constraintAutomaton, that's nice! A few remarks:

Something a bit like:


{
  "queries": [ 
    {
    "uri": "https://www.bgee.org/sparql/.well-known/sparql-examples/020",
    "endpoint": "https://www.bgee.org/sparql/",
    "query": "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\nPREFIX up: <http://purl.uniprot.org/core/>\nPREFIX genex: <http://purl.org/genex#>\nPREFIX obo: <http://purl.obolibrary.org/obo/>\nPREFIX orth: <http://purl.org/net/orth#>\nPREFIX dcterms: <http://purl.org/dc/terms/>\nPREFIX sio: <http://semanticscience.org/resource/>\n\nSELECT DISTINCT ?flyEnsemblGene ?orthologTaxon ?orthologEnsemblGene ?orthologOmaLink WHERE {\n\t{\n        SELECT DISTINCT ?gene ?flyEnsemblGene {\n        ?gene a orth:Gene ;\n            genex:isExpressedIn/rdfs:label 'eye' ;\n            orth:organism/obo:RO_0002162 ?taxon ;\n            dcterms:identifier ?flyEnsemblGene .\n        ?taxon up:commonName 'fruit fly' .\n        } LIMIT 100\n    }\n    SERVICE <https://sparql.omabrowser.org/sparql> {\n        ?protein2 a orth:Protein .\n        ?protein1 a orth:Protein .\n        ?clusterPrimates a orth:OrthologsCluster .\n        ?cluster a orth:OrthologsCluster ;\n            orth:hasHomologousMember ?node1 ;\n            orth:hasHomologousMember ?node2 .\n        ?node1 orth:hasHomologousMember* ?protein1 .\n        ?node2 orth:hasHomologousMember* ?clusterPrimates .\n        ?clusterPrimates orth:hasHomologousMember* ?protein2 .\n        ?protein1 sio:SIO_010079 ?gene . # is encoded by\n        ?protein2 rdfs:seeAlso ?orthologOmaLink ;\n            orth:organism/obo:RO_0002162 ?orthologTaxonUri ;\n            sio:SIO_010079 ?orthologGene . # is encoded by\n        ?clusterPrimates orth:hasTaxonomicRange ?taxRange .\n        ?taxRange orth:taxRange 'Primates' .\n        FILTER ( ?node1 != ?node2 )\n    }\n    ?orthologTaxonUri up:commonName ?orthologTaxon .\n    ?orthologGene dcterms:identifier ?orthologEnsemblGene .\n}",
    "description": "Which are the genes in Primates orthologous to a gene that is expressed in the fruit fly's eye?",
    "federatesWith": [
      "https://www.bgee.org/sparql/",
      "https://sparql.omabrowser.org/sparql"
    ]
    }
    ...
  ],
  "metadata": ...
}

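
Getting from the extraction query's result rows to a file of this shape could look roughly like the sketch below. It is only an illustration under a few assumptions: the rows are dicts keyed by the SELECT variables (queryID, federatedEndpoint, comment, query, target); "uri" is taken from ?queryID, "endpoint" from ?target, and "description" from ?comment; the function name and the example.org sample rows are made up. "federatesWith" simply collects the spex:federatesWith values as they come, and "metadata" is left out because its content is not specified here.

```python
import json


def rows_to_benchmark(rows):
    """Group one-row-per-federated-endpoint results into one entry per query."""
    queries = {}
    for row in rows:
        entry = queries.setdefault(
            row["queryID"],
            {
                "uri": row["queryID"],          # assumption: ?queryID is the query URI
                "endpoint": row["target"],      # assumption: schema:target is the endpoint
                "query": row["query"],
                "description": row["comment"],  # assumption: rdfs:comment is the description
                "federatesWith": [],
            },
        )
        # Keep first-seen order and skip duplicates.
        if row["federatedEndpoint"] not in entry["federatesWith"]:
            entry["federatesWith"].append(row["federatedEndpoint"])
    return {"queries": list(queries.values())}


# Made-up rows with hypothetical IDs and endpoints.
SAMPLE_QUERY = "SELECT * WHERE { SERVICE <https://example.org/b/sparql> { ?s ?p ?o } }"
SAMPLE_ROWS = [
    {"queryID": "https://example.org/examples/001",
     "federatedEndpoint": "https://example.org/a/sparql",
     "comment": "A made-up example query.",
     "query": SAMPLE_QUERY,
     "target": "https://example.org/a/sparql"},
    {"queryID": "https://example.org/examples/001",
     "federatedEndpoint": "https://example.org/b/sparql",
     "comment": "A made-up example query.",
     "query": SAMPLE_QUERY,
     "target": "https://example.org/a/sparql"},
]

if __name__ == "__main__":
    print(json.dumps(rows_to_benchmark(SAMPLE_ROWS), indent=2))
```

Using `json.dumps` also takes care of escaping the newlines and tabs in the query strings.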
constraintAutomaton commented 2 weeks ago

Thanks @vemonet! I've made the changes.