beeldengeluid / beng-lod-server

LOD server for B&G catalogue
MIT License

Use SPARQL SERVICE directive from rdflib #223

Open wmelder opened 2 years ago

wmelder commented 2 years ago

This is the example given in the rdflib documentation: https://rdflib.readthedocs.io/en/stable/intro_to_sparql.html#querying-a-remote-service

wmelder commented 2 years ago

@gb-beng This is the first response I get when using the code from that example:

import rdflib

def get_datasets(sparql_endpoint: str):
  g = rdflib.Graph()
  qres = g.query(
      """
      PREFIX sdo: <http://schema.org>
      SELECT ?dataset
      WHERE {
        SERVICE <%s> {
          ?s sdo:DataCatalog . ?s sdo:dataset ?dataset
        }
      }
    """ % sparql_endpoint
  )
  datasets =  [row.dataset for row in qres]

gives

pyparsing.exceptions.ParseException: Expected {SelectQuery | ConstructQuery | DescribeQuery | AskQuery}, found 'SERVICE'  (at char 83), (line:5, col:9)

gb-beng commented 2 years ago

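There were two issues with that suggestion: the sdo prefix has to be <https://schema.org/> (https, with a trailing slash), and the triple pattern ?s sdo:DataCatalog is missing the a predicate.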

The following seems to work!

>>> def get_datasets(sparql_endpoint: str):
...   g = rdflib.Graph()
...   qres = g.query(
...       """
...       PREFIX sdo: <https://schema.org/>
...       SELECT ?dataset
...       WHERE {
...         SERVICE <%s> {
...           ?s a sdo:DataCatalog . ?s sdo:dataset ?dataset
...         }
...       }
...     """ % sparql_endpoint
...   )
...   datasets =  [row.dataset for row in qres]
...   return datasets
... 
>>> get_datasets("https://cat.apis.beeldengeluid.nl/sparql")
[rdflib.term.URIRef('http://data.beeldengeluid.nl/id/dataset/0001'), ...]
>>> 

wmelder commented 2 years ago

But suddenly it no longer works:

urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1131)>
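
The traceback points at the server's certificate rather than at the query itself. A minimal sketch for checking this independently of rdflib, using only the Python standard library. The helper names endpoint_host_port and tls_verification_error are made up for illustration and are not part of rdflib or this repository:

```python
import socket
import ssl
from typing import Optional, Tuple
from urllib.parse import urlsplit


def endpoint_host_port(sparql_endpoint: str) -> Tuple[str, int]:
    """Extract host and port from an endpoint URL (hypothetical helper)."""
    parts = urlsplit(sparql_endpoint)
    if not parts.hostname:
        raise ValueError(f"No host found in endpoint URL: {sparql_endpoint!r}")
    # Fall back to the scheme's default port when none is given explicitly.
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


def tls_verification_error(host: str, port: int = 443,
                           timeout: float = 10.0) -> Optional[str]:
    """Return the certificate verification message, or None if the handshake succeeds."""
    context = ssl.create_default_context()  # verifies chain and hostname by default
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # The TLS handshake happens inside wrap_socket.
            with context.wrap_socket(sock, server_hostname=host):
                return None
    except ssl.SSLCertVerificationError as exc:
        return exc.verify_message
```

Calling tls_verification_error(*endpoint_host_port("https://cat.apis.beeldengeluid.nl/sparql")) should return a message if the certificate fails verification and None once it has been renewed.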

wmelder commented 2 years ago

Using prepareQuery and a variable for the service directive doesn't work:

from rdflib import Graph, URIRef
from rdflib.namespace import SDO
from rdflib.plugins.sparql import prepareQuery

def get_datasets(sparql_endpoint: str):
    g = Graph()
    qprep = prepareQuery(
        """SELECT ?dataset
       WHERE {
         SERVICE ?service {
           ?s a sdo:DataCatalog . ?s sdo:dataset ?dataset
         }
       }
        """,
        initNs={"sdo": SDO}
    )
    datasets = [row.dataset for row in g.query(qprep, initBindings={'service': URIRef(sparql_endpoint)})]
    return datasets
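
Since binding the endpoint through a variable did not work here, one possible workaround is to keep the endpoint inline in the SERVICE clause, as in the working example earlier in this thread, but validate it before interpolating it into the query. This is only a sketch: build_datasets_query is a hypothetical helper, and the character check is an assumption about what should be rejected, not a complete IRI validator.

```python
from urllib.parse import urlsplit

# Same query as the working example above, with a slot for the endpoint IRI.
QUERY_TEMPLATE = """
PREFIX sdo: <https://schema.org/>
SELECT ?dataset
WHERE {
  SERVICE <%s> {
    ?s a sdo:DataCatalog . ?s sdo:dataset ?dataset
  }
}
"""

# Characters that must not appear inside a SPARQL IRI reference <...>.
_FORBIDDEN = set('<>"{}|^`\\')


def build_datasets_query(sparql_endpoint: str) -> str:
    """Return the federated query text for a validated endpoint URL."""
    if any(ch in _FORBIDDEN or ch.isspace() for ch in sparql_endpoint):
        raise ValueError(f"Illegal character in endpoint URL: {sparql_endpoint!r}")
    parts = urlsplit(sparql_endpoint)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"Not an http(s) URL: {sparql_endpoint!r}")
    return QUERY_TEMPLATE % sparql_endpoint


def get_datasets(sparql_endpoint: str) -> list:
    """Run the federated query through an empty local rdflib graph."""
    # rdflib is imported here so that the builder above has no third-party dependency.
    import rdflib

    graph = rdflib.Graph()
    return [row.dataset for row in graph.query(build_datasets_query(sparql_endpoint))]
```

Keeping the query construction in its own function makes it testable without network access; get_datasets still depends on the remote endpoint and its certificate being valid.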

wmelder commented 2 years ago

Now trying to bind to an RDF store instead, but this raises an error: ValueError: You did something wrong formulating either the URI or your SPARQL query

from typing import List

from rdflib.namespace import SDO
from rdflib.plugins.stores.sparqlstore import SPARQLStore

def get_datasets(sparql_endpoint: str) -> List[str]:
    rdf_store = SPARQLStore(sparql_endpoint, initNs={"sdo": SDO}, returnFormat="json")
    q = "SELECT ?dataset WHERE { ?s a sdo:DataCatalog . ?s sdo:dataset ?dataset }"
    qres = rdf_store.query(q)
    datasets = [row.dataset for row in qres]
    return datasets

Another option is g = Graph(rdf_store), which would still allow prepareQuery to be used, but at the moment this also raises a ValueError.

wmelder commented 2 years ago

Hmm, but the error above is preceded by ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1131)