Open vemonet opened 4 weeks ago
Using requests with the most logical config to query a SPARQL endpoint just works, so the problem is SPARQLWrapper doing weird things internally:
import requests
from rdflib import Graph
query = """PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
CONSTRUCT
{
?protein a up:HumanProtein .
}
WHERE
{
?protein a up:Protein .
?protein up:organism taxon:9606
} LIMIT 10"""
response = requests.post(
    "https://sparql.uniprot.org/sparql/",
    headers={"Accept": "text/turtle"},
    data={"query": query},
    timeout=60,
)
response.raise_for_status()
g = Graph()
g.parse(data=response.text, format="turtle")
print(response.text)
print(len(g))
As a bonus, basic features like the timeout work! (The .setTimeout() option from SPARQLWrapper does not work at all, at least for the UniProt endpoint, but that should go in another issue.)
UniProt is not pure Virtuoso: it has some middleware that expects the Accept header to ask for an RDF format when the query is a DESCRIBE or CONSTRUCT.
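To illustrate the point about Accept headers, here is a minimal sketch of picking the header from the query form. The helper names `query_form` and `accept_header_for` are hypothetical (they are not part of SPARQLWrapper or requests), and the keyword detection is a rough heuristic that ignores `#` comments, so treat it as an assumption rather than a full SPARQL parser:

```python
import re

# Prologue declarations (PREFIX / BASE) are removed first so that IRIs and
# prefix names cannot be mistaken for the query form keyword.
_PROLOGUE = re.compile(
    r"(?:PREFIX\s+[^\s:]*:\s*<[^>]*>|BASE\s*<[^>]*>)", re.IGNORECASE
)
_QUERY_FORM = re.compile(r"\b(SELECT|ASK|CONSTRUCT|DESCRIBE)\b", re.IGNORECASE)


def query_form(query: str) -> str:
    """Return the first query form keyword found after the prologue (heuristic)."""
    body = _PROLOGUE.sub(" ", query)
    match = _QUERY_FORM.search(body)
    if match is None:
        raise ValueError("could not determine the SPARQL query form")
    return match.group(1).upper()


def accept_header_for(query: str) -> str:
    """Ask for an RDF format for CONSTRUCT/DESCRIBE, a results format otherwise."""
    if query_form(query) in {"CONSTRUCT", "DESCRIBE"}:
        return "text/turtle"
    return "application/sparql-results+json"
```

The returned value could then be passed as headers={"Accept": accept_header_for(query)} in the requests.post call shown above.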
@JervenBolleman SPARQLWrapper also fails to run SELECT queries against SwissLipids (https://beta.sparql.swisslipids.org/). It fails with "Error 500 Internal Server Error</h1><p>The server was not able to handle your request.":
from SPARQLWrapper import XML, SPARQLWrapper, JSON
query = """PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?comment ?query
WHERE
{
?sq a sh:SPARQLExecutable ;
rdfs:label|rdfs:comment ?comment ;
sh:select|sh:ask|sh:construct|sh:describe ?query .
}"""
sparql_endpoint = SPARQLWrapper("https://beta.sparql.swisslipids.org/")
sparql_endpoint.setReturnFormat(XML)
sparql_endpoint.setTimeout(60)
sparql_endpoint.setQuery(query)
results = sparql_endpoint.query().convert()
print(results)
With requests it works:
import requests
query = """PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?comment ?query
WHERE
{
?sq a sh:SPARQLExecutable ;
rdfs:label|rdfs:comment ?comment ;
sh:select|sh:ask|sh:construct|sh:describe ?query .
}"""
response = requests.post(
    "https://beta.sparql.swisslipids.org/",
    headers={
        "Accept": "application/json",
        "User-agent": "sparqlwrapper 2.0.1a0 (rdflib.github.io/sparqlwrapper)",
    },
    data={"query": query},
    timeout=60,
)
try:
    response.raise_for_status()
    print(response.json())
except requests.exceptions.HTTPError as e:
    print(e)
    print(response.text)
When running any CONSTRUCT or DESCRIBE query on the UniProt SPARQL endpoint https://sparql.uniprot.org/sparql/, whatever the return format asked for (XML, turtle), SPARQLWrapper fails to resolve the query.
Code to reproduce:
When asking for XML at least an error is thrown:
Error message:
When asking for turtle, SPARQLWrapper does not even throw an error:
Printing results gives HTML:
b'<!DOCTYPE html SYSTEM "about:legacy-compat">\n<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head><title>UniProt</title>......
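Since the turtle case fails silently, a small guard can help spot an HTML page coming back in place of RDF. This is only a sketch: `is_html_response` is a hypothetical helper, not something SPARQLWrapper provides, and the body check is a simple prefix heuristic:

```python
def is_html_response(content_type: str, body: bytes) -> bool:
    """Heuristically detect an HTML page returned instead of RDF or results."""
    # Drop parameters such as "; charset=UTF-8" before comparing media types.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in {"text/html", "application/xhtml+xml"}:
        return True
    # Fall back to sniffing the start of the body for an HTML doctype or root tag.
    head = body.lstrip()[:15].lower()
    return head.startswith((b"<!doctype html", b"<html"))
```

With requests, this could be called as is_html_response(response.headers.get("Content-Type", ""), response.content) before handing the body to an RDF parser.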
UniProt uses OpenLink Virtuoso and supports the SPARQL 1.1 Standard.