LeMyst / WikibaseIntegrator

A Python module to manipulate data on a Wikibase instance (like Wikidata) through the MediaWiki Wikibase API and the Wikibase SPARQL endpoint.
MIT License

Adding Prefixes to SPARQL query helper causes "does not look like a valid URI, trying to serialize this will break" #741

Open Superraptor opened 1 month ago

Superraptor commented 1 month ago

This isn't a huge issue, just more annoying with large datasets, but essentially, SPARQL queries work regardless of whether prefixes are defined in them or not.

However, with prefixes, I always get a warning like this: "PREFIX wd: <https://wikiname.wikibase.cloud/entity/> PREFIX wdt: <https://wikiname.wikibase.cloud/prop/direct/> SELECT ?p ?o WHERE { <https://wikiname.wikibase.cloud/entity/Q4087> ?p ?o } does not look like a valid URI, trying to serialize this will break." Since I'm running thousands of queries, this pops up a lot.

Is there either (a) a way to stop this warning or solve whatever serialization error is occurring (either by declaring prefixes in a different place or changing the query), or (b) a way to silence or suppress these warnings if (a) isn't possible?

Thanks so much!

LeMyst commented 1 month ago

Hello @Superraptor, I don't understand your issue; maybe there is a missing part. Do you have a code example to share? If you don't use prefixes, you don't need to declare them. I can't reproduce your issue on Wikidata Query Service. Maybe some of your elements have data issues, which would explain why you don't see the issue every time? https://w.wiki/Agye

Superraptor commented 1 month ago

@LeMyst thanks for responding!

So I'm basically running this:

queryPrefixes = "PREFIX wd: <"+namespaces['wb']+"> PREFIX wdt: <"+namespaces['p']+"> "

queryResult1 = sparqlQuery(queryPrefixes+"SELECT ?p ?o WHERE { <"+resp["s"]["value"]+"> ?p ?o }", wbi_config) # This resolves to, for example: PREFIX wd: <https://wikiname.wikibase.cloud/entity/> PREFIX wdt: <https://wikiname.wikibase.cloud/prop/direct/> SELECT ?p ?o WHERE { <https://wikiname.wikibase.cloud/entity/Q4087> ?p ?o }

def sparqlQuery(query, wbi_config):
    responseJSON = wbi_helpers.execute_sparql_query(query, None, wbi_config['SPARQL_ENDPOINT_URL'], None, 10, 1)
    return responseJSON

Each time this runs, it works, but it prints this to console:

PREFIX wd: <https://wikiname.wikibase.cloud/entity/> PREFIX wdt: <https://wikiname.wikibase.cloud/prop/direct/> SELECT ?p ?o WHERE { <https://wikiname.wikibase.cloud/entity/Q4087> ?p ?o }  does not look like a valid URI, trying to serialize this will break.

If I run this:

queryResult1 = sparqlQuery("SELECT ?p ?o WHERE { <"+resp["s"]["value"]+"> ?p ?o }", wbi_config) # This resolves to, for example: SELECT ?p ?o WHERE { <https://wikiname.wikibase.cloud/entity/Q4087> ?p ?o }

def sparqlQuery(query, wbi_config):
    responseJSON = wbi_helpers.execute_sparql_query(query, None, wbi_config['SPARQL_ENDPOINT_URL'], None, 10, 1)
    return responseJSON

It works and does not print anything to console.

This would be fine, but I do have some queries where the prefixes have to be declared. And when you're running 10,000+ queries, having a warning print to console 10,000+ times really clutters things.

LeMyst commented 1 month ago

I'm not sure what in WikibaseIntegrator could cause this behavior. You can try again with more logging by adding these lines:

import logging
logging.basicConfig(level=logging.DEBUG)

I don't think this will help at all, but, for your information, execute_sparql_query accepts a "prefix" parameter where you can add your prefixes (but in the end, it only concatenates the prefix and your query).
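To illustrate the "prefix" parameter mentioned above, here is a minimal sketch that keeps the prefix declarations and the query body as two separate strings. The helper names build_prefix_block and build_subject_query are hypothetical, and the keyword names in the commented-out call are an assumption based on this thread rather than a checked signature.

```python
from typing import Mapping


def build_prefix_block(namespaces: Mapping[str, str]) -> str:
    """Join prefix-name -> namespace-URI pairs into SPARQL PREFIX declarations."""
    return " ".join(f"PREFIX {name}: <{uri}>" for name, uri in namespaces.items())


def build_subject_query(subject_uri: str) -> str:
    """Build the 'all predicates and objects of one subject' query from the thread."""
    return f"SELECT ?p ?o WHERE {{ <{subject_uri}> ?p ?o }}"


prefixes = build_prefix_block({
    "wd": "https://wikiname.wikibase.cloud/entity/",
    "wdt": "https://wikiname.wikibase.cloud/prop/direct/",
})
query = build_subject_query("https://wikiname.wikibase.cloud/entity/Q4087")

# Assumed usage (keyword names not verified here; endpoint_url is a placeholder):
# wbi_helpers.execute_sparql_query(query=query, prefix=prefixes, endpoint=endpoint_url)

# Per the comment above, the library only concatenates the two parts, so the
# text sent to the endpoint should be equivalent to this:
full_query = prefixes + " " + query
```

Since the result is the same concatenated string, this is mostly a readability choice and is not expected to change whether the warning appears.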

dpriskorn commented 1 month ago

"does not look like a valid URI" did not return any results in the repo. This is propagated from somewhere else... Here is the culprit: https://github.com/RDFLib/rdflib/blob/ccb9c4a56e6bfcf1474480552e62a21461b85239/rdflib/term.py#L276 But rdflib is not used in WBI so I suggest closing this as invalid because it is not WBI related.
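For option (b) from the original question, one possible way to hide the message is a logging filter. This is a sketch, assuming the warning is emitted through Python's standard logging module by a logger named "rdflib.term" (the linked line lives in rdflib/term.py); the class name DropInvalidUriWarning is hypothetical.

```python
import logging


class DropInvalidUriWarning(logging.Filter):
    """Drop only the 'does not look like a valid URI' records, keep everything else."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Returning False tells the logging machinery to discard the record.
        return "does not look like a valid URI" not in record.getMessage()


uri_warning_filter = DropInvalidUriWarning()

# Assumption: "rdflib.term" is the logger that emits the warning.
logging.getLogger("rdflib.term").addFilter(uri_warning_filter)
```

A blunter alternative would be logging.getLogger("rdflib.term").setLevel(logging.ERROR), which hides every warning from that logger. Either way, this only hides the message; it does not explain why a whole query string is being treated as a URI somewhere in the calling code.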