Open Daniel-Mietchen opened 2 years ago
I just created a simplified version of one of our queries - country_authors.sparql
SELECT
  ?author
  (COUNT(DISTINCT ?citing_work) AS ?number_of_citing_works)
  (SAMPLE(?organization_) AS ?organization)
  (SAMPLE(?work) AS ?example_work)
WHERE {
  ?author wdt:P27 | wdt:P1416/wdt:P17 | wdt:P108/wdt:P17 wd:Q35 .
  ?work wdt:P50 ?author .
  OPTIONAL { ?citing_work wdt:P2860 ?work . }
  OPTIONAL {
    ?author wdt:P1416 | wdt:P108 ?organization_ .
    ?organization_ wdt:P17 wd:Q35 .
  }
}
GROUP BY ?author
It times out on Wikidata, fails on QLever and executes on that Virtuoso instance.
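One portability detail worth ruling out when moving a query like this between endpoints: the Wikidata Query Service predefines the `wd:` and `wdt:` prefixes, while other endpoints may expect explicit `PREFIX` declarations. The sketch below prepends the missing ones. It is not a diagnosis of the QLever failure, and the helper name `add_missing_prefixes` is made up for illustration.

```python
import re

# Standard Wikidata namespace IRIs for the two prefixes the query above uses.
WIKIDATA_PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",
    "wdt": "http://www.wikidata.org/prop/direct/",
}


def add_missing_prefixes(query, prefixes=WIKIDATA_PREFIXES):
    """Prepend PREFIX lines for prefixes that are used but not declared.

    Hypothetical helper: it relies on simple regular expressions, not on a
    real SPARQL parser, so treat it as a sketch only.
    """
    # Prefix names that already have a PREFIX declaration in the query.
    declared = set(
        re.findall(
            r"^\s*PREFIX\s+([A-Za-z][\w-]*)\s*:",
            query,
            flags=re.IGNORECASE | re.MULTILINE,
        )
    )
    header = [
        f"PREFIX {name}: <{iri}>"
        for name, iri in prefixes.items()
        if name not in declared and re.search(rf"\b{re.escape(name)}:", query)
    ]
    return "\n".join(header + [query]) if header else query
```

Calling `add_missing_prefixes(query)` on the text of country_authors.sparql would return the same query with two `PREFIX` lines in front, which is harmless on endpoints that already know these prefixes.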
The query runs successfully on some of our endpoints:
date;sparqlquery -qn authorsCitingWork -en blazegraph -f github;date
Virtuoso-on-AWS: https://wikidata.demo.openlinksw.com/sparql
(Does not support the Wikidata blazegraph functions)
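The `date; ...; date` timing above can also be sketched in plain Python, sending one query to several endpoints and recording whether each one answers and how long it takes. This is a rough sketch using only the standard library, not how `sparqlquery` works internally. The Wikidata and QLever API URLs are assumptions (only the Virtuoso URL appears in this thread), and the function names are made up for illustration.

```python
import json
import time
import urllib.parse
import urllib.request

ENDPOINTS = {
    "wikidata": "https://query.wikidata.org/sparql",  # assumed public endpoint
    "virtuoso": "https://wikidata.demo.openlinksw.com/sparql",
    "qlever": "https://qlever.cs.uni-freiburg.de/api/wikidata",  # assumed API URL
}


def run_query(endpoint_url, query, timeout=60):
    """Send a SPARQL query as a form-encoded POST and parse the JSON results."""
    data = urllib.parse.urlencode({"query": query}).encode("utf-8")
    request = urllib.request.Request(
        endpoint_url,
        data=data,
        headers={
            "Accept": "application/sparql-results+json",
            "User-Agent": "scholia-endpoint-check/0.1 (example)",
        },
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def compare_endpoints(query, endpoints=ENDPOINTS, runner=run_query):
    """Run the query on every endpoint; record success, row count and time."""
    report = {}
    for name, url in endpoints.items():
        start = time.perf_counter()
        try:
            results = runner(url, query)
            rows = len(results["results"]["bindings"])
            report[name] = {"ok": True, "rows": rows, "error": None}
        except Exception as exc:  # timeouts, HTTP errors, malformed responses
            message = f"{type(exc).__name__}: {exc}"
            report[name] = {"ok": False, "rows": None, "error": message}
        report[name]["seconds"] = time.perf_counter() - start
    return report
```

The `runner` parameter exists so the comparison logic can be exercised without network access, for example by passing a stub that returns canned results.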
https://ceur-ws.org/Vol-3262/paper9.pdf and https://wiki.bitplan.com/index.php/Get_your_own_copy_of_WikiData each have a list of candidates. I also intend to talk to the Wikidata team at the next meeting. I would love to have a proper Blazegraph mirror running on our RWTH Aachen i5 machine at http://wikidata.dbis.rwth-aachen.de/, which should be suitable for the task with 256 GB RAM and 10 TB SSD. In the 6 years that I have been trying to get my own copy of Wikidata running, I have never managed to set up a proper Blazegraph mirror endpoint with all the necessary special services.
Oh, you're in Aachen?
Is your feature request related to a problem? Please describe.
Describe the solution you'd like
I'd like us to explore running Scholia on other SPARQL endpoints, Blazegraph or otherwise. We have done some of this in the past, but not in a way that would scale across all Scholia queries.
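A possible first step towards covering all Scholia queries would be to flag those that rely on Blazegraph- or Wikidata-Query-Service-specific constructs, since these are the ones most likely to need changes on other engines. The sketch below looks for `SERVICE wikibase:...` calls and a few engine-specific prefixes. The prefix list is an assumption and certainly incomplete, and the function name is made up for illustration.

```python
import re

# Prefixes tied to Blazegraph or the Wikidata Query Service
# (assumed, non-exhaustive list).
ENGINE_SPECIFIC_PREFIXES = ("bd", "hint", "gas", "mwapi")


def find_engine_specific_features(query):
    """Return a sorted list of engine-specific constructs found in a query.

    Hypothetical helper based on pattern matching, not on a SPARQL parser.
    """
    found = set()
    # Service calls such as SERVICE wikibase:label or SERVICE wikibase:mwapi.
    service_pattern = r"SERVICE\s+(?:SILENT\s+)?(wikibase:\w+)"
    for match in re.finditer(service_pattern, query, flags=re.IGNORECASE):
        found.add("SERVICE " + match.group(1))
    # Prefixed names such as bd:serviceParam or hint:Query.
    for prefix in ENGINE_SPECIFIC_PREFIXES:
        if re.search(rf"(?<![\w:/]){prefix}:\w", query):
            found.add(prefix + ":")
    return sorted(found)
```

Run over every `.sparql` file in the repository, this would give a rough split between queries that should be portable as they are and queries that need an engine-specific variant.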
Describe alternatives you've considered
A relatively straightforward approach might be to build a workflow based on running Scholia via the SPARQL endpoint (default: Blazegraph again) of a dedicated Wikibase instance that holds a copy of a recent Wikidata dump. There could even be several such Wikibases, each serving a specific subset (e.g. per Scholia aspect).
Additional context
Other options would be to start exploring non-Blazegraph endpoints, e.g. https://wikidata.demo.openlinksw.com/sparql (running on Virtuoso) or https://qlever.cs.uni-freiburg.de/wikidata/ (running on QLever).
1721