InvasionBiologyHypotheses / Documentation

Documentation related to building an open, zoomable atlas for invasion science and beyond
Creative Commons Zero v1.0 Universal

Assess the risk of a potential Wikidata Blazegraph failure for enKORE, and explore mitigation options #17

Open Daniel-Mietchen opened 2 years ago

Daniel-Mietchen commented 2 years ago

There is a risk that the Blazegraph instance behind the Wikidata Query Service might hit some hard technical limits in the near future. As https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/Blazegraph_failure_playbook puts it:

"How much time we have before catastrophic failure is difficult to predict, but the probability of it occurring is very high within the next 5 years if no action is taken."

The proposed actions involve deleting the largest subgraphs, chief among which is the scholarly articles subgraph that we are using for enKORE.

In terms of mitigation, one approach would be to set up our own Wikibase instance (with its own Blazegraph instance or other triple store), which is part of the plan anyway, as per
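To make the mitigation idea above more concrete, here is a minimal sketch of how enKORE queries could fall back from the public Wikidata Query Service to a self-hosted endpoint. Only the public endpoint URL and the identifiers P31 (instance of) and Q13442814 (scholarly article) are real; `LOCAL_ENDPOINT`, the function names and the User-Agent string are hypothetical placeholders, not part of any existing enKORE code. It also assumes the fallback store mirrors Wikidata's URIs; a separate Wikibase with its own identifiers would need a rewritten query.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Public Wikidata Query Service SPARQL endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"
# Hypothetical self-hosted endpoint (placeholder, adjust to the actual setup).
LOCAL_ENDPOINT = "http://localhost:9999/sparql"

# Small sample query: a few items that are instances of (P31)
# scholarly article (Q13442814).
SAMPLE_QUERY = """
SELECT ?article WHERE {
  ?article wdt:P31 wd:Q13442814 .
} LIMIT 5
"""


def build_request_url(endpoint, query):
    """Return a GET URL that carries the SPARQL query as a URL-encoded parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


def run_with_fallback(query, endpoints=(WDQS_ENDPOINT, LOCAL_ENDPOINT), timeout=60):
    """Try each endpoint in order and return the first JSON result obtained."""
    last_error = None
    for endpoint in endpoints:
        request = urllib.request.Request(
            build_request_url(endpoint, query),
            headers={
                "Accept": "application/sparql-results+json",
                # Hypothetical User-Agent; replace with real contact details.
                "User-Agent": "enKORE-fallback-sketch/0.1",
            },
        )
        try:
            with urllib.request.urlopen(request, timeout=timeout) as response:
                return json.load(response)
        except OSError as error:  # covers URLError, HTTPError and socket timeouts
            last_error = error
    raise RuntimeError(f"all endpoints failed: {last_error}")
```

Calling `run_with_fallback(SAMPLE_QUERY)` would first try the public service and only contact the local endpoint if that request fails. Listing the local endpoint first would instead take load off Wikidata entirely.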

Daniel-Mietchen commented 2 years ago

Here is an update on the current state of the assessment of Blazegraph alternatives: https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS-scaling-update-mar-2022#Wikidata_Query_Service_scaling_update,_March_2022

Daniel-Mietchen commented 2 years ago

The above update links to https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_backend_alternatives (with more details) but does not link to a video recording of a dedicated session at the recent Data Reuse Days: https://www.youtube.com/watch?v=1nZxY4r5KQs

Daniel-Mietchen commented 2 years ago

Here is an experimental and currently non-public Wikidata Query Service with 10-minute timeouts: https://www.wikidata.org/wiki/Wikidata:Orb_Open_Graph

Daniel-Mietchen commented 2 years ago

See also this approach of creating a REST API wrapper for Wikidata (and DBpedia): https://crafts.gsic.uva.es/