LUMII-Syslab / viziquer

Tool for Search in Structured Semantic Data
http://viziquer.lumii.lv/
MIT License

Wikidata Q12 is buggy #11

Open VladimirAlexiev opened 1 year ago

VladimirAlexiev commented 1 year ago

https://viziquer.lumii.lv/examples/wikidata2022/SPARQL_to_ViziQuer_wikidata.pdf

# ID = 12,
# Question = Recent events
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?event ?date ?eventLabel WHERE{
 ?event wdt:P31/wdt:P279* wd:Q1190554.
 OPTIONAL{?event wdt:P585 ?date.}
 OPTIONAL{?event wdt:P580 ?date.}
 OPTIONAL{?event rdfs:label ?eventLabel. FILTER(LANG(?eventLabel) = 'en')}
 BIND(NOW()-?date AS ?distance)
 FILTER(BOUND(?date) && DATATYPE(?date) =xsd:dateTime)
 FILTER(0 <= ?distance && ?distance < 31) }
LIMIT 10

This query is buggy:

select ?event ?eventLabel ?date1 ?date2 {
   ?event wdt:P31 wd:Q1190554.
   ?event wdt:P585 ?date1,?date2.
   filter(?date1<?date2)
   SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
} limit 10

The bugs come from the original Wikidata query.

karlisc commented 1 year ago

Thanks for the notice! In fact, we did not check the validity of the original query. In this work we took the SPARQL queries as they were at the time we considered them and tried to see what we could do with generating the visual form. I might come up with more specific comments on this query in a couple of days. Meanwhile, if you think you have a better SPARQL query, we could try to visualize that.

karlisc commented 1 year ago

The following SPARQL query (perhaps corresponding better to the textual formulation)

PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?event ?eventLabel (MIN(?date) AS ?mindate) WHERE{
  ?event wdt:P31/wdt:P279* wd:Q1190554.
  ?event (wdt:P585|wdt:P580) ?date.
  OPTIONAL{?event rdfs:label ?eventLabel. FILTER(LANG(?eventLabel) = 'en')}
  BIND(NOW()-?date AS ?distance)
  FILTER(BOUND(?date) && DATATYPE(?date) = xsd:dateTime)
  FILTER(0 <= ?distance && ?distance < 31)
}
GROUP BY ?event ?eventLabel
LIMIT 10

can be visualized as follows: [image]

We are working to tune the path expressions of properties enriched with labels in the attribute position, to allow a presentation like the following one (not yet fully working): [image]
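As a rough illustration of what the GROUP BY ?event ?eventLabel with MIN(?date) in the query above does, here is a minimal Python sketch. The rows and the wd:Q_A / wd:Q_B identifiers are hypothetical sample data, not real Wikidata results, and the function name is made up for this example; it only mimics how several dates for one event collapse into a single row.

```python
from datetime import date

# Hypothetical (event, label, date) rows, as an ungrouped query might return
# them when one event has several P585/P580 values. Not real Wikidata data.
rows = [
    ("wd:Q_A", "Event A", date(2023, 5, 3)),
    ("wd:Q_A", "Event A", date(2023, 5, 1)),
    ("wd:Q_B", "Event B", date(2023, 5, 2)),
]


def min_date_per_event(rows):
    """Mimic GROUP BY ?event ?eventLabel with MIN(?date) AS ?mindate."""
    grouped = {}
    for event, label, d in rows:
        key = (event, label)
        # Keep only the earliest date seen so far for this group.
        if key not in grouped or d < grouped[key]:
            grouped[key] = d
    return [(event, label, d) for (event, label), d in grouped.items()]


result = min_date_per_event(rows)
for row in result:
    print(row)
```

With the sample rows, the two entries for the first event are reduced to one row carrying the earlier date, so each event appears only once.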

karlisc commented 1 year ago

Query visualization and visual query creation over various data endpoints, including Wikidata and DBpedia, are now available to anyone at https://viziquer.app (the query libraries can be preloaded, viewed and modified).

VladimirAlexiev commented 1 year ago

Thanks @karlisc !

karlisc commented 1 year ago

Thanks @VladimirAlexiev for the note. The query with wdt:P31/wdt:P279* times out for me as well. I am not sure we can be expected to do anything about that (probably not). This might be related to the more general understanding that the Blazegraph-based Wikidata endpoint is close to its technical limits. I could think of developing certain services for Wikidata-specific visual queries (e.g. including dedicated support for qualifiers); however, it would make much more sense if custom SPARQL were a reliable means of extracting information from Wikidata. Regarding this ticket, I think we would need to fully implement the other visual form (the one with the single orange box), however long that takes (due to different work priorities); then it could perhaps be closed.
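For anyone who wants to reproduce the timeout outside the visual tool, a minimal Python sketch along these lines might help. It assumes the public Wikidata SPARQL endpoint at https://query.wikidata.org/sparql; the helper names, the User-Agent string and the 60-second client-side timeout are illustrative choices, not anything ViziQuer itself uses.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the public Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"


def build_request(query):
    """Build a GET request that asks the endpoint for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return urllib.request.Request(
        f"{ENDPOINT}?{params}",
        # Hypothetical User-Agent; replace it with your own contact details.
        headers={"User-Agent": "example-script/0.1 (hypothetical contact)"},
    )


def run_query(query, timeout_s=60):
    """Return the result bindings, or None if the request fails or times out."""
    try:
        with urllib.request.urlopen(build_request(query), timeout=timeout_s) as resp:
            return json.load(resp)["results"]["bindings"]
    except OSError as exc:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        print(f"Query failed: {exc}")
        return None
```

Calling run_query with the query text from this thread sends it to the endpoint; a None result means the request failed or exceeded the client-side limit, which is separate from any limit the server enforces.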