UB-Mannheim / bbw

Entity linking, entity typing and relation extraction: Matching CSV to a Wikibase instance (e.g., Wikidata) via Meta-lookup
https://ub-mannheim.github.io/bbw/
MIT License

Use batch mode for Wikidata reconciliation #6

Open · wetneb opened this issue 3 years ago

wetneb commented 3 years ago

Thanks a lot for this!

I just wanted to note that the Wikidata reconciliation API that you query can be used in batch mode: you can supply multiple reconciliation queries in a single request. This should speed up the resolution of these queries. https://reconciliation-api.github.io/specs/latest/#sending-reconciliation-queries-to-a-service
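For illustration, here is a minimal Python sketch of such a batch request (the wikidata.reconci.link endpoint, the `q0`/`q1` key names and the use of `requests` are assumptions for the example, not what bbw currently does). The point is that one HTTP round trip resolves all labels at once instead of one request per cell:

```python
import json
import requests

# A public Wikidata reconciliation endpoint (assumed for this example).
ENDPOINT = "https://wikidata.reconci.link/en/api"

def reconcile_batch(labels, type_qid=None, limit=5):
    """Send one batch request containing a reconciliation query per label."""
    queries = {}
    for i, label in enumerate(labels):
        q = {"query": label, "limit": limit}
        if type_qid:
            q["type"] = type_qid
        queries[f"q{i}"] = q
    # Per the Reconciliation API spec, the batch goes in a single
    # form-encoded "queries" parameter holding a JSON object.
    resp = requests.post(ENDPOINT, data={"queries": json.dumps(queries)})
    resp.raise_for_status()
    results = resp.json()
    # The response maps each key back to a ranked list of candidates.
    return {key: results[key]["result"] for key in queries}

if __name__ == "__main__":
    # Q515 = city; purely an illustrative type constraint.
    matches = reconcile_batch(["Mannheim", "Heidelberg"], type_qid="Q515")
    for key, candidates in matches.items():
        best = candidates[0] if candidates else None
        print(key, best["id"] if best else None, best["name"] if best else None)
```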

shigapov commented 3 years ago

Thank you, Antonin @wetneb! We could also optimise our SPARQL requests to Wikidata using Blazegraph's query hints (https://github.com/blazegraph/database/wiki/QueryHints). But I do not know whether the Wikidata community would be fine with that...
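As a hedged sketch of what such a hint could look like against the Wikidata Query Service (the query itself and the choice of `hint:Query hint:optimizer "None"` are illustrative only, not what bbw currently sends):

```python
import requests

WDQS = "https://query.wikidata.org/sparql"

# The hint: prefix is pre-declared on WDQS; this particular hint turns off
# the Blazegraph join optimizer so the triple patterns run in written order.
QUERY = """
SELECT ?item ?itemLabel WHERE {
  hint:Query hint:optimizer "None" .
  ?item rdfs:label "Mannheim"@en .
  ?item wdt:P31/wdt:P279* wd:Q515 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 10
"""

resp = requests.get(WDQS, params={"query": QUERY, "format": "json"},
                    headers={"User-Agent": "bbw-example/0.1"})
resp.raise_for_status()
for row in resp.json()["results"]["bindings"]:
    print(row["item"]["value"], row["itemLabel"]["value"])
```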

wetneb commented 3 years ago

I don't think it would cause any harm to optimize queries that you are already running anyway, right?

thadguidry commented 3 years ago

@shigapov You might also think about optimizations and subqueries involving analysis of https://www.wikidata.org/wiki/Property:P1963 and the gaps or differences between objects and statements that do or do not include the defined "properties for this type", many of which are commonly referred to as disambiguating properties. Those are the closest thing to what we had in Freebase when it was operational: https://web.archive.org/web/20151002083332/http://wiki.freebase.com/wiki/Disambiguation But they are only one signal, and for any particular type those properties could certainly be derived with machine learning, not only hard-coded as they are with P1963 inside Wikidata. Food for thought :-)
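As a hedged illustration of that signal, a small Python sketch that pulls the P1963 properties of a class from the Wikidata Query Service (the helper name and the use of Q5 are assumptions for the example):

```python
import requests

WDQS = "https://query.wikidata.org/sparql"

def properties_for_type(type_qid):
    """Fetch the 'properties for this type' (P1963) statements of a class.

    The returned properties can be compared against a table's columns to
    score candidate types, or fed as features into a learned model.
    """
    query = f"""
    SELECT ?prop ?propLabel WHERE {{
      wd:{type_qid} wdt:P1963 ?prop .
      SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
    }}
    """
    resp = requests.get(WDQS, params={"query": query, "format": "json"},
                        headers={"User-Agent": "bbw-example/0.1"})
    resp.raise_for_status()
    return [(b["prop"]["value"], b["propLabel"]["value"])
            for b in resp.json()["results"]["bindings"]]

if __name__ == "__main__":
    # Q5 = human; lists properties such as date of birth or sex or gender.
    for uri, label in properties_for_type("Q5"):
        print(uri, label)
```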