Open kurzum opened 6 years ago
Renamed wikidata-wikidata to wikidata-debug and copied the explanations from Ali into the pom.xml.
Open questions:
http://downloads.dbpedia.org/temporary/wikidata-structure.txt Note that I am renaming the folders, i.e. removing the wikidata- prefix.
What is the difference between wikidata-transitive-redirects and wikidata-redirects?
These come from the default extraction that is performed for all languages. Like common Wikipedia redirects, Wikidata also has redirects for its Q* items.
wikidata-redirects has the explicit redirects, while wikidata-transitive-redirects computes the redirect closure.
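To illustrate the difference between explicit redirects and the redirect closure, here is a minimal sketch. It is not the extraction framework's code; the function name, the dict representation, and the Q ids are assumptions made up for the example.

```python
def transitive_redirects(redirects):
    """Map each redirecting item to its final target by following chains.

    `redirects` is a hypothetical dict of explicit redirects, e.g. {"Q1": "Q2"}.
    """
    resolved = {}
    for source in redirects:
        seen = {source}
        target = redirects[source]
        # Follow the chain until we reach an item that does not redirect further
        while target in redirects and target not in seen:
            seen.add(target)
            target = redirects[target]
        # A target we have already visited means the chain loops; skip it
        if target in seen:
            continue
        resolved[source] = target
    return resolved


# Explicit redirects (made-up ids): Q1 -> Q2 -> Q3, and Q4 -> Q3
explicit = {"Q1": "Q2", "Q2": "Q3", "Q4": "Q3"}
closure = transitive_redirects(explicit)
```

In the explicit data Q1 points to Q2, while in the closure it points directly to Q3, the end of the chain.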
should we publish the .obj file? Is there anybody who would do something with the published file?
This also comes from the default extraction of Wikimedia dumps in all languages and can be ignored.
page-id page-length
This is also a default extractor that is available for all languages and computes the entity length as a number of characters. If you do not provide these for the other languages, I suspect they can be ignored for Wikidata as well.
wikidata-wikidata-duplicate-iri-split
This is used for a more accurate representation of Wikidata as RDF reification. I see that we are using the simple view export in most cases, so it could potentially be ignored as well.
wikidata-wikidata-rmapping-errors
This dataset contains invalid triples from mapping errors (e.g. a property expected an IRI but got a literal). They can be used to improve the DBpedia-to-Wikidata mappings, but would result in schema errors if this were loaded into an endpoint.
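As a rough illustration of the kind of check that separates such triples, here is a sketch under assumptions: the triple layout, the `expected_kind` table, and the `ex:` property names are all hypothetical and not taken from the extraction framework.

```python
def split_mapping_errors(triples, expected_kind):
    """Separate triples whose object kind matches what the property expects.

    Each triple is (subject, property, (kind, value)) with kind "iri" or "literal".
    `expected_kind` is a hypothetical table: property -> expected object kind.
    """
    valid, errors = [], []
    for subject, prop, (kind, value) in triples:
        expected = expected_kind.get(prop)
        # Properties without a declared expectation are passed through unchanged
        if expected is None or expected == kind:
            valid.append((subject, prop, (kind, value)))
        else:
            errors.append((subject, prop, (kind, value)))
    return valid, errors


# Made-up example: ex:birthPlace expects an IRI, ex:name expects a literal
expected_kind = {"ex:birthPlace": "iri", "ex:name": "literal"}
triples = [
    ("Q1", "ex:birthPlace", ("iri", "http://example.org/Q2")),
    ("Q1", "ex:birthPlace", ("literal", "somewhere")),
    ("Q1", "ex:name", ("literal", "Example")),
]
valid, errors = split_mapping_errors(triples, expected_kind)
```

Only the `valid` list would be safe to load into an endpoint; the `errors` list corresponds to what a debug dataset like this one would hold.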
wikidata-wikidata-type-like-statements
Not sure actually, @alismayilov do you remember?
The Wikidata datasets are now described via pom.xml: https://github.com/dbpedia/databus-maven-plugin/tree/master/dbpedia/wikidata
There are the following todos:
<configuration><labels> and <datasetDescription>