dbpedia / databus-maven-plugin

Databus Maven Plugin: Aligning Data and Software Lifecycle with Maven
GNU Affero General Public License v3.0

check wikidata artifact/dataset naming #21

Open kurzum opened 6 years ago

kurzum commented 6 years ago

The Wikidata datasets are now described via pom.xml: https://github.com/dbpedia/databus-maven-plugin/tree/master/dbpedia/wikidata

There are the following todos:

kurzum commented 6 years ago

Renamed wikidata-wikidata to wikidata-debug and copied the explanations from Ali into the pom.xml.

kurzum commented 5 years ago

Open questions:

http://downloads.dbpedia.org/temporary/wikidata-structure.txt Note that I am renaming the folders, i.e. removing the wikidata- prefix from the folder names.
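To make the renaming concrete, here is a minimal Python sketch of stripping the leading wikidata- prefix from a folder name. The helper name `strip_prefix` is hypothetical and this is not part of the plugin; it only illustrates the rule, and it removes the prefix once, so a doubled prefix keeps its second occurrence.

```python
def strip_prefix(name: str, prefix: str = "wikidata-") -> str:
    """Return name without a single leading prefix; leave other names unchanged."""
    if name.startswith(prefix):
        return name[len(prefix):]
    return name


# Folder names taken from this thread.
folders = ["wikidata-redirects", "wikidata-transitive-redirects", "page-id"]
renamed = [strip_prefix(f) for f in folders]
print(renamed)
```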

jimkont commented 5 years ago

What is the difference between wikidata-transitive-redirects and wikidata-redirects?

These come from the default extraction that is performed for all languages. Like ordinary Wikipedia redirects, Wikidata also has redirects between Q* items. wikidata-redirects contains the explicit redirects, while wikidata-transitive-redirects contains the computed redirect closure.
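As an illustration of the difference, here is a small Python sketch that derives a transitive mapping from explicit redirects. It is not the extraction framework's implementation; the function name `resolve_redirects` and the sample Q identifiers are made up for the example.

```python
def resolve_redirects(redirects: dict[str, str]) -> dict[str, str]:
    """Map every redirecting item to the final target of its redirect chain."""
    resolved = {}
    for source in redirects:
        seen = {source}
        target = redirects[source]
        # Follow the chain until we reach an item that does not redirect
        # further; the `seen` set stops the walk if the chain loops.
        while target in redirects and target not in seen:
            seen.add(target)
            target = redirects[target]
        resolved[source] = target
    return resolved


# Explicit redirects (hypothetical data): Q1 -> Q2 -> Q3, and Q4 -> Q3.
explicit = {"Q1": "Q2", "Q2": "Q3", "Q4": "Q3"}
transitive = resolve_redirects(explicit)
print(transitive)
```

In the explicit mapping Q1 points at Q2, which is itself a redirect; in the transitive mapping Q1 points directly at Q3.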

should we publish the .obj file? Is there anybody who would do something with the published file?

This also comes from the default extraction of Wikimedia dumps in all languages and can be ignored.

page-id page-length

This is also a default extractor that is available for all languages; it computes the entity length as a number of characters. If you do not provide these datasets for the other languages, I suspect they can be ignored for Wikidata as well.

wikidata-wikidata-duplicate-iri-split

This is used to represent Wikidata more accurately as RDF reification. I see that we are using the simple view export in most cases, so it could potentially be ignored as well.

wikidata-wikidata-rmapping-errors

This dataset contains invalid triples that result from mapping errors (e.g. a property expected an IRI but got a literal). They can be used to improve the DBpedia-to-Wikidata mappings, but would cause schema errors if this dataset was loaded into an endpoint.
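A rough Python sketch of the kind of check described here, separating triples whose object is a literal where the property expects an IRI. The `Triple` type, the `IRI_VALUED` set and the sample data are assumptions for illustration, not the extractor's actual logic.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str
    obj_is_iri: bool  # True if the object is an IRI, False if it is a literal


# Hypothetical schema information: predicates whose objects must be IRIs.
IRI_VALUED = {"http://dbpedia.org/ontology/birthPlace"}


def split_valid_invalid(triples, iri_valued=IRI_VALUED):
    """Return (valid, errors); errors hold literals where an IRI is expected."""
    valid, errors = [], []
    for t in triples:
        if t.predicate in iri_valued and not t.obj_is_iri:
            errors.append(t)
        else:
            valid.append(t)
    return valid, errors


sample = [
    Triple("http://wikidata.dbpedia.org/resource/Q1",
           "http://dbpedia.org/ontology/birthPlace",
           "http://wikidata.dbpedia.org/resource/Q2", True),
    Triple("http://wikidata.dbpedia.org/resource/Q3",
           "http://dbpedia.org/ontology/birthPlace",
           "Leipzig", False),
]
valid, errors = split_valid_invalid(sample)
print(len(valid), len(errors))
```

Keeping the rejected triples in a separate list mirrors the idea of publishing them as their own dataset rather than discarding them.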

wikidata-wikidata-type-like-statements

Not sure actually, @alismayilov do you remember?