weso / hercules-sync

Tools to synchronise data between the ontology files and Wikibase instance for the Hercules project at University of Murcia.
GNU General Public License v3.0

Parsing of property datatype #15

Closed alejgh closed 4 years ago

alejgh commented 4 years ago

When we create a new property in Wikibase, we need to provide its datatype. The following datatypes are allowed:

More information about the datatypes can be found here.

We need to implement a system that infers the datatype of each property from its range in the ontology file. If the property points to URIs, its type will be 'wikibase-item' if the URI is an item, or 'wikibase-property' if the URI is a property (we need to implement #14 to infer this). If, on the other hand, the object of a triple is a literal, we need to determine which specific kind of literal it is.
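A minimal sketch of that decision, not the actual hercules-sync code. The function name `infer_datatype`, the `property_uris` argument, and the XSD-to-Wikibase mapping are all assumptions made for illustration; only 'wikibase-item' and 'wikibase-property' come from the description above.

```python
from typing import Set

XSD = "http://www.w3.org/2001/XMLSchema#"

# Assumed, illustrative mapping from literal datatypes to Wikibase datatypes.
# The real mapping would have to be agreed on for the project.
LITERAL_DATATYPES = {
    XSD + "string": "string",
    XSD + "integer": "quantity",
    XSD + "decimal": "quantity",
    XSD + "dateTime": "time",
}


def infer_datatype(target_uri: str, property_uris: Set[str]) -> str:
    """Return a Wikibase datatype for the URI a property points to.

    target_uri    -- the range (or object) URI found in the ontology
    property_uris -- URIs already known to be properties (hypothetical
                     input; in practice this would come from #14)
    """
    if target_uri in LITERAL_DATATYPES:
        # The property points to a literal: use the literal mapping.
        return LITERAL_DATATYPES[target_uri]
    if target_uri in property_uris:
        # The property points to another property.
        return "wikibase-property"
    # Any other URI is treated as an item.
    return "wikibase-item"
```

For example, `infer_datatype(XSD + "dateTime", set())` falls into the literal branch, while a URI such as `"http://example.org/Person"` that is not in `property_uris` is treated as an item.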

Problems to solve

In some cases, determining the type of a property from a single triple can be hard or even impossible. Let's illustrate this with an example.

Suppose that we have the following triple:

ex:myProperty rdfs:domain ex:Person .

We can infer that ex:myProperty is a property, since it is the subject of the rdfs:domain predicate. We also know that any subject with this property belongs to the class ex:Person. However, we don't know the datatype of this property yet, so we can't create it in Wikibase.
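One way to detect this situation, sketched under the assumption that triples are available as plain `(subject, predicate, object)` tuples of strings. The helper name `properties_without_range` is hypothetical and is not part of hercules-sync.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def properties_without_range(triples):
    """Return properties that have an rdfs:domain but no rdfs:range.

    Their datatype cannot be decided yet, so creating them in Wikibase
    would have to wait until more triples are seen.
    """
    with_domain = {s for s, p, _ in triples if p == RDFS + "domain"}
    with_range = {s for s, p, _ in triples if p == RDFS + "range"}
    return with_domain - with_range


# The triple from the example above, with prefixed names kept as plain strings.
triples = [("ex:myProperty", RDFS + "domain", "ex:Person")]
pending = properties_without_range(triples)
```

Here `pending` contains `ex:myProperty`, because nothing in the input states its range.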

alejgh commented 4 years ago

Interesting resource to change datatype of a property after it has been created: https://github.com/wikimedia/mediawiki-extensions-Wikibase/blob/master/repo/maintenance/changePropertyDataType.php

alejgh commented 4 years ago

I think #14 should be implemented before this. Although this can be implemented without a reasoner, it would be cleaner to implement it afterwards, and we would need to make fewer modifications.

alejgh commented 4 years ago

Pull request #34 will close this issue when merged.

Support for the following types has been implemented so far:

Support for math, external-id, and url has been postponed, since they cause some problems (mostly related to how to represent them in the ontology) that would delay the overall development of the synchronisation system. An issue will be opened for these datatypes, describing the problems that have appeared and some possible solutions. I will have to talk with @spitxa about this.