cygri / void

An RDF schema and associated documentation for expressing metadata about RDF datasets
http://www.w3.org/TR/void/

Provenance of links #109

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Hi, 

When doing research in the field of data interlinking, we often use linked data 
sets from the LOD cloud. In my opinion, it would be very useful to have some 
information on the provenance of links within voiD descriptions of data sets 
(i.e. related to void:Linkset). For example:

* How were the links generated: automatically, semi-automatically, or manually?
* If the process was (semi-)automatic, which interlinking tool was used? 
Silk, MeLinDa, LIMES, LogMap ...
* If the process was manual, which agents were involved in defining the 
links?

I need manually generated or manually reviewed links for an evaluation, and I 
could find hardly any of this information on the DataHub or in voiD files. 

Kind regards, 
Cristina

Original issue reported on code.google.com by csarasua...@gmail.com on 2 Sep 2013 at 11:36

GoogleCodeExporter commented 9 years ago
This is related to Issue #3 (provenance information). I'd say that existing 
provenance vocabularies, including W3C's PROV-O, are quite suitable and 
probably sufficient for expressing this kind of information. 

The only thing not obviously covered there is the distinction between 
manual/semi-automatic/fully-automatic processes. For this, a specialisation of 
PROV-O focused specifically on expressing the provenance of RDF links could be 
a good idea.
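
To make this concrete, here is a minimal sketch of what such a description could look like, written as a small Python helper that emits Turtle. The `void:` and `prov:` terms are taken from the existing vocabularies. Everything in the `ex:` namespace is a placeholder, and in particular `ex:generationMode` is a hypothetical property standing in for the missing manual/semi-automatic/automatic distinction; no such term exists in VoID or PROV-O.

```python
# Sketch only: builds a Turtle description of a linkset and its provenance.
# void:Linkset, void:linkPredicate, prov:wasGeneratedBy, prov:wasAssociatedWith,
# prov:Activity, prov:SoftwareAgent and prov:Person are existing terms.
# ex:generationMode is a HYPOTHETICAL property invented for this illustration.

PREFIXES = """@prefix void: <http://rdfs.org/ns/void#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix ex:   <http://example.org/> .
"""

MODES = {"manual", "semi-automatic", "automatic"}


def linkset_provenance_turtle(linkset, mode, tool=None, person=None):
    """Return Turtle describing how the linkset ex:<linkset> was produced.

    linkset, tool and person are local names in the placeholder ex: namespace.
    """
    if mode not in MODES:
        raise ValueError(f"mode must be one of {sorted(MODES)}")

    activity = f"{linkset}-interlinking"
    lines = [
        PREFIXES,
        f"ex:{linkset} a void:Linkset, prov:Entity ;",
        "    void:linkPredicate owl:sameAs ;",
        f"    prov:wasGeneratedBy ex:{activity} .",
        "",
        f"ex:{activity} a prov:Activity ;",
    ]

    # Properties of the interlinking activity: the (hypothetical) mode,
    # plus the tool and/or person associated with it.
    props = [f'    ex:generationMode "{mode}"']
    if tool:
        props.append(f"    prov:wasAssociatedWith ex:{tool}")
    if person:
        props.append(f"    prov:wasAssociatedWith ex:{person}")
    lines.append(" ;\n".join(props) + " .")

    # Type the agents so consumers can tell software from people.
    if tool:
        lines += ["", f"ex:{tool} a prov:SoftwareAgent ."]
    if person:
        lines += ["", f"ex:{person} a prov:Person ."]
    return "\n".join(lines)


if __name__ == "__main__":
    print(linkset_provenance_turtle(
        "dbpedia-geonames", "semi-automatic", tool="Silk", person="reviewer1"))
```

With a description like this, someone looking for manually generated or manually reviewed links could filter linksets by the mode value or by the presence of a `prov:Person` associated with the generating activity.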

Whether publishers will include such information in their VoID files is of 
course a separate question. A first step here would be to encourage the 
developers of link generation tools to generate provenance information along 
with the links, in a consistent format.

Original comment by richard....@gmail.com on 2 Sep 2013 at 12:06