cygri / void

An RDF schema and associated documentation for expressing metadata about RDF datasets
http://www.w3.org/TR/void/

subset vs. example resource #80

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Consider an RDF transformation of http://ckan.net/package/dbpedia

There are fields of type example/rdf+xml that link to 
http://dbpedia.org/data/DBpedia.rdf

The target is an extract from DBpedia that happens to contain
statements about the resource DBpedia itself.

One would be tempted to put,

:DBPedia a void:Dataset ;
        void:exampleResource <http://dbpedia.org/data/DBpedia.rdf>.

Except that this .rdf URI really is not a resource in dbpedia at all
but a subset of it. So perhaps more correct would be,

:DBPedia a void:Dataset ;
        void:subset <http://dbpedia.org/data/DBpedia.rdf>.

<http://dbpedia.org/data/DBpedia.rdf> a void:Dataset ;
        void:exampleResource <http://dbpedia.org/resource/DBpedia>.

I'm not sure how much of this is about the LOD conventions for CKAN
use and how much this kind of distinction wants clarification in voiD
-- the examples in the guide (2 draft) are quite clear...

Original issue reported on code.google.com by wwai...@gmail.com on 31 Oct 2010 at 12:14

GoogleCodeExporter commented 9 years ago
And does this mean we have an implicit rule that says,

{ ?super void:subset ?sub . ?sub void:exampleResource ?ex } =>
{ ?super void:exampleResource ?ex}

?
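
To make the question concrete, here is a minimal Python sketch of what such a rule would produce if it were applied by forward chaining over a small in-memory set of triples. The tuple representation and the helper name `propagate_example_resources` are assumptions made for illustration only; nothing in voiD says this rule actually holds.

```python
# Triples are plain (subject, predicate, object) tuples of strings.
VOID = "http://rdfs.org/ns/void#"
SUBSET = VOID + "subset"
EXAMPLE = VOID + "exampleResource"


def propagate_example_resources(triples):
    """Hypothetical helper: apply the candidate rule
    { ?super void:subset ?sub . ?sub void:exampleResource ?ex }
      => { ?super void:exampleResource ?ex }
    repeatedly until no new triples are produced."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot the relevant triples so the set can be extended safely.
        subsets = [(s, o) for (s, p, o) in inferred if p == SUBSET]
        examples = [(s, o) for (s, p, o) in inferred if p == EXAMPLE]
        for sup, sub in subsets:
            for dataset, ex in examples:
                if dataset == sub:
                    new_triple = (sup, EXAMPLE, ex)
                    if new_triple not in inferred:
                        inferred.add(new_triple)
                        changed = True
    return inferred


# The two datasets from the original report, written as tuples.
triples = {
    (":DBPedia", SUBSET, "http://dbpedia.org/data/DBpedia.rdf"),
    ("http://dbpedia.org/data/DBpedia.rdf", EXAMPLE,
     "http://dbpedia.org/resource/DBpedia"),
}
result = propagate_example_resources(triples)
```

Under this reading, `:DBPedia` would inherit `</resource/DBpedia>` as an example resource from its subset, which is the consequence being asked about.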

Original comment by wwai...@gmail.com on 31 Oct 2010 at 12:27

GoogleCodeExporter commented 9 years ago
Who are we to say whether <http://dbpedia.org/data/DBpedia.rdf> is a resource 
in that dataset or not? Maybe the publishers see their dataset as containing 
two kinds of resources, a) real-world entities and b) documents describing 
these entities. That would be perfectly valid, and then both 
</resource/DBpedia> and </data/DBpedia.rdf> would be good example resources.

In general, even though I think that </resource/DBpedia> is a much better 
exampleResource, I think it is better not to be fussy about the range of the 
property.

That being said, if they have a void:uriRegexPattern, then the example 
resources should match that pattern. (And that might be worth stating 
explicitly in the spec.)
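
As a rough sketch of such a consistency check, the following Python snippet tests an example resource against a pattern. The pattern shown is an assumed value chosen for illustration, not one taken from any published DBpedia voiD description, and `matches_uri_pattern` is a hypothetical helper name.

```python
import re


def matches_uri_pattern(example_uri, uri_regex_pattern):
    """Hypothetical helper: True if the URI matches the dataset's
    void:uriRegexPattern value (treated here as a Python regex)."""
    return re.search(uri_regex_pattern, example_uri) is not None


# Assumed pattern, for illustration only.
pattern = r"^http://dbpedia\.org/resource/"

entity_ok = matches_uri_pattern("http://dbpedia.org/resource/DBpedia", pattern)
document_ok = matches_uri_pattern("http://dbpedia.org/data/DBpedia.rdf", pattern)
```

With a pattern like this one, the entity URI would pass and the document URI would not, so the check would flag the `.rdf` URI as an inconsistent example resource.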

Considering each of the 10 million documents in DBpedia a void:subset of 
DBpedia might be technically correct, but is counter-productive and it 
shouldn't be modelled that way. void:subset is for “interesting” subsets 
that someone, for example, might pick for loading into their store. It would be 
good to formalize that a bit better in the spec, but how?

Original comment by richard....@gmail.com on 31 Oct 2010 at 1:32

GoogleCodeExporter commented 9 years ago
cygri wrote:
  > who are we to say ...

Dereferencing the .rdf URI gives no triples with that URI as subject or object.
It is quite possible that the URI doesn't appear anywhere in the DBpedia 
dataset.
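
The kind of check meant here can be sketched in Python as follows. The sample triples are invented placeholders, not the actual content of the document, and `mentions` is a hypothetical helper; a real check would parse the dereferenced RDF first.

```python
def mentions(triples, uri):
    """Hypothetical helper: True if `uri` occurs as the subject or
    object of any (subject, predicate, object) tuple."""
    return any(s == uri or o == uri for (s, _p, o) in triples)


doc_uri = "http://dbpedia.org/data/DBpedia.rdf"
entity_uri = "http://dbpedia.org/resource/DBpedia"

# Invented placeholder data standing in for the parsed document.
sample = [
    (entity_uri, "http://www.w3.org/2000/01/rdf-schema#label", "DBpedia"),
]

doc_mentioned = mentions(sample, doc_uri)
entity_mentioned = mentions(sample, entity_uri)
```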

I agree that </resource/DBpedia> is a better exampleResource. This might
really be more of a question of the CKAN tag usage -- what does example/rdf+xml
mean? It seems to conflate the encoding for transmission and the data itself...

"interesting" is a slippery concept... much akin to "relevant"... Copious 
cognitive science literature is dedicated to how hard these ideas are to
formalise...

Original comment by wwai...@gmail.com on 31 Oct 2010 at 2:11

GoogleCodeExporter commented 9 years ago
I have added the following text to 1.8:

“Note: Datasets that are published as linked data with resolvable URIs often 
have two distinct URIs for an entity and for the RDF document describing the 
entity [COOL]. True entity URIs should be preferred as void:exampleResources.”

@wwaites, does this address your original issue?

Original comment by richard....@gmail.com on 2 Nov 2010 at 9:12

GoogleCodeExporter commented 9 years ago
I'm closing this issue, as I think the spec is sufficiently clear about subset 
vs. example resource (although there is always still room for improvement). 
@wwaites, please comment and/or re-open if you'd like to propose concrete 
changes to the spec text.

Original comment by richard....@gmail.com on 22 Nov 2010 at 10:45