cygri / void

An RDF schema and associated documentation for expressing metadata about RDF datasets
http://www.w3.org/TR/void/

Comments re voiD guide from Stuart Williams #31

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
From: "Williams, Stuart (HP Labs, Bristol)"
Date: Mon, 23 Feb 2009 16:22:28 +0000

Subject: voID

Hello Richard, Michael,

Just had a rapid read through the voID Guide
(http://rdfs.org/ns/void-guide) and thought that I'd offer some comments
for whatever they may be worth...

I think that it would be useful to comment on the difference between the
SPARQL conceptualisation of dataset (a default graph and collection of
named graphs) and the voID conceptualisation of a dataset (which I think is
a single graph - though 

Section 1:
Conjunctive use of dcterms:subject... whilst I think I understand the
pragmatic appeal, given the example I think that the intersection of
Computer Science (some conceptual domain of study/investigation), a Journal
(a form of publication), and Proceedings (a different form of publication,
usually arising from a workshop or conference and, IIUC, disjoint with
Journals) is empty. Yes, I know that's very anal (and maybe big 'O'ist) of
me. I think that you have several dimensions squeezed into one - here
computer science truly is a subject domain, but journal and proceedings
really are more modes or categories of publication than subject
domains. I certainly think that the range of dcterms:subject should be
something like skos:Concept (not looked at SKOS of late). But I think that
composite subjects are hard.

Section 2:
The 'subset' property could do with being renamed 'hasSubset' or
'isSubSetOf' - I think that the sense of it is the former, but at least for
me the directionality does not stick in my memory for long.
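
To make the direction concrete, here is a minimal Turtle sketch, assuming void:subset is read in the 'hasSubset' sense (the containing dataset is the subject). The `:` prefix and the names :MyDataset and :MyDatasetPart are hypothetical and used only for illustration.

```turtle
@prefix void: <http://rdfs.org/ns/void#> .
@prefix :     <http://example.org/void.ttl#> .

# Hypothetical names; read as ":MyDataset has subset :MyDatasetPart".
:MyDataset a void:Dataset ;
    void:subset :MyDatasetPart .

:MyDatasetPart a void:Dataset .
```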

The diagram at the start of section 2 is actually a little confusing. It
looks like it presents two datasets :DS1 and :DS2 (each being a collection
of statements) and that each dataset contains a named subset, :LS1 and :LS2
respectively, of linksets - in the example expressing populations of links
using foaf:knows, rdfs:seeAlso and owl:sameAs properties. However, as the
later examples unfold, :LS1 and :LS2 are not 'subgraphs' of their
respective graphs, they are (optionally) named linkset resources that act
as statement subjects for some statements describing *a* particular
linkset. Indeed even for the example illustrated there are (or would be)
three defined linkset nodes (two in :DS1 and one in :DS2) and the regions
that are demarcated as :LS1 and :LS2 don't exist quite as presented (AFAICT).
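
A rough Turtle sketch of that reading follows: each linkset is a described resource, not a region of its dataset's graph. Several things here are assumptions. The message does not say which links belong to which dataset, so the assignment of predicates is a guess. The names :LS1a, :LS1b and :LS2a are hypothetical. The spelling void:Linkset and the void:target and void:linkPredicate properties come from the voiD vocabulary, not from this message.

```turtle
@prefix void: <http://rdfs.org/ns/void#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix :     <http://example.org/void.ttl#> .

# Two datasets; each points at the linkset resources it publishes.
:DS1 a void:Dataset ;
    void:subset :LS1a, :LS1b .

:DS2 a void:Dataset ;
    void:subset :LS2a .

# Three linkset nodes, one per link predicate (assignment is assumed).
:LS1a a void:Linkset ;
    void:target :DS1, :DS2 ;
    void:linkPredicate foaf:knows .

:LS1b a void:Linkset ;
    void:target :DS1, :DS2 ;
    void:linkPredicate rdfs:seeAlso .

:LS2a a void:Linkset ;
    void:target :DS2, :DS1 ;
    void:linkPredicate owl:sameAs .
```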

Section 3:
I don't quite understand how you could attribute a value to
void:numberOfDocuments. Taking a deliberately obtuse stance, a dataset
contains triples; numbers of distinct subjects and objects make sense,
but number of documents doesn't seem to me to be a dimension of such a
dataset.

In the voID ontology, void:LinkSet is defined to be a subclass of
void:Dataset which gives some syntactic convenience in the re-use of
statistical properties (and probably some others too) but I'm not convinced
that ontologically a :LinkSet is a subclass of a :Dataset - particularly in
the form given where an instance of :LinkSet really can only establish that
a single given property is used to link between a pair of :Dataset.

Section 5.1:
Hmmm... lots of scope for confusion.

    <document.rdf> dcterms:isPartOf <void.ttl#MyDataset> .

Kind of curious from the point of view of having previously established
sparql, uriLookup and dump endpoints why one would be remotely interested
in <document.rdf> as being a part of the dataset (unless separately it was
a dataset in its own right with its own set of endpoints etc).

Original issue reported on code.google.com by Michael.Hausenblas on 4 Mar 2009 at 12:13

GoogleCodeExporter commented 9 years ago
http://groups.google.co.uk/group/void-rdfs-internals/browse_thread/thread/086e1f455b2e3342#

Michael, how do we incorporate the changes resulting from these comments back
into the guide? Then we can close this issue.

Original comment by K.J.W.Al...@gmail.com on 16 Jun 2009 at 11:37

GoogleCodeExporter commented 9 years ago
I tend to agree with Keith's judgement in the link in comment #1. But we should
see each of Stuart's complaints as an opportunity to improve the clarity of the
text and explanations. He's a smart guy, and if he misunderstands, then so will
many others.

Original comment by richard....@gmail.com on 10 Aug 2009 at 5:37

GoogleCodeExporter commented 9 years ago
I'm creating separate issues for his comments on Sections 1 and 2.

His comment on Section 3 can be addressed by the following proposed change.

In Table 3 in Section 3.1, add the following text to the entry for 
void:numberOfDocuments:

"This dimension is intended for Linked Data deployments where the total number
of triples or resources is sometimes harder to determine than the number of
documents; but void:numberOfTriples or void:numberOfResources should be
preferred where practical."
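
As a sketch of how such a statistic might be recorded, the Turtle below assumes the SCOVO-based void:statItem pattern with scv:dimension and rdf:value. That pattern is an assumption about how the guide attaches statistics, and the dataset name and the figure 5000 are hypothetical.

```turtle
@prefix void: <http://rdfs.org/ns/void#> .
@prefix scv:  <http://purl.org/NET/scovo#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix :     <http://example.org/void.ttl#> .

# Hypothetical Linked Data deployment where only the document count is known.
:MyDataset a void:Dataset ;
    void:statItem [
        scv:dimension void:numberOfDocuments ;
        rdf:value 5000
    ] .
```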

His comment on Section 5.1 should be addressed as part of Issue 32 (about
void:isPartOf).

Original comment by richard....@gmail.com on 27 Aug 2009 at 8:51

GoogleCodeExporter commented 9 years ago
Applied changes to Table 3. His other comments are to be dealt with in
Issue 32, Issue 40, and Issue 41.

Original comment by richard....@gmail.com on 15 Sep 2009 at 6:22