Closed GoogleCodeExporter closed 9 years ago
Sounds like a good idea. +1 to address this; need to discuss if in MR2 or later
- this is related to Issue 63.
Original comment by Michael.Hausenblas
on 31 Oct 2010 at 10:03
what would the range and definition be? As a consumer of voiD data, what can I
expect to be able to do with the value of this property?
Original comment by K.J.W.Al...@gmail.com
on 31 Oct 2010 at 11:45
I think,
void:entrypoint rdfs:range void:Dataset.
and,
{ ?d void:entrypoint ?s } => { ?d void:subset ?s }.
so maybe,
void:entrypoint rdfs:subPropertyOf void:subset.
A consumer would expect the value of this property to dereference,
and additionally to be able to dereference all objects there, that
are in the dataset, so,
1. dereference the entrypoint/subset, put in store
2. find links in the entrypoint/subset ?e, with a query like,
SELECT ?o WHERE
{
?d void:subset ?e .
?d void:uriRegexp ?r .
GRAPH ?e { ?s ?p ?o }
FILTER (isIRI(?o) && regex(str(?o), ?r))
}
3. repeat until no new graphs are encountered
The difference from void:subset is that applying this algorithm
to void:entrypoint should guarantee that the crawl is complete.
Original comment by wwai...@gmail.com
on 31 Oct 2010 at 12:03
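The three steps in the comment above can be sketched in Python roughly as follows. This is only an illustration, not part of the proposal: the `crawl` function, the `fetch` callable (standing in for "dereference the URI and parse the RDF"), and the toy `DOCS` data are all assumptions made for the example. The pattern is applied with search semantics, as SPARQL's regex() does.

```python
import re
from typing import Callable, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object), all as plain strings


def crawl(entrypoints: Iterable[str], uri_pattern: str,
          fetch: Callable[[str], Iterable[Triple]]) -> Set[str]:
    """Return the set of URIs dereferenced while crawling from the entry points.

    `fetch` is a hypothetical helper that dereferences a URI and yields
    the triples of the retrieved graph.
    """
    in_dataset = re.compile(uri_pattern)
    seen: Set[str] = set()
    queue = list(entrypoints)
    while queue:                      # step 3: stop when nothing new turns up
        uri = queue.pop()
        if uri in seen:
            continue
        seen.add(uri)
        for _s, _p, o in fetch(uri):  # step 1: dereference and read the graph
            # step 2: keep only objects whose URI matches the dataset's pattern
            if in_dataset.search(o) and o not in seen:
                queue.append(o)
    return seen


# Toy stand-in for the Web: URI -> triples returned when it is dereferenced.
DOCS = {
    "http://example.org/a": [
        ("http://example.org/a", "ex:knows", "http://example.org/b"),
        ("http://example.org/a", "ex:seeAlso", "http://other.example/x"),
    ],
    "http://example.org/b": [
        ("http://example.org/b", "ex:knows", "http://example.org/a"),
        ("http://example.org/b", "ex:name", "Bob"),
    ],
}


def fetch_toy(uri: str) -> Iterable[Triple]:
    return DOCS.get(uri, [])


visited = crawl(["http://example.org/a"], r"^http://example\.org/", fetch_toy)
print(sorted(visited))
```

The external link and the literal are skipped because they do not match the pattern, so the crawl stays inside the dataset.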
The name and description of the property should make clear that the motivation
here is about crawling the entire dataset.
Perhaps call it void:entryResource? void:rootResource? void:topResource?
And something along these lines:
“Many datasets are structured in a tree-like fashion, with one or a few natural
“top concepts” or “entry points”, to which all other entities are
connected through a small number of steps. Using this property implies 1. that
the entry resource is a central entity of particular importance in the dataset;
and 2. that the entire dataset can be crawled by resolving the entry resource
and recursively following links to other URIs in the retrieved RDF response.”
I wouldn't relate it to void:subset, because that would imply that the object
is a void:Dataset, and would preclude the use of top entities in an entity
hierarchy, like say a skos:ConceptScheme or foaf:Organization. So I'd just
leave the range open.
But domain should be void:Dataset obviously, and perhaps make it a subproperty
of void:exampleResource?
Original comment by richard....@gmail.com
on 31 Oct 2010 at 1:12
cygri wrote:
> I wouldn't relate it to void:subset, because that would imply that the object is a void:Dataset
Isn't it the case that anything you get by dereferencing any URI in the dataset
(modulo matching uriRegexp) is a void:subset? I read void:subset as almost
equivalent to rdfg:subGraph...
Original comment by wwai...@gmail.com
on 31 Oct 2010 at 2:05
@wwaites: No. A foaf:Person has a dereferenceable URI, but definitely is not a
void:Dataset.
Also, quoting the definition of void:Dataset:
“A dataset is a set of RDF triples that are published, maintained or
aggregated by a single provider. Unlike RDF graphs, which are purely
mathematical constructs [RDF Concepts], the term dataset has a social
dimension: We think of a dataset as a meaningful collection of triples, that
deal with a certain topic, originate from a certain source or process, are
hosted on a certain server, or are aggregated by a certain custodian. Also,
typically a dataset is accessible on the Web, for example through resolvable
HTTP URIs or through a SPARQL endpoint, and it contains sufficiently many
triples that there is benefit in providing a concise summary.”
That should explain the difference between void:subset and rdfg:subGraph. I
think that Section 1.4 also motivates that distinction quite well.
If you have any issue (no pun intended) with that take on void:subset, please
create a separate issue for it.
Original comment by richard....@gmail.com
on 31 Oct 2010 at 2:34
After some more thinking, I want this feature. This could be useful in our
lodcloud work to define the notion of “bulk accessibility”: To be bulk
accessible, a dataset must have a dump or a crawl entry point that allows
complete crawling. (Cue discussion about whether a SPARQL endpoint enables bulk
access.)
So +1 for doing this, and doing it in Release 2.0.
Original comment by richard....@gmail.com
on 2 Nov 2010 at 9:40
Proposed text for a new section 1.11 is below, as well as proposed RDFS.
1.11 Root resources
Many datasets are structured in a tree-like fashion, with one or a few natural
“top concepts” or “entry points”, and all other entities reachable from
these root resources in a small number of steps.
One or more such root resources can be named using the void:rootResource
property. Naming a resource as a root resource implies 1. that it is a
central entity of particular importance in the dataset; and 2. that the entire
dataset can be crawled by resolving the root resource(s) and recursively
following links to other URIs in the retrieved RDF response.
Root resources make good entry points for crawling an RDF dataset.
This property is similar to void:exampleResource. While void:exampleResource
names particularly representative or typical resources in the dataset,
void:rootResource names particularly important or central resources that make
good entry points for navigating the dataset.
void:rootResource a rdf:Property;
rdfs:label "Root Resource";
rdfs:comment "A resource of particular importance in a dataset. All resources in a dataset can be reached by following links from its root resources in a small number of steps.";
rdfs:domain void:Dataset;
.
Original comment by richard....@gmail.com
on 24 Nov 2010 at 8:38
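One way to check the claim "all resources can be reached from the root resources in a small number of steps" against a concrete dataset is a breadth-first search over its link graph. The sketch below is an illustration only: the `steps_from_roots` function and the `LINKS` adjacency map (resource to the resources it links to) are hypothetical, and what counts as a "small" number of steps is left to the reader.

```python
from collections import deque
from typing import Dict, Iterable, List


def steps_from_roots(links: Dict[str, List[str]],
                     roots: Iterable[str]) -> Dict[str, int]:
    """Map each resource reachable from the roots to its minimum link distance."""
    depth: Dict[str, int] = {}
    queue = deque()
    for root in roots:
        if root not in depth:
            depth[root] = 0
            queue.append(root)
    while queue:
        node = queue.popleft()
        for target in links.get(node, []):
            if target not in depth:          # first visit is the shortest path
                depth[target] = depth[node] + 1
                queue.append(target)
    return depth


# Hypothetical link graph of a small dataset.
LINKS = {
    "ex:scheme": ["ex:animals", "ex:plants"],
    "ex:animals": ["ex:cats"],
    "ex:plants": [],
    "ex:cats": ["ex:animals"],
    "ex:orphan": ["ex:cats"],   # links into the tree, but nothing links to it
}

depths = steps_from_roots(LINKS, ["ex:scheme"])
unreachable = set(LINKS) - set(depths)
print(max(depths.values()), sorted(unreachable))
```

In this toy graph, `ex:orphan` is never reached, so `ex:scheme` alone would not satisfy implication 2 for it; a publisher could either link to it from the tree or name it as an additional root resource.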
Looks good to me. +1 to implement it in the guide/voc
Original comment by Michael.Hausenblas
on 25 Nov 2010 at 8:15
We decided in today's teleconference to implement the proposal from Comment 8,
pending acceptance from Keith
Original comment by richard....@gmail.com
on 7 Dec 2010 at 11:50
Implemented in r157. Closing.
Original comment by richard....@gmail.com
on 7 Dec 2010 at 12:20
Original issue reported on code.google.com by
wwai...@gmail.com
on 31 Oct 2010 at 9:49