Clarify use of void:vocabulary

GoogleCodeExporter commented 9 years ago

Which URI exactly do I use for any given vocabulary?

We say, the one that's the object of isDefinedBy triples for the vocab terms, but often isDefinedBy is not used in real-world vocabularies.

Should we say "downloadable location"? Should we say "namespace URI"? What about trailing hashes: leave them or remove them?

I would prefer having some really clear guidance in the Guide.

Original issue reported on code.google.com by richard....@gmail.com on 10 Nov 2009 at 12:26

GoogleCodeExporter commented 9 years ago

This is a good question.

An advantage of specifying that the property should simply point to a "downloadable location" is that data authors could specify a location where a particular version could be downloaded (this was suggested in another issue on versioning). The dataset author could add triples like:

:Dataset void:vocabulary <vocab-location-2007-08-09#> .
<vocab-location-2007-08-09#> vann:preferredNamespaceUri <http://xmlns.com/foaf/0.1/> .

The disadvantage of this is that it complicates discovering which datasets use which vocabularies, as the actual vocabulary URI might be found in one of two places in the graph pattern. You would need to do something like a UNION, with one graph pattern using the OPTIONAL { ... } FILTER(!bound(...)) pattern to exclude the other graph pattern.
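
To make the "two places" problem concrete, here is a minimal Python sketch that works on an in-memory list of triples rather than a SPARQL endpoint. The example.org dataset and download URIs and the helper `vocabularies_used` are hypothetical, made up for illustration only; the point is just that a consumer needs a second lookup before it can compare vocabularies across datasets.

```python
VOID_VOCABULARY = "http://rdfs.org/ns/void#vocabulary"
VANN_PREFERRED_NS = "http://purl.org/vocab/vann/preferredNamespaceUri"

# Hypothetical triples: dataset-a links to the namespace URI directly,
# dataset-b links to a versioned download location instead.
TRIPLES = [
    ("http://example.org/dataset-a", VOID_VOCABULARY,
     "http://xmlns.com/foaf/0.1/"),
    ("http://example.org/dataset-b", VOID_VOCABULARY,
     "http://example.org/foaf-2007-08-09"),
    ("http://example.org/foaf-2007-08-09", VANN_PREFERRED_NS,
     "http://xmlns.com/foaf/0.1/"),
]


def vocabularies_used(triples):
    """Map each dataset URI to the set of namespace URIs it uses."""
    # First place to look: download locations that declare a namespace.
    preferred = {s: o for s, p, o in triples if p == VANN_PREFERRED_NS}
    used = {}
    for s, p, o in triples:
        if p == VOID_VOCABULARY:
            # Second place to look: the void:vocabulary object itself,
            # used when no preferred namespace is declared for it.
            used.setdefault(s, set()).add(preferred.get(o, o))
    return used
```

With these triples both datasets resolve to the FOAF namespace, but only because of the extra indirection step.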

I'm not sure such a complication is justified (yet) by real-world demand for the versioning use case?

Hmm, I would be inclined to simply say it should be the namespace URI of the
vocabulary, including the trailing hash/slash. We can dispense with mention of
rdfs:isDefinedBy. 

What do we mean by vocabulary? A collection of terms under the same namespace, where the terms appear in the dataset as either { ?s ?term ?o } or { ?s a ?term }? Or is this too narrow, precluding uses of SKOS, for example?

Original comment by K.J.W.Al...@gmail.com on 10 Nov 2009 at 1:14

GoogleCodeExporter commented 9 years ago

From Keith:

http://myadmin.kwijibo.talis.com/kwijibo-dev3/services/sparql?query=PREFIX+rdf%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23%3E%0D%0APREFIX+rdfs%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0D%0APREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E%0D%0APREFIX+owl%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2002%2F07%2Fowl%23%3E%0D%0APREFIX+dcterms%3A+%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2F%3E%0D%0Aprefix+void%3A+%3Chttp%3A%2F%2Frdfs.org%2Fns%2Fvoid%23%3E%0D%0A%0D%0ASELECT+%3Fvocab+%28count%28%3Fdataset%29+as+%3Fno%29+%7B+%0D%0A%3Fdataset+void%3Avocabulary+%3Fvocab+.%0D%0A%7D+GROUP+BY+%3Fvocab%0D%0AORDER+BY+DESC%28%3Fno%29%0D%0A%0D%0A
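
For readers who don't want to decode that by eye, here is a small Python sketch that recovers the SPARQL text from the URL above using only the standard library. It does not contact the endpoint (which may no longer exist); the variable names are my own.

```python
from urllib.parse import parse_qs, urlsplit

# The URL from the comment above, split into chunks for readability.
url = (
    "http://myadmin.kwijibo.talis.com/kwijibo-dev3/services/sparql?"
    "query=PREFIX+rdf%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23%3E%0D%0A"
    "PREFIX+rdfs%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0D%0A"
    "PREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E%0D%0A"
    "PREFIX+owl%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2002%2F07%2Fowl%23%3E%0D%0A"
    "PREFIX+dcterms%3A+%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2F%3E%0D%0A"
    "prefix+void%3A+%3Chttp%3A%2F%2Frdfs.org%2Fns%2Fvoid%23%3E%0D%0A%0D%0A"
    "SELECT+%3Fvocab+%28count%28%3Fdataset%29+as+%3Fno%29+%7B+%0D%0A"
    "%3Fdataset+void%3Avocabulary+%3Fvocab+.%0D%0A"
    "%7D+GROUP+BY+%3Fvocab%0D%0A"
    "ORDER+BY+DESC%28%3Fno%29%0D%0A%0D%0A"
)

# parse_qs turns '+' into spaces and resolves the %XX escapes.
query = parse_qs(urlsplit(url).query)["query"][0]

# Print the query without the trailing blank lines.
print(query.strip())
```

The decoded query counts, per value of void:vocabulary, how many datasets point at it, and orders the vocabularies by that count.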

Original comment by richard....@gmail.com on 12 Nov 2009 at 3:56

GoogleCodeExporter commented 9 years ago

http://tinyurl.com/yz5bbas is a query of usage of vocabularies in dataset descriptions

Original comment by K.J.W.Al...@gmail.com on 12 Nov 2009 at 3:56

GoogleCodeExporter commented 9 years ago

http://tinyurl.com/yfm83el is a similar query for rkbexplorer

The only misuse I can see is that the coref namespace doesn't have a trailing hash or slash

Original comment by K.J.W.Al...@gmail.com on 12 Nov 2009 at 4:00

GoogleCodeExporter commented 9 years ago

as of 2010-01-07 telco

Original comment by Michael.Hausenblas on 7 Jan 2010 at 12:06

Added labels: Milestone-Release2.0

GoogleCodeExporter commented 9 years ago

Original comment by Michael.Hausenblas on 18 Jan 2010 at 12:08

Added labels: Prodcut-vocab

GoogleCodeExporter commented 9 years ago

Original comment by Michael.Hausenblas on 18 Jan 2010 at 12:11

Added labels: Product-vocab

GoogleCodeExporter commented 9 years ago

Original comment by Michael.Hausenblas on 18 Jan 2010 at 12:12

Removed labels: Prodcut-vocab

GoogleCodeExporter commented 9 years ago

So, we agreed that we should say void:vocabulary should point to a namespace URI for the vocabulary.

As far as I remember, we came to the conclusion that there is a basic lack of consensus in practice about which URI to use to identify a vocabulary/ontology. Some ontologies don't use owl:Ontology, some don't use rdfs:isDefinedBy, many are published at more than one location, etc.

For void:vocabulary to be useful for dataset selection, voiD authors need to use canonical URIs for the vocabularies/ontologies they link to. I consider that the most widely understood mechanism for deriving a canonical URI for a vocab is the one used for QNames etc.: stripping the local name off a vocabulary term URI.
e.g.: for http://www.w3.org/2002/07/owl#sameAs , http://www.w3.org/2002/07/owl#
      for http://xmlns.com/foaf/0.1/name , http://xmlns.com/foaf/0.1/
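
A minimal Python sketch of that local-name stripping, assuming the local name is simply everything after the last "#" or "/" in the term URI (`namespace_of` is a hypothetical helper name, not part of any library):

```python
def namespace_of(term_uri: str) -> str:
    """Return the term URI with its local name stripped.

    Everything after the last '#' or '/' is dropped and the delimiter
    itself is kept. A string with neither delimiter yields ''.
    """
    cut = max(term_uri.rfind("#"), term_uri.rfind("/"))
    return term_uri[: cut + 1]


print(namespace_of("http://www.w3.org/2002/07/owl#sameAs"))
print(namespace_of("http://xmlns.com/foaf/0.1/name"))
```

This prints the two namespace URIs given in the examples above.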

Original comment by K.J.W.Al...@gmail.com on 15 Apr 2010 at 1:12

GoogleCodeExporter commented 9 years ago

I agree for slash URIs but not for hash URIs. The problem is that <http://www.w3.org/2002/07/owl#> is not a resource that has any description. If you try to dereference it, you actually get <http://www.w3.org/2002/07/owl> because of hash stripping, so this URI actually represents an RDF document. If you look into that file, it furthermore states that <http://www.w3.org/2002/07/owl> is an owl:Ontology and various other metadata. It says nothing about <http://www.w3.org/2002/07/owl#>. Without having made a survey, I'd expect to see the same thing for most other hash namespaces. It's certainly what we implement in Neologism.

So if we say that the void:vocabulary should point to <http://www.w3.org/2002/07/owl>, then we actually end up with nicely linked data. If we say it should point to <http://www.w3.org/2002/07/owl#>, we just point at nothing.
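
The hash stripping can be illustrated offline with Python's standard library: the fragment is never sent to the server, so the URI that is actually retrieved is the one `urldefrag` returns. This is only a sketch of the URI handling, not of an HTTP client; `document_uri` is a hypothetical helper name.

```python
from urllib.parse import urldefrag


def document_uri(uri: str) -> str:
    """Return the URI a client would actually request (fragment removed)."""
    return urldefrag(uri).url


# A hash namespace collapses to the URI of the document that defines it.
print(document_uri("http://www.w3.org/2002/07/owl#"))

# A slash namespace has no fragment, so it is returned unchanged.
print(document_uri("http://xmlns.com/foaf/0.1/"))
```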

It is true that just using the namespace URI (including hash or slash) would be slightly simpler in terms of specification and implementation, but removing the hash leads to an actual dereferenceable document that typically includes a helpful description of the document itself, and this tighter interlinking is worth the little bit of additional complexity IMO.

Therefore my proposed text:

Every value of void:vocabulary SHOULD be the namespace URI of a vocabulary or ontology that is used in the dataset. A vocabulary's namespace URI is the URI of any class or property in the vocabulary, with the local name stripped, that is, everything after the last "/" or "#" is removed. If the namespace URI ends in a "#", then this trailing hash is also removed; if it ends in a slash, the slash is kept.
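
As a rough illustration of the proposed text (not a normative algorithm), the rule could be sketched in Python as follows. `vocabulary_uri` is a hypothetical helper name, and the sketch assumes the input is the URI of a class or property from the vocabulary:

```python
def vocabulary_uri(term_uri: str) -> str:
    """Derive a void:vocabulary value from a class or property URI."""
    # Strip the local name: drop everything after the last '/' or '#'.
    cut = max(term_uri.rfind("#"), term_uri.rfind("/"))
    namespace = term_uri[: cut + 1]
    # A trailing hash is removed as well; a trailing slash is kept.
    if namespace.endswith("#"):
        return namespace[:-1]
    return namespace


print(vocabulary_uri("http://www.w3.org/2002/07/owl#sameAs"))
print(vocabulary_uri("http://xmlns.com/foaf/0.1/name"))
```

For the OWL term this yields the hash-less document URI, and for the FOAF term it yields the slash namespace unchanged.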

Original comment by richard....@gmail.com on 5 May 2010 at 12:48

GoogleCodeExporter commented 9 years ago

I said: “Without having made a survey, I'd expect to see [the hash-less URI for owl:Ontology and rdfs:isDefinedBy] for most other hash namespaces.”

This may actually be wrong, so I retract that statement and will try to actually do a little survey.

Regardless of the survey's outcome, I believe that my proposal is the right choice, because it treats the vocabulary and the document it is defined in as the same thing and avoids the introduction of an unnecessary extra resource.

Original comment by richard....@gmail.com on 6 May 2010 at 7:47

GoogleCodeExporter commented 9 years ago

Survey results: http://groups.google.com/group/pedantic-web/msg/505c158813c9bff2

So my guess was actually right: only 20% of owl:Ontology statements and 20% of rdfs:isDefinedBy targets go to URIs ending in a hash.

Original comment by richard....@gmail.com on 6 May 2010 at 9:54

GoogleCodeExporter commented 9 years ago

Good work, Richard!
I am enticed by your linked data argument, but neither of our proposals consistently results in linked data. For instance, with SIOC, if you dereference the namespace URI without the hash, you don't get any triples about that hashless URI, only about http://rdfs.org/sioc/ns#

So I guess we have to decide what is more important: linking to a URI you can find in the dereferenced graph (which may not always be possible, but this needn't invalidate the approach), or linking to the "most canonical" URI.

Original comment by K.J.W.Al...@gmail.com on 7 May 2010 at 2:14

GoogleCodeExporter commented 9 years ago

As per my post to pedantic-web: SIOC predates httpRange-14; they didn't know what they were doing. Still, http://rdfs.org/sioc/ns is at least the identifier of a document that has a representation. http://rdfs.org/ns/void# is not the identifier of anything at all. Hence, stripping the hash is more linky and works for more vocabularies (47.5% vs 22.5%).

Original comment by richard....@gmail.com on 7 May 2010 at 5:02

GoogleCodeExporter commented 9 years ago

Per today's call, we resolved to adopt the text from comment 11, and update 1.7 accordingly.

Original comment by richard....@gmail.com on 15 Sep 2010 at 11:47

GoogleCodeExporter commented 9 years ago

Reviewed section 1.7 as per my action from last time and this is fine by me.

Original comment by Michael.Hausenblas on 15 Oct 2010 at 8:01

GoogleCodeExporter commented 9 years ago

Looks good, except I am concerned that referring to the URI as "namespace URI" might be confusing/inaccurate, since in common parlance, "namespace URI" is steps 1 and 2, but not 3. Richard's argument for step 3 was that "namespace URI" is just a string you concatenate a local name to, not a URI you dereference, so what step 3 really defines is a "canonical vocabulary document URI".

Original comment by K.J.W.Al...@gmail.com on 29 Oct 2010 at 8:57

GoogleCodeExporter commented 9 years ago

@Keith, in Revision 140 I changed the wording to this:

“Every value of void:vocabulary SHOULD be a URI that identifies a vocabulary or ontology that is used in the dataset. These URIs can be found as follows: …”

Does this address your comment?

Original comment by richard....@gmail.com on 29 Oct 2010 at 11:47

GoogleCodeExporter commented 9 years ago

yep

Original comment by K.J.W.Al...@gmail.com on 29 Oct 2010 at 11:52

GoogleCodeExporter commented 9 years ago

Great :-) Closing.

Original comment by richard....@gmail.com on 29 Oct 2010 at 3:59

Changed state: Fixed

cygri / void

Clarify use of void:vocabulary #45