Closed GoogleCodeExporter closed 9 years ago
The suggested convention for the data.gov.uk schemes is that for each
skos:ConceptScheme there is a corresponding owl:Class (a subClassOf
skos:Concept), and each concept is then both a skos:Concept in the
ConceptScheme and an instance of the class.
This seems to give us the best of both worlds and has worked OK on examples
so far.
It doesn't require any changes to the SDMX-RDF vocabulary, but would be worth
mentioning in the usage guidelines.
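As a rough illustration of the convention described above, here is a minimal Python sketch that writes out the triples for one code list. Triples are modelled as plain tuples of prefixed names rather than with any RDF library, and the eg: names (eg:sex, eg:Sex, eg:sex-F, eg:sex-M) are hypothetical; the use of skos:inScheme to tie each concept to its scheme is an assumption about how "in the ConceptScheme" would be expressed.

```python
# Triples are plain (subject, predicate, object) tuples of prefixed names.
# All eg: names are hypothetical and only serve the illustration.

def code_list_triples(scheme, code_class, codes):
    """Return triples for a concept scheme, its paired class, and its codes."""
    triples = [
        (scheme, "rdf:type", "skos:ConceptScheme"),
        (code_class, "rdf:type", "owl:Class"),
        (code_class, "rdfs:subClassOf", "skos:Concept"),
    ]
    for code in codes:
        triples.append((code, "rdf:type", "skos:Concept"))
        triples.append((code, "rdf:type", code_class))   # instance of the paired class
        triples.append((code, "skos:inScheme", scheme))  # assumed link to the scheme
    return triples

triples = code_list_triples("eg:sex", "eg:Sex", ["eg:sex-F", "eg:sex-M"])
for triple in triples:
    print(" ".join(triple), ".")
```

Each code ends up typed both as a skos:Concept and as a member of the paired class, which is the dual role the convention asks for.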
Original comment by Dave.e.R...@gmail.com
on 2 Apr 2010 at 7:05
Dave: Would the skos:ConceptScheme and owl:Class be the same resource? And
would you declare the class as the rdfs:range for the dimension property in
the DSD?
Original comment by richard....@gmail.com
on 15 Apr 2010 at 10:45
In the current data.gov.uk pattern the ConceptScheme and the Class are
different resources. The default naming convention is to give the class a
leading upper-case letter and the scheme a leading lower-case letter.
Ian's example follows this pattern, as does the COG Code List extraction. If
you look at sdmx-code.ttl [1], then for each code list there is a
ConceptScheme, a Class and a set of Concepts.
And yes, the class is then used as the rdfs:range of the dimension property.
See the bottom of sdmx-dimension.ttl [2] for the range declarations of the
relevant COG-derived properties.
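To make the naming convention and the range declaration concrete, here is a small Python sketch. The helper name scheme_and_class_names, the eg prefix and the eg:sexDimension property are all hypothetical; the sketch only mirrors the upper-case/lower-case convention described above and does not reproduce the contents of sdmx-code.ttl or sdmx-dimension.ttl.

```python
def scheme_and_class_names(prefix, local_name):
    """Apply the default convention: scheme starts lower case, class starts upper case."""
    scheme = f"{prefix}:{local_name[0].lower()}{local_name[1:]}"
    code_class = f"{prefix}:{local_name[0].upper()}{local_name[1:]}"
    return scheme, code_class

scheme, code_class = scheme_and_class_names("eg", "sex")

# The class, not the scheme, is what the dimension property's range points at.
range_triple = ("eg:sexDimension", "rdfs:range", code_class)

print(scheme, code_class)
print(" ".join(range_triple), ".")
```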
[1] http://publishing-statistical-data.googlecode.com/svn/trunk/specs/src/main/vocab/sdmx-code.ttl
[2] http://publishing-statistical-data.googlecode.com/svn/trunk/specs/src/main/vocab/sdmx-dimension.ttl
Original comment by Dave.e.R...@gmail.com
on 15 Apr 2010 at 4:49
It turns out this issue touches on a number of requirements that we should
make explicit. I've started a Wiki page to capture those requirements:
http://code.google.com/p/publishing-statistical-data/wiki/IssueCodeLists
Using the Wiki seemed appropriate so that, after some mutual edits, we can
end up at an agreed statement of the problem.
In the meantime I'll post a summary and evaluation of my preferred approach
and variants as further comments in this issue thread.
Original comment by Dave.e.R...@gmail.com
on 22 Apr 2010 at 3:34
*Class assisted design*
In this design ...
The set of legal values for a CodedProperty (such as a dimension) is declared
as the rdfs:range of the property, and is thus represented as an owl:Class
(or rdfs:Class, but that distinction is not important here).
As a common and useful convention, code lists such as the COG code lists
should be represented as skos:ConceptSchemes, but in addition each scheme
should have an associated owl:Class that is a subclass of skos:Concept. Each
code is a member of this class (and thus a skos:Concept) as well as being
linked into a skos:broader/narrower hierarchy rooted at the
skos:ConceptScheme.
However, it is also possible to reuse existing classes to denote the range of
values. For example, a dataset giving statistics on UK schools would declare
school-ont:School as the range of the school dimension. There is no need to
create parallel concepts (whether SKOS or SDMX) to represent the set of
schools.
The current COG code translation already supports this approach.
There are a couple of variants on this approach, according to what we do with
sdmx:codeList:
(i) Drop sdmx:codeList as redundant and use only the rdfs:range declaration
to link from the Dimension declaration to the class representing the codes.
(ii) Retain sdmx:codeList to point to the skos:ConceptScheme for those cases
where a SKOS scheme has been declared, i.e. it becomes optional.
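The difference between the two variants can be sketched in a few lines of Python. This is only an illustration: the function dimension_triples and the eg: names are hypothetical, and triples are plain tuples rather than objects from an RDF library.

```python
def dimension_triples(dimension, code_class, scheme=None):
    """Declare a dimension's range, plus sdmx:codeList when a SKOS scheme exists."""
    triples = [(dimension, "rdfs:range", code_class)]
    if scheme is not None:
        # Only variant (ii) emits this optional link to the concept scheme.
        triples.append((dimension, "sdmx:codeList", scheme))
    return triples

# Variant (i), or a reused external class with no SKOS scheme: range only.
school = dimension_triples("eg:school", "school-ont:School")

# Variant (ii) for a code list that does have a scheme.
sex = dimension_triples("eg:sexDimension", "eg:Sex", scheme="eg:sex")

print(school)
print(sex)
```

Under variant (i) the scheme argument would simply never be used; under variant (ii) it is supplied whenever a scheme has been declared.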
*Evaluation*
Validation. OK. Since this follows standard RDFS practice, existing (closed
world) validation tools such as Eyeball can be used. Custom checkers based on
SPARQL or rule engines simply need to look up the type: if
[eg:dim rdfs:range T] and you want to validate [eg:observation eg:dim X], you
just check for [X rdf:type T] associated with X. This does not preclude some
tools using inference machinery to infer the class membership, but does not
require it.
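The type lookup described above can be sketched as a small custom checker. This is a minimal Python illustration over an in-memory list of triple tuples, not the behaviour of Eyeball or of any SPARQL engine; the validate function and all eg: names are hypothetical.

```python
def validate(triples, observation, dimension):
    """Closed-world check: each value of `dimension` on `observation` must be
    explicitly typed with the class declared as the dimension's rdfs:range."""
    triples = set(triples)
    ranges = [o for s, p, o in triples if s == dimension and p == "rdfs:range"]
    values = [o for s, p, o in triples if s == observation and p == dimension]
    problems = []
    for value in values:
        for expected_type in ranges:
            if (value, "rdf:type", expected_type) not in triples:
                problems.append((value, expected_type))
    return problems

data = [
    ("eg:dim", "rdfs:range", "eg:Sex"),
    ("eg:sex-F", "rdf:type", "eg:Sex"),
    ("eg:obs1", "eg:dim", "eg:sex-F"),    # value carries the required type
    ("eg:obs2", "eg:dim", "eg:unknown"),  # value has no rdf:type triple
]

print(validate(data, "eg:obs1", "eg:dim"))
print(validate(data, "eg:obs2", "eg:dim"))
```

No inference is involved: a value passes only if its rdf:type triple is stated in the data being checked.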
COG. Easy, we've already translated the COG code lists to a compatible format.
Reuse. Easy, so long as the external URI set provides an owl:Class for the
set. This is true of the concrete use cases we have in data.gov.uk and a
reasonable requirement in a linked data world.
Enumeration. Mixed. If you have a dataset describing all of the members of
the class, then retrieving those members is trivial. With the pattern of using
both a skos:ConceptScheme and an associated class, the member information is
provided as part of the concept scheme definition. For externally maintained
schemes such as the Schools URI Set, you may need to find the SPARQL endpoint
or API for the URI Set; the linking patterns to get you from the ontology
declaration to the associated URI Set are still evolving.
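For the locally described case, enumeration might look like the following Python sketch. It assumes the member triples are already available in memory as plain tuples and that concepts are linked to their scheme with skos:inScheme; the function names and eg: names are hypothetical. The externally maintained case (querying a SPARQL endpoint or API) is not covered here.

```python
def members_of_class(triples, code_class):
    """List resources explicitly typed with the given class."""
    return sorted(s for s, p, o in triples if p == "rdf:type" and o == code_class)

def concepts_in_scheme(triples, scheme):
    """List concepts linked to the given scheme via skos:inScheme."""
    return sorted(s for s, p, o in triples if p == "skos:inScheme" and o == scheme)

data = [
    ("eg:sex-F", "rdf:type", "eg:Sex"),
    ("eg:sex-F", "skos:inScheme", "eg:sex"),
    ("eg:sex-M", "rdf:type", "eg:Sex"),
    ("eg:sex-M", "skos:inScheme", "eg:sex"),
]

by_class = members_of_class(data, "eg:Sex")
by_scheme = concepts_in_scheme(data, "eg:sex")
print(by_class)
print(by_scheme)
```

When the paired class and scheme pattern is followed, both routes yield the same set of codes.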
SDMX Compatibility. Reasonable. Translating from SDMX code lists using the
SKOS pattern is straightforward. Variant (ii) could make this more explicit at
the cost of some redundancy.
Simplicity. Seems OK to me. In the case where you are defining a SKOS concept
scheme there is a little extra work (one class plus one triple for each
concept), but it is a straightforward pattern to apply.
[1] http://publishing-statistical-data.googlecode.com/svn/trunk/specs/src/main/vocab/sdmx-code.ttl
Original comment by Dave.e.R...@gmail.com
on 22 Apr 2010 at 4:59
We discussed this in the call on 6th May and reached consensus. Dave and
Richard took actions to update the documentation and specs accordingly.
Original comment by i.j.dick...@gmail.com
on 7 May 2010 at 9:12
Closing now that the documentation has been updated.
Original comment by Dave.e.R...@gmail.com
on 16 Aug 2010 at 11:26
Original issue reported on code.google.com by
richard....@gmail.com
on 31 Mar 2010 at 3:34