Letractively / publishing-statistical-data

Automatically exported from code.google.com/p/publishing-statistical-data

Vocabulary terms for annotations #3

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Many real-world applications rely on annotations to record important
information about observations, groups, time series and datasets. We may need
to add vocabulary terms for annotating data, and we certainly need to document
the design patterns.

Original issue reported on code.google.com by i.j.dick...@gmail.com on 23 Mar 2010 at 11:08

GoogleCodeExporter commented 9 years ago
This is needed for the Communities and Local Government data.

Possible vocabulary to reuse includes:
  - Nepomuk Annotation vocabulary (not suitable, doesn't provide for annotation metadata)
  - skos:note  (open ended enough to do anything, no domain restrictions so is reusable)

Suggest we have a simple sdmx:annotation property whose range is
sdmx:Annotation, which would typically be a blank node carrying DCTerms
properties and the like. That way we can (somewhat) formally document the
expected properties of sdmx:Annotation if we want to.
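
As a minimal sketch of that pattern (the sdmx: namespace URI, the ex: resources
and the particular DCTerms properties chosen are all assumptions for
illustration, not agreed vocabulary):

   @prefix sdmx: <http://example.org/sdmx#> .    # placeholder URI, assumed
   @prefix dct:  <http://purl.org/dc/terms/> .
   @prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
   @prefix ex:   <http://example.org/data/> .    # hypothetical dataset namespace

   # An observation pointing at a blank-node annotation with its own metadata
   ex:obs1 sdmx:annotation [
       a sdmx:Annotation ;
       dct:description "Collection method changed in 2006."@en ;
       dct:creator ex:someStatistician ;         # hypothetical agent
       dct:date "2010-03-25"^^xsd:date
   ] .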

Original comment by Dave.e.R...@gmail.com on 25 Mar 2010 at 12:12

GoogleCodeExporter commented 9 years ago
Given the use case of making it easier for statisticians to flag important
annotations that change the interpretation of the statistics (e.g. "we changed
the collection method in 2006, so don't compare these figures to 2005"), I
wondered if we should have an annotation class with stronger perlocutionary
force than just "here's an annotation": sdmx:caveat, for example.
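
One possible reading, sketched with a hypothetical sdmx:caveat property that
specialises the sdmx:annotation property suggested in the previous comment (the
names and the sdmx: namespace URI are assumptions, and a class such as
sdmx:Caveat would be an equally plausible modelling):

   @prefix sdmx: <http://example.org/sdmx#> .    # placeholder URI, assumed
   @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
   @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
   @prefix dct:  <http://purl.org/dc/terms/> .
   @prefix ex:   <http://example.org/data/> .    # hypothetical dataset namespace

   # A caveat is an annotation that readers are expected not to ignore
   sdmx:caveat
       a rdf:Property ;
       rdfs:subPropertyOf sdmx:annotation ;
       rdfs:range sdmx:Annotation .

   ex:series2006 sdmx:caveat [
       a sdmx:Annotation ;
       dct:description "Collection method changed in 2006; do not compare with 2005."@en
   ] .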

Original comment by i.j.dick...@gmail.com on 25 Mar 2010 at 12:26

GoogleCodeExporter commented 9 years ago
We agreed to add some basic vocabulary for annotations, derived from skos:note,
but not to try to anticipate any more structured applications for annotations
in advance of detailed use cases.

Original comment by i.j.dick...@gmail.com on 25 Mar 2010 at 2:11

GoogleCodeExporter commented 9 years ago

Original comment by i.j.dick...@gmail.com on 25 Mar 2010 at 2:11

GoogleCodeExporter commented 9 years ago
In addition to the agreed skos:note/sdmx:Annotation, it would be useful to have
properties for the last update time and for who made that update. This is
needed for the LDEx example I'm looking at, but it seems like a common concept
for statistical datasets.

Suggest adding:
   sdmx:lastUpdatedBy  rdfs:subPropertyOf  dct:contributor .
   sdmx:lastUpdated    rdfs:subPropertyOf  dct:date .

Original comment by Dave.e.R...@gmail.com on 4 Apr 2010 at 3:03

GoogleCodeExporter commented 9 years ago
Having looked more closely at the Content Oriented Guidelines (COG), I found
I'd overlooked DATA_UPDATE. So in fact the lastUpdated case can be handled by
the COG-derived built-in property:

   sdmx-attribute:dataUpdate
       a sdmx:AttributeProperty;
       sdmx:concept  sdmx-concept:dataUpdate ;
       rdfs:range xs:dateTime .
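
Attached to a dataset, that might look roughly like this (the namespace URI,
the ex:dataset1 resource and the timestamp are made up for illustration; xsd:
is bound here to the XML Schema namespace, which the xs: prefix above
presumably also denotes):

   @prefix sdmx-attribute: <http://example.org/sdmx-attribute#> .  # placeholder URI, assumed
   @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
   @prefix ex:  <http://example.org/data/> .                       # hypothetical dataset namespace

   # Record when the data was last updated
   ex:dataset1 sdmx-attribute:dataUpdate "2010-04-04T15:27:00Z"^^xsd:dateTime .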

However, I can't find an equivalent to lastUpdatedBy in the COG, so I still
suggest we add:

   sdmx-attribute:dataLastUpdatedBy
       a sdmx:AttributeProperty ;
       rdfs:subPropertyOf  dct:contributor .

Original comment by Dave.e.R...@gmail.com on 4 Apr 2010 at 3:27

GoogleCodeExporter commented 9 years ago
I added the dataLastUpdatedBy property to sdmx.ttl for now, though I'm not sure
this should be its final home. Putting it in sdmx-attribute.ttl might be more
appropriate, although that file is auto-generated from the XML, so there's an
argument for not having manual inclusions in that file. Thoughts?

Original comment by i.j.dick...@gmail.com on 7 Apr 2010 at 10:58

GoogleCodeExporter commented 9 years ago
Neither sdmx.ttl nor sdmx-attribute.ttl is the right place to dump attributes
or concepts that you require for a particular dataset.

sdmx.ttl should be for stuff that's defined in SDMX. sdmx-attribute.ttl should
be for stuff that's defined in the COGs. If a particular DSD requires something
else, then this should be defined in the DSD's namespace or any other
namespace.

SDMX provides many extension points, and RDF provides URI-based extensibility,
so let's use them.
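
For instance, the property could be declared in a dataset-specific namespace
along these lines (the clg: namespace and the sdmx: namespace URI are
hypothetical placeholders; only the property name and its parent property come
from the comments above):

   @prefix sdmx: <http://example.org/sdmx#> .     # placeholder URI, assumed
   @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
   @prefix dct:  <http://purl.org/dc/terms/> .
   @prefix clg:  <http://example.org/clg/def#> .  # hypothetical DSD namespace

   # Same attribute, defined next to the other dataset-specific terms
   clg:dataLastUpdatedBy
       a sdmx:AttributeProperty ;
       rdfs:subPropertyOf dct:contributor ;
       rdfs:label "data last updated by"@en .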

Original comment by richard....@gmail.com on 8 Apr 2010 at 11:37

GoogleCodeExporter commented 9 years ago
Of course domain-specific attributes should go in dataset-specific
vocabularies; that goes without saying. This issue thread is about generic
cross-domain concepts for annotations. Dave observed (comment 6) that there is
an obvious omission from the SDMX-provided cross-domain concepts. I accept that
there's an argument for dataLastUpdatedBy not being in sdmx.ttl or
sdmx-attribute.ttl, but what's the alternative for a single, omitted,
cross-domain attribute?

Ian

Original comment by i.j.dick...@gmail.com on 8 Apr 2010 at 11:42

GoogleCodeExporter commented 9 years ago
The question of what's an obvious omission from SDMX's cross-domain concept
list and what's a domain-specific concept is rather subjective. I propose we
find a solution that allows us to avoid that kind of question.

Your concrete problem can be resolved by putting this concept alongside the
domain-specific ones that you have to create anyway.

Defining a general set of reusable concepts that revise or complement the SDMX
COGs is not required for the SDMX-RDF effort. This job is better left to the
statistics offices that will hopefully start to use SDMX-RDF.

Original comment by richard....@gmail.com on 8 Apr 2010 at 5:30

GoogleCodeExporter commented 9 years ago
I think we can close this issue now: we have a minimum of core vocabulary in
sdmx.ttl and in the COG translation. Additional annotations can be defined in
domain-specific vocabularies.

Original comment by i.j.dick...@gmail.com on 15 Apr 2010 at 8:39