zazuko / query-rdf-data-cube

Explore or query RDF Data Cubes with a JavaScript API, without writing SPARQL.
https://zazuko.github.io/query-rdf-data-cube/

DataSet/Component metadata #16

Open jstcki opened 5 years ago

jstcki commented 5 years ago

With the datasets we're using from https://trifid-lindas.test.cluster.ldbar.ch/sparql/ we (eventually) expect that datasets and components will have more metadata attached.

For example:

Is the idea that this library will reflect an opinionated structure (i.e. a narrower definition of what a data cube should contain) and provide this metadata by default? Or is the goal to stay generic, so that users have to query this metadata themselves in addition to what's returned when, e.g., they call dataset.dimensions()?

Note that I don't have a good understanding of what the actual metadata for our use case will look like, because I haven't seen a dataset with comprehensive metadata yet.

ktk commented 5 years ago

While it is generic now, I think we will (have to) make it more opinionated in the future. In my opinion, the existing RDF Data Cube spec cannot currently express everything we need, so I propose we collect a list of all we need and come up with a vocabulary that allows us to express it.

You already pointed me to some things in the PC AXIS spec. I will discuss with my colleagues how we approach that and will follow up here soon.

jstcki commented 5 years ago

Top priority for me is dimension types because we need to be able to distinguish which chart types are compatible with the datasets.

The absolute basics being:

It sounds like these can be expressed by qb:concept, as described in https://www.w3.org/TR/vocab-data-cube/#dsd-cog

For domain/range, it sounds like rdfs:range could be appropriate.

One example from the Data Cube spec:

eg:refPeriod  a rdf:Property, qb:DimensionProperty;
    rdfs:label "reference period"@en;
    rdfs:subPropertyOf sdmx-dimension:refPeriod;
    rdfs:range interval:Interval;
    qb:concept sdmx-concept:refPeriod . 

We can of course work around this by fetching observations and inferring the dimension types from their values, but that seems like a terrible approach.

All "descriptive" metadata, like descriptions and contact info, is definitely less of a priority (but also doesn't seem too complicated if we decide that we only care about the DataSet level).

jstcki commented 5 years ago

So if – for starters – there was a way of including the (optional) qb:concepts of dimensions/attributes/measures, that should still be very generic and cover our needs pretty well.

I think there's no special need for this library to "know" what those concepts are (i.e. no dedicated SDMXTimePeriod classes needed) but just attaching them as IRIs or as a generic Concept class would be fine.

And something similar would probably work for rdfs:range.
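To make the idea concrete, here is a minimal sketch of components that carry their optional qb:concept and rdfs:range as plain IRIs, and of how a consumer could use them to pick chart types. All type and function names (`ComponentMetadata`, `isTemporal`, `suggestChartTypes`) are hypothetical and are not part of this library's API; the mapping from dimension type to chart type is also just an assumption for illustration.

```typescript
// Hypothetical shapes for illustration only; not the API of query-rdf-data-cube.
type ComponentKind = "dimension" | "measure" | "attribute";

interface ComponentMetadata {
  iri: string;
  kind: ComponentKind;
  label?: string;
  concept?: string; // value of qb:concept, kept as a plain IRI
  range?: string; // value of rdfs:range, kept as a plain IRI
}

// IRIs taken from the prefixes used in the Data Cube spec example.
const SDMX_REF_PERIOD = "http://purl.org/linked-data/sdmx/2009/concept#refPeriod";
const INTERVAL = "http://reference.data.gov.uk/def/intervals/Interval";

// The consumer, not the library, decides what a given concept or range means.
function isTemporal(component: ComponentMetadata): boolean {
  return component.concept === SDMX_REF_PERIOD || component.range === INTERVAL;
}

// Assumed rule of thumb: a temporal dimension plus a measure allows a line chart.
function suggestChartTypes(components: ComponentMetadata[]): string[] {
  const hasMeasure = components.some((c) => c.kind === "measure");
  if (!hasMeasure) {
    return [];
  }
  const dimensions = components.filter((c) => c.kind === "dimension");
  return dimensions.some(isTemporal) ? ["line", "bar"] : ["bar"];
}

// Example data mirroring eg:refPeriod from the spec; the IRIs under
// http://example.org/ns# are placeholders.
const refPeriod: ComponentMetadata = {
  iri: "http://example.org/ns#refPeriod",
  kind: "dimension",
  label: "reference period",
  concept: SDMX_REF_PERIOD,
  range: INTERVAL,
};

const lifeExpectancy: ComponentMetadata = {
  iri: "http://example.org/ns#lifeExpectancy",
  kind: "measure",
  label: "life expectancy",
};

const suggested = suggestChartTypes([refPeriod, lifeExpectancy]);
console.log(suggested);
```

Keeping `concept` and `range` as opaque IRIs means the library stays generic, while applications can still branch on the values they recognize.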

ktk commented 4 years ago

Depends on:

vhf commented 4 years ago

@lucguillemot I'm done with #16, could you please try 0.2.1 and give feedback?

lucguillemot commented 4 years ago

It looks good! You added custom metadata as a datacube argument so that the library can remain generic, right? I added the description and the source to our application. So far so good. Thank you for adding this feature!

vhf commented 4 years ago

> You added custom metadata as a datacube argument so that the library can remain generic, right?

Exactly. Glad it works for you. The next step on this front is doing something similar for dimensions/measures metadata.
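For readers wondering how caller-supplied metadata can keep a library generic, the following is a rough sketch of the general technique: the caller lists the predicates it cares about, and each becomes an OPTIONAL pattern in the dataset query. This is not the library's actual implementation or option format; `ExtraMetadata`, `buildDatasetMetadataQuery`, and the dataset IRI are hypothetical names chosen for this example.

```typescript
// Hypothetical sketch; the real library's options and query builder may differ.
interface ExtraMetadata {
  variable: string; // SPARQL variable name the value is bound to
  iri: string; // predicate IRI to look up on the dataset
}

const QB_DATASET = "http://purl.org/linked-data/cube#DataSet";

function buildDatasetMetadataQuery(datasetIri: string, extra: ExtraMetadata[]): string {
  // Reject anything that is not a simple variable name to avoid malformed queries.
  for (const m of extra) {
    if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(m.variable)) {
      throw new Error(`Invalid variable name: ${m.variable}`);
    }
  }
  const projection =
    extra.length > 0 ? extra.map((m) => `?${m.variable}`).join(" ") : "*";
  // Each requested predicate is optional, so datasets lacking it still match.
  const optionals = extra.map(
    (m) => `  OPTIONAL { <${datasetIri}> <${m.iri}> ?${m.variable} . }`
  );
  return [
    `SELECT ${projection}`,
    "WHERE {",
    `  <${datasetIri}> a <${QB_DATASET}> .`,
    ...optionals,
    "}",
  ].join("\n");
}

// Example: ask for a description and a source, as mentioned in this thread.
// The dataset IRI is a placeholder.
const query = buildDatasetMetadataQuery("http://example.org/dataset/1", [
  { variable: "description", iri: "http://purl.org/dc/terms/description" },
  { variable: "source", iri: "http://purl.org/dc/terms/source" },
]);
console.log(query);
```

Because the predicates come from the caller, the library itself does not need to know which vocabulary a given application uses for descriptive metadata.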

lucguillemot commented 4 years ago

Sounds good!