zazuko / cube-link

Cube Schema
https://zazuko.github.io/cube-link/

Dimension attribute: hierarchy/aggregation level #35

Open jstcki opened 3 years ago

jstcki commented 3 years ago

I'm not sure how this should be modeled so that it's predictable. Attaching a skos:broader to each dimension value will probably work, but how can a tool know that this hierarchy exists in the data in the first place?

Assuming we have flat data with mixed hierarchy levels like

| geo     | value | partOf  | aggregation |
|---------|-------|---------|-------------|
| Schweiz | 5.0   |         | mean        |
| Zürich  | 6.0   | Schweiz |             |
| Bern    | 4.0   | Schweiz |             |

A tool might display it as

… or just show values on level "Schweiz" etc.
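To make the two display options concrete, here is a minimal sketch of how a tool could nest the flat rows by their partOf value, or show only the values below "Schweiz". It assumes the rows are already loaded as plain tuples; the function names (`build_children`, `render`, `values_below`) are hypothetical and not part of any cube-link tooling.

```python
from collections import defaultdict

# Flat rows from the example above: (geo, value, part_of).
# part_of is None for a top-level entry.
ROWS = [
    ("Schweiz", 5.0, None),
    ("Zürich", 6.0, "Schweiz"),
    ("Bern", 4.0, "Schweiz"),
]


def build_children(rows):
    """Map each parent (None for roots) to its list of (geo, value) children."""
    children = defaultdict(list)
    for geo, value, part_of in rows:
        children[part_of].append((geo, value))
    return children


def render(rows):
    """Return the rows as indented lines, one indentation step per level."""
    children = build_children(rows)
    lines = []

    def walk(parent, depth):
        for geo, value in children.get(parent, []):
            lines.append("  " * depth + f"{geo}: {value}")
            walk(geo, depth + 1)

    walk(None, 0)
    return lines


def values_below(rows, parent):
    """Return only the (geo, value) pairs directly below the given parent."""
    return [(geo, value) for geo, value, part_of in rows if part_of == parent]


if __name__ == "__main__":
    print("\n".join(render(ROWS)))
```

This only works if the tool already knows which column carries the parent link, which is exactly the open question of this issue.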

Cf.

ktk commented 3 years ago
jstcki commented 3 years ago

@ktk one thing that came up in discussion with Fabian: when a dimension is part of a hierarchy, how should its different levels be linked to the observation?

Option A

Only link the lowest level instance to the observation (e.g. a municipality), and link from there to higher levels -> canton -> country etc.

This is probably more "pure" and manageable in terms of the data model, and fewer triples are needed. But resolving to higher levels and flattening them to a table is up to the client, so the hierarchy "shape" needs to be known to avoid having to walk all observations. Querying and building filters is more complex.
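As a rough illustration of the client-side work Option A implies, the sketch below walks upward from the municipality linked to an observation and flattens the result into one table row. The data, the `BROADER` mapping (standing in for links such as skos:broader) and the `LEVELS` list are all illustrative assumptions, not taken from an actual cube.

```python
# Hypothetical Option A data: each observation links only to a municipality;
# higher levels are reachable through "broader" links.
BROADER = {
    "Winterthur": "Zürich",  # municipality -> canton
    "Thun": "Bern",
    "Zürich": "Schweiz",     # canton -> country
    "Bern": "Schweiz",
}

OBSERVATIONS = [
    {"municipality": "Winterthur", "value": 6.0},
    {"municipality": "Thun", "value": 4.0},
]

# The hierarchy "shape": level names ordered from lowest to highest.
LEVELS = ["municipality", "canton", "country"]


def ancestors(node, broader):
    """Return the node followed by all its ancestors, lowest level first."""
    chain = [node]
    seen = {node}
    while chain[-1] in broader:
        parent = broader[chain[-1]]
        if parent in seen:
            raise ValueError(f"cycle detected at {parent!r}")
        chain.append(parent)
        seen.add(parent)
    return chain


def flatten(observation, broader, levels):
    """Turn one observation into a row with one column per hierarchy level."""
    chain = ancestors(observation[levels[0]], broader)
    if len(chain) != len(levels):
        raise ValueError("hierarchy depth does not match the expected shape")
    row = dict(zip(levels, chain))
    row["value"] = observation["value"]
    return row


if __name__ == "__main__":
    for obs in OBSERVATIONS:
        print(flatten(obs, BROADER, LEVELS))
```

Note that `flatten` needs `LEVELS` up front: without knowing the shape, a client cannot name the columns before it has walked the data.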

Option B

Keep all relevant levels linked to an observation (i.e. more tabular structure). E.g. having a municipality, canton, country dimension on each observation. Instances of municipalities and cantons etc. could still be linked but consumers don't have to know or respect this. Easier to query/filter for clients because data is already tabular. Would potentially need a lot more triples.

FabianCretton commented 3 years ago

About Option B, where data is "duplicated" to facilitate querying: note that this could be handled through "materialization", where the data model is A and B is "inferred" by a reasoner.

This matters because maintaining Option B would be very cumbersome: all the triples would have to be corrected whenever the hierarchy evolves.

ktk commented 3 years ago

That will definitely be option A. The library can figure out how it's nested so it could potentially support the developer/user.

For plain SPARQL, you will have to understand how it is done and either use property paths or classic joins.
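A sketch of what the two query styles could look like, kept as Python string constants so they can be passed to whatever SPARQL client is in use. The `ex:municipality` property and the use of skos:broader between levels are assumptions for illustration; the actual predicates depend on the cube.

```python
# Shared prefixes; the ex: namespace is a placeholder.
PREFIXES = """\
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX ex: <http://example.org/>
"""

# Property path: follow skos:broader one or more times, so the query does not
# need to know how many levels the hierarchy has.
PROPERTY_PATH_QUERY = PREFIXES + """\
SELECT ?obs ?ancestor WHERE {
  ?obs ex:municipality ?municipality .
  ?municipality skos:broader+ ?ancestor .
}
"""

# Classic joins: one triple pattern per level, which yields one column per
# level but requires knowing the depth of the hierarchy in advance.
JOIN_QUERY = PREFIXES + """\
SELECT ?obs ?municipality ?canton ?country WHERE {
  ?obs ex:municipality ?municipality .
  ?municipality skos:broader ?canton .
  ?canton skos:broader ?country .
}
"""

if __name__ == "__main__":
    print(PROPERTY_PATH_QUERY)
    print(JOIN_QUERY)
```

The property path version returns ancestors as rows, while the join version returns them as named columns, which is closer to the tabular result a visualization client usually wants.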

FabianCretton commented 3 years ago

@herrstucki please note that in the current RDF version of the RedLists, I implemented both, for testing purposes and because I was not sure yet. In the next version, as this was validated by Adrian yesterday, I will remove the direct triples (i.e. remove option B and keep only option A).

jstcki commented 3 years ago

Closing because described in the README: https://github.com/zazuko/rdf-cube-schema-viz#nested-hierarchies

l00mi commented 2 years ago

We will discuss further details on this topic here. @rdataflow

martinmaillard commented 2 years ago

Just a note about the solution described in the README: I don't think it's valid to put sh:path on a sh:NodeShape (https://www.w3.org/TR/shacl/#syntax-rule-path-node).
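For reference, a minimal sketch of how that rule could be checked over a shapes graph represented as plain (subject, predicate, object) tuples with prefixed names. The shapes below are hypothetical and do not reproduce the README; the check only covers shapes explicitly typed as sh:NodeShape.

```python
RDF_TYPE = "rdf:type"
NODE_SHAPE = "sh:NodeShape"
SH_PATH = "sh:path"

# Hypothetical shapes graph: ex:BadShape carries sh:path directly, while
# ex:GoodShape moves it to a property shape referenced through sh:property.
TRIPLES = [
    ("ex:BadShape", "rdf:type", "sh:NodeShape"),
    ("ex:BadShape", "sh:path", "ex:canton"),
    ("ex:GoodShape", "rdf:type", "sh:NodeShape"),
    ("ex:GoodShape", "sh:property", "_:p1"),
    ("_:p1", "sh:path", "ex:canton"),
]


def node_shapes_with_path(triples):
    """Return declared sh:NodeShape subjects that also have an sh:path value."""
    node_shapes = {s for s, p, o in triples if p == RDF_TYPE and o == NODE_SHAPE}
    offenders = {s for s, p, o in triples if p == SH_PATH and s in node_shapes}
    return sorted(offenders)


if __name__ == "__main__":
    print(node_shapes_with_path(TRIPLES))
```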