DataBrewery / cubes

[NOT MAINTAINED] Light-weight Python OLAP framework for multi-dimensional data analysis
http://cubes.databrewery.org

consider support for W3C Data Cube (QB) RDF #442

Open VladimirAlexiev opened 7 years ago

VladimirAlexiev commented 7 years ago

Is anyone else interested in Cubes support for the W3C Data Cube (QB) RDF data model https://www.w3.org/TR/vocab-data-cube/? I am particularly interested in supporting W3C Cube in CubesViewer, but as https://github.com/jjmontesl/cubesviewer/issues/70 shows, the preference is to add such support in Cubes, rather than the presentation layer (CubesViewer).

Cheers!

jjmontesl commented 5 years ago

Are you still interested in this? I'm thinking of adding multiple backend support to CubesViewer.

VladimirAlexiev commented 5 years ago

Still interested, and we may be able to contribute some development through the BigDataGrapes project. It's about agricultural observations data that will use both QB and Geospatial components (similar to QB for Earth Observation, https://www.w3.org/TR/eo-qb/).

VladimirAlexiev commented 5 years ago

@jjmontesl wrote: I have similar projects: ETL tools and stuff built to import Spanish and Eurostat data. It'd be nice to talk about that at some point.

QB has been widely used to represent statistical data. Excerpt from a report I wrote: QB incorporates an OLAP data model and statistical classifications following SDMX. There are a number of statistical datasets available as RDF, including:

It lists two QB viewers, and then CubesViewer, which is much more powerful than those.

VladimirAlexiev commented 5 years ago

QB https://www.w3.org/TR/vocab-data-cube/ includes powerful OLAP metadata called qb:DataStructureDefinition and qb:SliceKey. The overall structure of the ontology is this:

VladimirAlexiev commented 5 years ago

@jjmontesl wrote: thinking of adding multiple backend support to CubesViewer

From the other discussion it seems that Cubes has multiple backend support but maybe it's not the best way forward? Seems you've already determined that the better engineering approach is to add such support to CubesViewer instead?

jjmontesl commented 5 years ago

The reason I mentioned that is that CubesViewer needs multiple backend support anyway, as I'd like to add a local CSV/tabular backend in any case, and perhaps MDX in the future. From that point of view, it eases the path to integrating W3C Cube RDF.

But I'm not quite sure which is the better engineering approach. For one thing, I'm not sure I understand exactly what you have in mind in terms of supporting QB. I have no experience with it and I'm not even sure what the possibilities are (I understand CubesViewer can consume W3C Cube schemas, but I'm not sure how the data is published/consumed).

VladimirAlexiev commented 5 years ago

@jjmontesl http://estatwrap.ontologycentral.com/page/ei_bsco_m is an example RDF QB dataset. At the bottom there are links for viewing it as a table, and downloading it as:

If you look at the data, it consists of a bunch of nodes like this:

[ a                      qb:Observation ;
  estat:geo              <dic/geo#SE> ;
  estat:indic            <dic/indic#BS-PT-NY> ;
  estat:s_adj            <dic/s_adj#SA> ;
  estat:unit             <dic/unit#BAL> ;
  dcterms:date           "2016-11" ;
  qb:dataSet             <id/ei_bsco_m#ds> ;
  sdmx-measure:obsValue  27.7
] .

All dimensions and the measure are present in each observation. The attributes freq, obsStatus and timeformat, on the other hand, are optional and are not present here.
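
To make the shape of the data more concrete, here is a minimal sketch of how one such observation could be flattened into a row of the kind an OLAP fact table holds. It is only an illustration under stated assumptions: the observation is written by hand as a Python dict rather than parsed from RDF, the component lists are copied from the example above instead of being read from the qb:DataStructureDefinition, and `column_name`, `code_of` and `observation_to_row` are hypothetical helper names, not part of Cubes or CubesViewer.

```python
# Component lists read by hand from the example observation (an assumption;
# a real importer would take them from the dataset's qb:DataStructureDefinition).
DIMENSIONS = ["estat:geo", "estat:indic", "estat:s_adj", "estat:unit", "dcterms:date"]
MEASURE = "sdmx-measure:obsValue"
# Bare names as given in the text; their full property names are not shown there.
OPTIONAL_ATTRIBUTES = ["freq", "obsStatus", "timeformat"]


def column_name(prefixed):
    """Drop the namespace prefix: 'estat:geo' becomes 'geo'."""
    return prefixed.split(":", 1)[-1]


def code_of(value):
    """Reduce a code-list reference like '<dic/geo#SE>' to 'SE'; pass literals through."""
    if isinstance(value, str) and value.startswith("<") and value.endswith(">"):
        return value[1:-1].rsplit("#", 1)[-1]
    return value


def observation_to_row(obs):
    """Flatten one observation (property -> value) into a flat row dict."""
    # Every dimension and the measure must be present in each observation.
    missing = [p for p in DIMENSIONS + [MEASURE] if p not in obs]
    if missing:
        raise ValueError("observation lacks required components: %s" % missing)

    row = {column_name(p): code_of(obs[p]) for p in DIMENSIONS}
    row[column_name(MEASURE)] = obs[MEASURE]
    # Optional attributes become None when the observation omits them.
    for attr in OPTIONAL_ATTRIBUTES:
        row[attr] = code_of(obs.get(attr))
    return row


# The example observation from above, transcribed by hand.
observation = {
    "estat:geo": "<dic/geo#SE>",
    "estat:indic": "<dic/indic#BS-PT-NY>",
    "estat:s_adj": "<dic/s_adj#SA>",
    "estat:unit": "<dic/unit#BAL>",
    "dcterms:date": "2016-11",
    "qb:dataSet": "<id/ei_bsco_m#ds>",
    "sdmx-measure:obsValue": 27.7,
}

row = observation_to_row(observation)
print(row)
```

The qb:dataSet link is deliberately left out of the row, since it identifies the cube as a whole rather than a cell coordinate. An observation that lacks a dimension or the measure raises an error, while missing attributes are simply left empty.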