Swirrl / cubiql

CubiQL: A GraphQL service for querying multidimensional Linked Data Cubes
Eclipse Public License 1.0

Aggregations should separate "ALL" values #63

Open zeginis opened 6 years ago

zeginis commented 6 years ago

For example, dataset_poverty has the dimension POPULATION_GROUP with the values CHILDREN, PENSIONERS, WORKING_AGE_ADULTS and ALL.

When computing aggregations, the ALL value should not be included; otherwise the result will be incorrect. For example, the following SUM will be incorrect:

{
  dataset_poverty {
    observations(dimensions: {reference_period: "2010"}) {
      aggregations { sum(measure: COUNT) }
    }
  }
}

RickMoynihan commented 6 years ago

I think this is the expected behaviour.

The only way I can see to resolve this is to model the overlap of dimensions and target that.

Do you know of any vocabularies that do this?

zeginis commented 6 years ago

In this case the aggregation is wrong. For example, if male + female = total, summing over all three values returns 2 * total.

This is discussed at Challenge 9.3 of the application profile (https://islab-uom.github.io/qbBestPractices/#definingCodeLists). A solution could be to use a hierarchy and define total values at the top of the hierarchy, but this is still an open issue.
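The double counting described above can be illustrated with a minimal Python sketch. The counts are made-up numbers, and `naive_sum` and `sum_excluding` are hypothetical helpers written for illustration; none of this is CubiQL's actual implementation.

```python
# Hypothetical observations for one reference period (made-up counts).
# The ALL row is, by construction, the sum of the other three rows.
observations = [
    {"population_group": "CHILDREN", "count": 10},
    {"population_group": "PENSIONERS", "count": 20},
    {"population_group": "WORKING_AGE_ADULTS", "count": 30},
    {"population_group": "ALL", "count": 60},
]


def naive_sum(rows):
    """Sum the measure over every row, including the ALL row."""
    return sum(row["count"] for row in rows)


def sum_excluding(rows, dimension, total_value):
    """Sum the measure while skipping rows whose dimension value is the total."""
    return sum(row["count"] for row in rows if row[dimension] != total_value)


naive = naive_sum(observations)
correct = sum_excluding(observations, "population_group", "ALL")
print(naive, correct)
```

With these numbers the naive sum is twice the filtered sum, because the ALL row repeats what the three disjoint groups already contribute.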

RickMoynihan commented 6 years ago

Yes I understand the problem; it's just not really CubiQL's job to ensure you don't ask stupid questions and get stupid answers.

It's a modelling problem, which is why I'm wondering if there are any ontologies that let you describe MECE-style (mutually exclusive, collectively exhaustive) properties of dimensions etc. If we can describe this stuff then we can in principle support it.

There are likely other similar issues, such as letting people sum ratios.

zeginis commented 6 years ago

Another thing that came to mind is the aggregation of hierarchical data. In this case CubiQL should not mix all the hierarchical levels together.
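One way to picture the hierarchy concern is to aggregate each level separately instead of summing across levels. The sketch below assumes every observation carries a hypothetical `level` field giving its depth in the hierarchy; the area names and counts are invented for illustration.

```python
from collections import defaultdict

# Hypothetical observations: one country (level 0) split into two regions (level 1).
rows = [
    {"area": "Country", "level": 0, "count": 100},
    {"area": "Region A", "level": 1, "count": 60},
    {"area": "Region B", "level": 1, "count": 40},
]


def sum_by_level(observations):
    """Return one total per hierarchy level so that levels are never mixed."""
    totals = defaultdict(int)
    for row in observations:
        totals[row["level"]] += row["count"]
    return dict(totals)


level_totals = sum_by_level(rows)
mixed_total = sum(row["count"] for row in rows)
print(level_totals, mixed_total)
```

Each level on its own adds up to the same total here, whereas summing all rows together counts the country once directly and once again through its regions.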

RickMoynihan commented 6 years ago

Yes, that's another bad one. Again, it's a modelling/vocabulary issue though. Are we planning to propose a vocabulary for representing these things? It feels like we need something like owl:disjointClass, e.g. an appprofile:disjointSummable.
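If such a declaration existed, an aggregator could consult it before summing. The sketch below is purely illustrative: `DISJOINT_SUMMABLE` is a hypothetical in-memory stand-in for metadata that no existing vocabulary is confirmed to provide, and `safe_sum` is an assumed helper, not part of CubiQL.

```python
# Hypothetical metadata: for each dimension, the values declared pairwise
# disjoint and therefore safe to add together. Stand-in for a vocabulary term.
DISJOINT_SUMMABLE = {
    "population_group": {"CHILDREN", "PENSIONERS", "WORKING_AGE_ADULTS"},
}

# Made-up observations, including an overlapping ALL row.
rows = [
    {"population_group": "CHILDREN", "count": 10},
    {"population_group": "PENSIONERS", "count": 20},
    {"population_group": "WORKING_AGE_ADULTS", "count": 30},
    {"population_group": "ALL", "count": 60},
]


def safe_sum(observations, dimension, measure):
    """Sum only values declared disjoint; refuse if nothing is declared."""
    summable = DISJOINT_SUMMABLE.get(dimension)
    if summable is None:
        raise ValueError(f"no disjointness declared for dimension {dimension!r}")
    return sum(row[measure] for row in observations if row[dimension] in summable)


total = safe_sum(rows, "population_group", "count")
print(total)
```

Refusing to aggregate when no declaration is present is one possible design choice; silently falling back to a plain sum would be another, at the cost of reintroducing the double counting.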