OpenGovIntelligence / json-qb

A JSON API for accessing data in RDF Data Cube format

Definition of dimension values #14

Open zeginis opened 7 years ago

zeginis commented 7 years ago

The structure of the table defines "all_dimension_values".

In my opinion we need the dimension values only for the free dimensions. Why do we need the dimension values of the locked dimensions?

If we include only the values of the free dimensions, they could be merged with the headings. The headings would then contain values like:

"headings": {"refArea": {"S12000005": {"@id": "http://statistics.gov.scot/id/statistical-geography/S12000005", "label":"Clackmannanshire"}, "S12000042": {"@id": "http://statistics.gov.scot/id/statistical-geography/S12000042", "label":"Dundee City"},....}

RickMoynihan commented 7 years ago

Hi @zeginis good questions... keep em coming :-)

The reason to provide dimension values for locked dimensions is so you can generate an interface like this during a drill down, where the application guides you into always ending at a valid slice table.

There is definitely some overlap in purpose between headings and free_dimensions; however, the information in headings is really there to express the sort order for the headers, as defined in the RDF by something like ui:sortPriority.

Basically, HashMaps or Objects in JavaScript and other languages are usually unordered by default, so it's necessary to use Arrays to guarantee an ordering that is independent of any implementation.
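As a rough illustration of that point, here is a minimal Python sketch. The payload shape, the `dimension_values` and `headings` field names, and the `sort_priority` field are assumptions made up for this example, not the actual json-qb format. The map gives lookup by key, while the array carries the order.

```python
import json

# Hypothetical payload: the field names and sort_priority values are
# assumptions for illustration, not part of the json-qb format.
payload = json.loads("""
{
  "dimension_values": {
    "refArea": {
      "S12000042": {"label": "Dundee City", "sort_priority": 2},
      "S12000005": {"label": "Clackmannanshire", "sort_priority": 1}
    }
  }
}
""")


def ordered_headings(values):
    """Return the keys of `values` as a list sorted by sort_priority."""
    return sorted(values, key=lambda key: values[key]["sort_priority"])


# The array fixes the header order explicitly, so a client does not have
# to rely on how its JSON parser happens to order object keys.
headings = {"refArea": ordered_headings(payload["dimension_values"]["refArea"])}
print(json.dumps(headings))
```

A client can then walk `headings["refArea"]` in order and treat the object purely as a lookup table.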

zeginis commented 7 years ago

Hi @RickMoynihan,

RickMoynihan commented 7 years ago

Yes, we might well push the dimension-values data out onto a different API route.

Regarding syntax for headings, that is another alternative and one I did consider. Largely this stuff is down to personal preference, with no clear reason to prefer one over the other. My reasoning for preferring just listing the keys was that I think it makes the intention more obvious, and you can just look up the definitions elsewhere. This does partly depend on API services generating good unique keys for the maps, from URI slugs etc...

Another reason to prefer listing just the keys though is that you can save some duplication, if we're also listing dimension-values (either under a separate object or route).
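To make the keys-only option concrete, here is a small Python sketch. The `headings` and `dimension_values` shapes and the `resolve_headings` helper are hypothetical, chosen for illustration; only the two geography URIs and labels come from the example earlier in this thread. Each definition is stored once and the headings refer to it by key.

```python
# Hypothetical shapes: definitions live once under dimension_values,
# and headings only list keys in display order.
dimension_values = {
    "refArea": {
        "S12000005": {
            "@id": "http://statistics.gov.scot/id/statistical-geography/S12000005",
            "label": "Clackmannanshire",
        },
        "S12000042": {
            "@id": "http://statistics.gov.scot/id/statistical-geography/S12000042",
            "label": "Dundee City",
        },
    }
}
headings = {"refArea": ["S12000005", "S12000042"]}


def resolve_headings(headings, dimension_values):
    """Replace each heading key with its full definition, keeping order."""
    return {
        dimension: [dimension_values[dimension][key] for key in keys]
        for dimension, keys in headings.items()
    }


resolved = resolve_headings(headings, dimension_values)
labels = [value["label"] for value in resolved["refArea"]]
print(labels)
```

The lookup only works if every key in `headings` also appears in `dimension_values`, which is why good unique keys matter here.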

I agree that dimension values might not all be included due to pagination. We're assuming, however, that the API will return the complete set of values for one axis (i.e. we're only planning to paginate in one dimension).

We were also thinking the other day that we should completely remove blank "rows" from the paginated axis so we can roll up tables. That would save people from having to page through hundreds of blank pages before finding the actual data at the bottom.
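One possible way to picture that roll-up, as a Python sketch under assumptions: the `cells` layout (row key mapped to a list of cell values, with `None` for a missing observation) and both helper functions are hypothetical, not how any json-qb implementation necessarily works. Blank rows are filtered out before the axis is split into pages.

```python
def drop_blank_rows(row_keys, cells):
    """Keep only the row keys that have at least one non-empty cell."""
    return [
        key for key in row_keys
        if any(value is not None for value in cells.get(key, []))
    ]


def paginate(items, page_size):
    """Split `items` into consecutive pages of at most `page_size` items."""
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]


# Hypothetical table: only r3 and r5 hold observations.
row_keys = ["r1", "r2", "r3", "r4", "r5"]
cells = {
    "r1": [None, None],
    "r2": [None, None],
    "r3": [1.5, None],
    "r4": [None, None],
    "r5": [None, 2.0],
}

pages_with_blanks = paginate(row_keys, 2)
pages_rolled_up = paginate(drop_blank_rows(row_keys, cells), 2)
print(len(pages_with_blanks), len(pages_rolled_up))
```

In this toy case the client gets one page containing both populated rows instead of three pages that are mostly empty.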