open-numbers / ddf--gapminder--systema_globalis

Gapminder's fact-base with local & global statistics

Missing drillup and drilldown in dimensions.csv TIME #11

Closed: jheeffer closed this issue 8 years ago

jheeffer commented 8 years ago

If I understand DDF correctly we're missing:

1. drillup from world_4region to global
2. drilldown from world_4region to country
3. drilldown from global to country
4. drilldown from global to un_state

olarosling commented 8 years ago

Nr 2, "drilldown from world_4region to country", exists in ddf--list--geo--country.csv. The global to region I will add. Global to country & un_state I'll remove, as it can be "derived" (inherited) down through the regions. Every geo shouldn't have to be explicitly tagged as being a child of the world.
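To illustrate what "derived" could mean here, a minimal Python sketch follows. It assumes each entity has at most one drillup parent; the `DRILLUP` mapping and the `ancestors` helper are hypothetical names made up for this example, and the entity codes are hand-written rather than read from the dataset.

```python
# Hypothetical, hand-written drillup links (child -> parent).
# These are illustrative only and not loaded from any DDF file.
DRILLUP = {
    "swe": "europe",    # country -> world_4region
    "nga": "africa",    # country -> world_4region
    "europe": "world",  # world_4region -> global
    "africa": "world",  # world_4region -> global
}


def ancestors(entity, drillup=DRILLUP):
    """Follow drillup links upward and return every level reached, nearest first."""
    chain = []
    while entity in drillup:
        entity = drillup[entity]
        chain.append(entity)
    return chain


print(ancestors("swe"))
```

With links like these, a country reaches the global level through its region, so no explicit country-to-world link needs to be stored.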

jheeffer commented 8 years ago

In ddf--list--geo--country.csv there's only a drillup to region, right? I'm talking about dimensions.csv: there's no drilldown from region to country specified there. But you said before you might want to remove drilldown completely, because if there's a drillup one way there's a drilldown the other way.
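The point that a drillup one way implies a drilldown the other way can be sketched in Python as below. This is only an illustration: `invert_drillup` is a hypothetical helper, and the sample links are made up rather than taken from dimensions.csv.

```python
from collections import defaultdict


def invert_drillup(drillup):
    """Turn child -> parent drillup links into parent -> [children] drilldown links."""
    drilldown = defaultdict(list)
    for child, parent in drillup.items():
        drilldown[parent].append(child)
    # Sort the children so the result does not depend on input order.
    return {parent: sorted(children) for parent, children in drilldown.items()}


# Hypothetical sample links (country -> world_4region).
sample = {"swe": "europe", "nor": "europe", "nga": "africa"}
print(invert_drillup(sample))
```

If drilldown can always be computed this way, storing it explicitly would be redundant.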

jheeffer commented 8 years ago

And if drilling can be inherited/derived, how does that work for time? Days should drill up to week. Probably also to month, because weeks do not align perfectly with month boundaries. But days shouldn't have to drill up to quarters or years, right? They're linked by derivation.

Also, how does a week drill up to month? By analogy with the ISO definition for week -> year (the first week of the year is the week containing the year's first Thursday), would the first week of the month be the week containing the month's first Thursday? If so, then with days drilling up to week and then by derivation to month, you get inconsistent drillups:

- 2 January 2016 is in week 53 of 2015
- week 53 of 2015 is in December 2015
- 2 January 2016 is in December 2015 (day -> week -> month)
- 2 January 2016 is in January 2016 (day -> month)
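The inconsistency can be reproduced with Python's standard `datetime` module. This is a sketch under one assumption: a week is assigned to the month containing its Thursday, which is the week -> month rule guessed at above, not something ISO 8601 itself defines. `iso_week_thursday` is a hypothetical helper name.

```python
from datetime import date, timedelta


def iso_week_thursday(d):
    """Return the Thursday of the ISO week (Monday to Sunday) containing d."""
    return d + timedelta(days=4 - d.isoweekday())


d = date(2016, 1, 2)

# ISO year and week number of the day (day -> week).
iso_year, iso_week, _ = d.isocalendar()

# Assumed rule: the week belongs to the month of its Thursday (week -> month).
thursday = iso_week_thursday(d)
via_week = (thursday.year, thursday.month)

# Direct calendar month of the day (day -> month).
direct = (d.year, d.month)

print(iso_year, iso_week)  # 2015 53
print(via_week)            # (2015, 12)
print(direct)              # (2016, 1)
```

The two paths disagree for this date, so a derived day -> week -> month drillup cannot simply replace the direct day -> month one.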

Not sure how to fix this yet. Days should be able to drill to weeks. Week should be able to drill to months. But the derivation is where things get awkward.

olarosling commented 8 years ago

I suggest we have a canonical dimension type called "universal_time_interval" which is specialized for this VERY common problem, and we avoid defining it as a dataset of its own.

jheeffer commented 8 years ago

Still missing drilldown from region to country in dimensions.csv. Now there's only un_state.

olarosling commented 8 years ago

I see!