owid / etl

A compute graph for loading and transforming OWID's data
https://docs.owid.io/projects/etl
MIT License

Serialise the output of `grapher://grapher` steps to disk #3608

Open larsyencken opened 15 hours ago

larsyencken commented 15 hours ago

Motivation

We would like our indicator and ETL APIs to be aligned. However, the flattening of dimensions causes a misalignment: nowhere in the ETL APIs can you get the same data that's in a chart.

Proposal

We should serialise and publish the generated data frames for grapher://grapher steps, as well as shipping that data to MySQL. This might need a new channel.
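
To make the proposal concrete, here is a rough sketch of what writing a step's output to disk could look like. It is only an illustration under assumptions: the channel name `grapher_export`, the helpers `output_path` and `serialise_step`, the CSV format, and the flattened column name in the usage note are all hypothetical, and none of this is the existing ETL code.

```python
import csv
from pathlib import Path


def output_path(step_uri: str, root: Path, channel: str = "grapher_export") -> Path:
    """Map a grapher:// step URI to a file under a hypothetical new channel."""
    scheme, sep, rest = step_uri.partition("://")
    if not sep or scheme != "grapher" or not rest:
        raise ValueError(f"expected a grapher:// step URI, got {step_uri!r}")
    # Assumption: the remainder of the URI is reused as the relative path.
    return root / channel / f"{rest}.csv"


def serialise_step(step_uri: str, rows: list[dict], root: Path) -> Path:
    """Write the already-flattened rows of a step to disk and return the path."""
    if not rows:
        raise ValueError("nothing to serialise")
    path = output_path(step_uri, root)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as f:
        # Column order follows the first row; all rows are assumed to share it.
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return path
```

For example, calling `serialise_step("grapher://grapher/demo/2024-01-01/example", [{"country": "France", "year": 2020, "deaths__sex_female": 10}], Path("data"))` would write `data/grapher_export/grapher/demo/2024-01-01/example.csv`, with the dimension already flattened into the column name. The MySQL upload would happen separately from the same rows.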

larsyencken commented 15 hours ago

/cc @danyx23 @Marigold @pabloarosado

pabloarosado commented 15 hours ago

Thanks @larsyencken, just a quick note: By construction, the reason data://grapher steps exist (and that we don't publish garden steps directly) is to adapt our curated data to the needs of our grapher tool. So, if data://grapher steps actually differ from grapher://grapher steps, then they are not really fulfilling that promise, and are therefore a bit misleading (or redundant). The data team should not need to be aware of the technical difference between data://grapher and grapher://grapher steps. In fact, we usually just speak of "grapher steps". However, with the current implementation, data managers need to know that, for some reason, mdim steps depend on grapher://grapher steps (instead of the usual data://grapher). In my view, the ideal solution would be to have just one kind of grapher step (data://grapher), with grapher://grapher steps being simply an implementation helper that data managers don't even need to be aware of.