owid / etl

A compute graph for loading and transforming OWID's data
https://docs.owid.io/projects/etl
MIT License

wizard: improve new data workflow #2580

Open lucasrodes opened 1 month ago

lucasrodes commented 1 month ago

One-liner

Improvements to the new data update workflow.

Context & details

At a high level, the new workflow is:

  1. Staging preparation
    • Create branch, PR
    • Data work
    • Push changes
  2. Indicator upgrader
    • User maps indicators to their new versions (as much as possible should be automated).
    • Charts are updated using the new indicators.
  3. Chart diff: ongoing review of changes in charts
    • Show all charts that have changed in staging. Changes can be due to: (i) indicator changes, (ii) changes in chart config (from admin), (iii) changes in indicator values (data work in ETL) and (iv) changes in indicator metadata (data work in ETL).
    • When all charts are approved, owidbot should signal this in its comment in the PR
    • Once all looks fine, merge PR.
  4. After PR merge
    • Charts are automatically updated in production.
    • If there is any conflict, a manual review is needed. Maybe use old chart approval here?
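As a rough illustration of the "all charts approved" signal in step 3, here is a minimal sketch of how a PR comment line could be derived from per-chart review states. The `ReviewState` values and the `summarize_reviews` helper are hypothetical names for illustration only, not the actual owidbot or chart diff implementation.

```python
from collections import Counter
from enum import Enum


class ReviewState(Enum):
    # Hypothetical review states; the real tool may track different ones.
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


def summarize_reviews(states: dict[int, ReviewState]) -> str:
    """Build a one-line status from a mapping of chart id -> review state."""
    total = len(states)
    if total == 0:
        return "No chart changes detected."
    counts = Counter(states.values())
    approved = counts[ReviewState.APPROVED]
    if approved == total:
        return f"All {total} charts approved."
    pending = counts[ReviewState.PENDING]
    rejected = counts[ReviewState.REJECTED]
    return f"{approved}/{total} charts approved, {pending} pending, {rejected} rejected."
```

A bot comment could then include `summarize_reviews(...)` and treat only the "All ... approved" case as ready to merge.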

Note that there can be data work at any point between steps 1 and 2, hence chart-diff should also reflect chart changes due to

Future work

Indicator Upgrade

Chart diff

Docs

Detected bugs

PR list

pabloarosado commented 1 month ago

This plan sounds good! I just created a PR with a tool I was playing around with to automatically detect the grapher dataset mapping old -> new based on your current branch. I think some of that logic can be implemented in indicator upgrader (I'm happy to do that myself down the line).
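To make the old -> new mapping idea concrete, here is a minimal sketch, not the logic of the PR mentioned above. It assumes dataset paths of the form `grapher/<namespace>/<version>/<short_name>` with versions that sort lexicographically (for example ISO dates); the path layout, the function names and the example paths are all assumptions for illustration.

```python
from collections import defaultdict


def split_path(path: str) -> tuple[str, str, str]:
    """Split 'grapher/<namespace>/<version>/<short_name>' into (prefix, version, name)."""
    channel, namespace, version, short_name = path.split("/")
    return f"{channel}/{namespace}", version, short_name


def map_old_to_new(existing: list[str], added: list[str]) -> dict[str, str]:
    """Map the latest older version of each dataset to the version added in the branch."""
    by_key: dict[tuple[str, str], list[tuple[str, str]]] = defaultdict(list)
    for path in existing:
        prefix, version, name = split_path(path)
        by_key[(prefix, name)].append((version, path))

    mapping: dict[str, str] = {}
    for path in added:
        prefix, version, name = split_path(path)
        # Only consider versions strictly older than the newly added one.
        older = [(v, p) for v, p in by_key.get((prefix, name), []) if v < version]
        if older:
            mapping[max(older)[1]] = path
    return mapping


# Hypothetical example paths.
existing = ["grapher/demo/2022-01-01/example", "grapher/demo/2023-01-01/example"]
added = ["grapher/demo/2024-01-01/example"]
mapping = map_old_to_new(existing, added)
```

Datasets with no older version are simply left out of the mapping, so they would still need to be mapped by hand.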

Also, some other part of the logic of that tool (finding all possible files affected by your local work) could also be part of chart diff, but that's just nice to have (for now chart diff should rely on chart config differences).
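For reference, detecting chart config differences can be sketched as a recursive dict comparison. This is only an illustration under the assumption that configs are plain nested dicts; `diff_config` and the `MISSING` marker are hypothetical names, not chart diff's actual code.

```python
MISSING = "<missing>"  # marker for a key that is absent on one side


def diff_config(old: dict, new: dict, prefix: str = "") -> dict[str, tuple]:
    """Return {dotted.key.path: (old_value, new_value)} for every differing field."""
    changes: dict[str, tuple] = {}
    for key in sorted(set(old) | set(new)):
        path = f"{prefix}.{key}" if prefix else key
        a, b = old.get(key, MISSING), new.get(key, MISSING)
        if isinstance(a, dict) and isinstance(b, dict):
            # Recurse so that nested changes are reported by their full path.
            changes.update(diff_config(a, b, path))
        elif a != b:
            changes[path] = (a, b)
    return changes


# Hypothetical configs for illustration.
production = {"title": "GDP", "map": {"time": 2020}, "hasMapTab": True}
staging = {"title": "GDP per capita", "map": {"time": 2022}, "hasMapTab": True, "note": "x"}
changes = diff_config(production, staging)
```

An empty result means the two configs are identical, so the chart would not need to be shown for review on config grounds.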

Marigold commented 1 month ago

I just finished an update of WDI with about 500 charts. It was pretty smooth, but not perfect. Some observations:

pabloarosado commented 1 month ago

Minor thing: I think it would be convenient to show if a chart is a draft in chart diff.

Marigold commented 3 weeks ago

Two nice-to-have ideas from doing a large GBD review:

Is there an easier way? Maybe we could have a "reset" button for a chart that would reset config from production, but keep variable ids. Not sure how hard it would be to implement.
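A "reset" along these lines could be sketched as follows. This assumes that a chart config stores its indicators in a `dimensions` list whose entries carry a `variableId`, and that production and staging list their dimensions in the same order; both are assumptions, and `reset_keep_variable_ids` is a hypothetical helper.

```python
import copy


def reset_keep_variable_ids(production: dict, staging: dict) -> dict:
    """Return the production config with the variable ids taken from staging."""
    prod_dims = production.get("dimensions", [])
    staging_dims = staging.get("dimensions", [])
    if len(prod_dims) != len(staging_dims):
        # Position-wise matching is not safe here; leave it to a manual review.
        raise ValueError("dimension counts differ; manual review needed")

    reset = copy.deepcopy(production)  # do not mutate the production config
    for dim, staging_dim in zip(reset.get("dimensions", []), staging_dims):
        dim["variableId"] = staging_dim["variableId"]
    return reset


# Hypothetical configs for illustration.
production = {"title": "A", "dimensions": [{"property": "y", "variableId": 1}]}
staging = {"title": "A (edited)", "dimensions": [{"property": "y", "variableId": 2}]}
reset = reset_keep_variable_ids(production, staging)
```

Any other field that references a variable id would need the same treatment, which this sketch does not attempt.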

paarriagadap commented 3 weeks ago

Regarding the first point @Marigold made above, an alternative implementation could be to use the test page, in the format http://staging-site-{branch}/admin/test/embeds?ids={chart-id}&comparisonUrl=https%3A%2F%2Fourworldindata.org
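A small helper for building that URL, following the format quoted above. The function name and the example branch and chart id are made up for illustration.

```python
from urllib.parse import urlencode


def comparison_url(branch: str, chart_id: int, comparison: str = "https://ourworldindata.org") -> str:
    """Build the staging test-page URL that embeds a chart next to its production version."""
    # urlencode percent-encodes the comparison URL (':' -> %3A, '/' -> %2F).
    query = urlencode({"ids": chart_id, "comparisonUrl": comparison})
    return f"http://staging-site-{branch}/admin/test/embeds?{query}"


url = comparison_url("my-branch", 123)
```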

I used that in a PIP issue: [image]

lucasrodes commented 3 weeks ago

[this list has been integrated into main issue list]

more points: