owid / etl

A compute graph for loading and transforming OWID's data
https://docs.owid.io/projects/etl
MIT License

:bug: fix metadata checksum #2797

Closed Marigold closed 3 weeks ago

Marigold commented 3 weeks ago

The metadata checksum should be as consistent as possible. Some fields, like updatedAt, can change without having any effect on the metadata, so they should be excluded from the checksum calculation.

This PR adds a couple more problematic fields to the set excluded from the checksum.
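The idea can be sketched as follows. This is only an illustration, not the code in this PR: the function name, the `VOLATILE_FIELDS` set and the sample metadata are hypothetical, and `updatedAt` is the only excluded field actually named above.

```python
import hashlib
import json

# Hypothetical exclusion set; `updatedAt` is the only field named in the description.
VOLATILE_FIELDS = frozenset({"updatedAt"})


def metadata_checksum(metadata: dict, exclude=VOLATILE_FIELDS) -> str:
    """Hash metadata after dropping top-level fields that change without affecting meaning."""
    stable = {k: v for k, v in metadata.items() if k not in exclude}
    # sort_keys makes the serialization independent of key order
    payload = json.dumps(stable, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Same metadata, different timestamp and key order: the checksum stays the same.
a = {"name": "Population", "unit": "people", "updatedAt": "2024-06-01T00:00:00"}
b = {"unit": "people", "name": "Population", "updatedAt": "2024-06-12T13:43:56"}
assert metadata_checksum(a) == metadata_checksum(b)
```

A change to any field outside the exclusion set (for example `unit`) still produces a different checksum.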

TODO after merging:

owidbot commented 3 weeks ago
Quick links (staging server): Site | Admin | Wizard

Login: ssh owid@staging-site-fix-metadata-checksum

chart-diff: ✅ No charts for review.
data-diff: ✅ No differences found

```diff
Legend: +New  ~Modified  -Removed  =Identical  Details
Hint: Run this locally with etl diff REMOTE data/ --include yourdataset --verbose --snippet
```

Automatically updated datasets matching _weekly_wildfires|excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk_ are not included

Edited: 2024-06-12 13:43:56 UTC Execution time: 12.64 seconds

lucasrodes commented 3 weeks ago

> dimensions - these are not causing problems, but are already captured in dataChecksum

Would they? For instance, if we use indicator upgrader and change the use of an indicator for another, an ID in dimensions will be changed. This should be captured as a 'config change'. I just wonder: would such changes be captured by both the checksum and the config?

Marigold commented 3 weeks ago

Sorry, I don't think I got it, but let me try anyway :).

> For instance, if we use indicator upgrader and change the use of an indicator for another, an ID in dimensions will be changed.

If you change an indicator (or did you mean entity?), then the grapher config will change, which is what matters most. dimensions are computed directly from the data. So if you change an entity name and dimensions are part of the metadata checksum, the change shows up as both a data change and a metadata change.
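A toy illustration of the double reporting, under heavy assumptions: `derive_dimensions`, the row layout and the entity names are hypothetical stand-ins, not the actual ETL code.

```python
import hashlib
import json


def checksum(obj) -> str:
    """Deterministic hash of a JSON-serializable object."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


def derive_dimensions(rows: list) -> dict:
    # Hypothetical stand-in: dimensions are derived purely from the data.
    return {
        "entities": sorted({r["entity"] for r in rows}),
        "years": sorted({r["year"] for r in rows}),
    }


def checksums(rows: list, metadata: dict, include_dimensions: bool) -> tuple:
    """Return (data checksum, metadata checksum)."""
    meta = dict(metadata)
    if include_dimensions:
        meta["dimensions"] = derive_dimensions(rows)
    return checksum(rows), checksum(meta)


def changed(before: list, after: list, metadata: dict, include_dimensions: bool) -> tuple:
    """Return (data changed?, metadata changed?) for an update from `before` to `after`."""
    d0, m0 = checksums(before, metadata, include_dimensions)
    d1, m1 = checksums(after, metadata, include_dimensions)
    return d0 != d1, m0 != m1


# Rename one entity; the metadata itself is untouched.
before = [{"entity": "Czech Republic", "year": 2020, "value": 1.0}]
after = [{"entity": "Czechia", "year": 2020, "value": 1.0}]
meta = {"name": "Example indicator"}

print(changed(before, after, meta, include_dimensions=True))   # (True, True)
print(changed(before, after, meta, include_dimensions=False))  # (True, False)
```

With dimensions left out of the metadata checksum, the rename is reported once, as a data change.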

lucasrodes commented 3 weeks ago

ah, nvm, I got confused. I was thinking of chart.config.dimensions! All clear sir!