owid / etl

A compute graph for loading and transforming OWID's data
https://docs.owid.io/projects/etl
MIT License
58 stars 18 forks

🔨 load_dependency: remove #2884

Open lucasrodes opened 6 days ago

lucasrodes commented 6 days ago

Tasks

Other additions

Notes

To replace load_dependency with load_dataset

To replace load_dependency with load_snapshot

Additional notes

owidbot commented 6 days ago
Quick links (staging server): Site | Admin | Wizard

Login: ssh owid@staging-site-load-dependency

chart-diff: ✅ No charts for review.
data-diff: ❌ Found differences

```diff
= Dataset garden/biodiversity/2024-01-25/cherry_blossom
    = Table cherry_blossom
= Dataset garden/growth/2024-05-16/gdp_historical
    = Table gdp_historical
= Dataset garden/neglected_tropical_diseases/2024-05-18/funding
    = Table funding_disease_product
    = Table funding_product_ntd
    = Table funding_product
    = Table funding_disease
= Dataset garden/tuberculosis/2023-11-27/budget
    = Table budget
= Dataset garden/tuberculosis/2023-11-27/burden_disaggregated
    = Table burden_disaggregated_rate
        ~ Column best_rate (changed data)
            ~ Changed values: 33 / 7458 (0.44%)
                country                        year  age_group  sex  risk_factor  best_rate -  best_rate +
                Upper-middle-income countries  2022  0-14       m    all            18.588425    18.472746
                Upper-middle-income countries  2022  15plus     m    all            97.678978    97.466286
                Upper-middle-income countries  2022  35-44      m    all            83.870865    83.765709
                Upper-middle-income countries  2022  45-54      m    all            98.181702    97.823776
                Upper-middle-income countries  2022  55-64      a    all            84.095490    83.838737
        ~ Column high_rate (changed data)
            ~ Changed values: 33 / 7458 (0.44%)
                country                        year  age_group  sex  risk_factor  high_rate -  high_rate +
                Upper-middle-income countries  2022  0-14       m    all            25.394789    25.240131
                Upper-middle-income countries  2022  15plus     m    all           129.974899   129.779236
                Upper-middle-income countries  2022  35-44      m    all           158.373505   158.145920
                Upper-middle-income countries  2022  45-54      m    all           171.940369   171.372421
                Upper-middle-income countries  2022  55-64      a    all           142.691849   142.320328
        ~ Column low_rate (changed data)
            ~ Changed values: 33 / 7458 (0.44%)
                country                        year  age_group  sex  risk_factor  low_rate -  low_rate +
                Upper-middle-income countries  2022  0-14       m    all           11.675552   11.600473
                Upper-middle-income countries  2022  15plus     m    all           65.720535   65.546577
                Upper-middle-income countries  2022  35-44      m    all           21.176729   21.071686
                Upper-middle-income countries  2022  45-54      m    all           32.337032   32.125237
                Upper-middle-income countries  2022  55-64      a    all           30.276695   30.073187
    = Table burden_disaggregated
        ~ Column best (changed data)
            ~ Changed values: 41 / 20181 (0.20%)
                country                        year  age_group  sex  risk_factor   best -   best +
                Upper-middle-income countries  2022  0-4        f    all            20717    20877
                Upper-middle-income countries  2022  15plus     a    all          1491815  1503815
                Upper-middle-income countries  2022  45-54      f    all            75889    76529
                Upper-middle-income countries  2022  5-14       m    all            23228    23488
                Upper-middle-income countries  2022  65plus     m    all           220788   221588
        ~ Column hi (changed data)
            ~ Changed values: 41 / 20181 (0.20%)
                country                        year  age_group  sex  risk_factor     hi -     hi +
                Upper-middle-income countries  2022  0-4        f    all            39579    39869
                Upper-middle-income countries  2022  15plus     a    all          1863271  1879271
                Upper-middle-income countries  2022  45-54      f    all           138068   139268
                Upper-middle-income countries  2022  5-14       m    all            42779    43249
                Upper-middle-income countries  2022  65plus     m    all           355687   357187
        ~ Column lo (changed data)
            ~ Changed values: 41 / 20181 (0.20%)
                country                        year  age_group  sex  risk_factor     lo -     lo +
                Upper-middle-income countries  2022  0-4        f    all             5029     5051
                Upper-middle-income countries  2022  15plus     a    all          1122830  1131730
                Upper-middle-income countries  2022  45-54      f    all            22301    22391
                Upper-middle-income countries  2022  5-14       m    all             6860     6896
                Upper-middle-income countries  2022  65plus     m    all            91530    91640
= Dataset garden/tuberculosis/2023-11-27/burden_estimates
    = Table burden_estimates
= Dataset garden/tuberculosis/2023-11-27/drug_resistance_surveillance
    = Table drug_resistance_surveillance
= Dataset garden/tuberculosis/2023-11-27/laboratories
    = Table laboratories
= Dataset garden/tuberculosis/2023-11-27/notifications
    = Table notifications
= Dataset garden/tuberculosis/2023-11-27/outcomes_disagg
    = Table outcomes_disagg
= Dataset garden/un/2024-03-14/un_wpp_most
    = Table population_5_year_age_groups
    = Table population_10_year_age_groups
= Dataset garden/wash/2024-01-06/who
    = Table who

Legend: +New  ~Modified  -Removed  =Identical  Details
Hint: Run this locally with etl diff REMOTE data/ --include yourdataset --verbose --snippet
```

Automatically updated datasets matching _weekly_wildfires|excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk_ are not included

Edited: 2024-06-28 16:32:54 UTC
Execution time: 20.97 seconds

lucasrodes commented 4 days ago

@spoonerf As I am removing deprecated code, I found that this step is failing:

garden/tuberculosis/2023-11-27/burden_disaggregated

due to duplicate indices. Is this expected?
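For anyone reproducing this, here is a minimal sketch of how duplicate index entries could be spotted. It assumes the table is indexed by the columns shown in the data-diff (country, year, age_group, sex, risk_factor), which the thread does not confirm. The helper name `find_duplicate_index_keys` and the rows are hypothetical, made up for illustration, and it uses only the standard library rather than the ETL's own tooling.

```python
from collections import Counter

# Assumed index columns, inferred from the column headers in the data-diff output.
INDEX_COLUMNS = ("country", "year", "age_group", "sex", "risk_factor")


def find_duplicate_index_keys(rows):
    """Return the index tuples that occur more than once, with their counts."""
    counts = Counter(tuple(row[col] for col in INDEX_COLUMNS) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical rows, invented for illustration only (not taken from the dataset).
rows = [
    {"country": "A", "year": 2022, "age_group": "0-14", "sex": "m", "risk_factor": "all", "best": 1},
    {"country": "A", "year": 2022, "age_group": "0-14", "sex": "m", "risk_factor": "all", "best": 2},
    {"country": "B", "year": 2022, "age_group": "0-14", "sex": "f", "risk_factor": "all", "best": 3},
]

duplicates = find_duplicate_index_keys(rows)
print(duplicates)
```

If the table is loaded as a pandas DataFrame with that index set, `df.index.duplicated(keep=False)` should give an equivalent boolean mask of the offending rows.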

spoonerf commented 3 days ago

@lucasrodes - not expected, I'll try and take a look today.