owid / etl

A compute graph for loading and transforming OWID's data
https://docs.owid.io/projects/etl
MIT License

📊 : Add UNSD subregions to regions dataset #3507

Closed antea04 closed 3 weeks ago

antea04 commented 3 weeks ago

Adds the geographic subregions defined by the UN Statistics Division (UNSD) to our regions dataset.

I'm going to use this data to calculate an indicator for migration within and between subregions for Simon's upcoming article, but I anticipate that these subregions will be useful for other indicators as well.
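As a rough illustration of the within/between split, here is a minimal sketch. It assumes flows are available as (origin, destination, migrants) tuples keyed by ISO alpha-3 codes; the member lists are copied from the `members` column in the data-diff for two of the new subregions, while `classify_flows` and the flow numbers are hypothetical and made up for the example, not part of this PR.

```python
# Member lists copied from the data-diff for two of the new UNSD subregions.
SUBREGION_MEMBERS = {
    "UNSD_CAM": ["BLZ", "CRI", "SLV", "GTM", "HND", "MEX", "NIC", "PAN"],
    "UNSD_SAF": ["BWA", "LSO", "NAM", "ZAF", "SWZ"],
}

# Invert the mapping: country code -> subregion code.
COUNTRY_TO_SUBREGION = {
    country: region
    for region, members in SUBREGION_MEMBERS.items()
    for country in members
}


def classify_flows(flows):
    """Sum migrants moving within a subregion vs. between subregions.

    `flows` is an iterable of (origin, destination, migrants) tuples.
    Flows touching a country outside the known subregions are skipped.
    (Hypothetical helper, for illustration only.)
    """
    totals = {"within": 0, "between": 0}
    for origin, destination, migrants in flows:
        origin_region = COUNTRY_TO_SUBREGION.get(origin)
        destination_region = COUNTRY_TO_SUBREGION.get(destination)
        if origin_region is None or destination_region is None:
            continue
        key = "within" if origin_region == destination_region else "between"
        totals[key] += migrants
    return totals


# Made-up flow numbers, purely illustrative.
example_flows = [
    ("GTM", "MEX", 100),  # both in UNSD_CAM -> within
    ("ZAF", "BWA", 40),   # both in UNSD_SAF -> within
    ("MEX", "ZAF", 5),    # UNSD_CAM to UNSD_SAF -> between
    ("FRA", "MEX", 7),    # FRA is not in either list -> skipped
]
result = classify_flows(example_flows)
```

A real indicator would use all 22 subregions from the regions table and would need to decide how to treat countries that belong to none of them.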

Reference to the discussion in Slack.

owidbot commented 3 weeks ago
Quick links (staging server): Site | Admin | Wizard | Docs

Login: ssh owid@staging-site-unsd-subregions

chart-diff: ✅ No charts for review.
data-diff: ❌ Found differences

```diff
= Dataset garden/regions/2023-01-01/regions
  = Table regions
    ~ Dim code
+       + New values: 22 / 334 (6.59%)
          code
          UNSD_CAM
          UNSD_NAF
          UNSD_POL
          UNSD_SAF
          UNSD_WAS
    ~ Column aliases (new data)
+       + New values: 22 / 334 (6.59%)
          code      aliases
          UNSD_CAM  NaN
          UNSD_NAF  NaN
          UNSD_POL  NaN
          UNSD_SAF  NaN
          UNSD_WAS  NaN
    ~ Column cow_code (new data)
+       + New values: 22 / 334 (6.59%)
          code      cow_code
          UNSD_CAM
          UNSD_NAF
          UNSD_POL
          UNSD_SAF
          UNSD_WAS
    ~ Column cow_letter (new data)
+       + New values: 22 / 334 (6.59%)
          code      cow_letter
          UNSD_CAM  NaN
          UNSD_NAF  NaN
          UNSD_POL  NaN
          UNSD_SAF  NaN
          UNSD_WAS  NaN
    ~ Column defined_by (new data)
+       + New values: 22 / 334 (6.59%)
          code      defined_by
          UNSD_CAM  unsd
          UNSD_NAF  unsd
          UNSD_POL  unsd
          UNSD_SAF  unsd
          UNSD_WAS  unsd
    ~ Column end_year (new data)
+       + New values: 22 / 334 (6.59%)
          code      end_year
          UNSD_CAM
          UNSD_NAF
          UNSD_POL
          UNSD_SAF
          UNSD_WAS
    ~ Column imf_code (new data)
+       + New values: 22 / 334 (6.59%)
          code      imf_code
          UNSD_CAM
          UNSD_NAF
          UNSD_POL
          UNSD_SAF
          UNSD_WAS
    ~ Column is_historical (new data)
+       + New values: 22 / 334 (6.59%)
          code      is_historical
          UNSD_CAM  False
          UNSD_NAF  False
          UNSD_POL  False
          UNSD_SAF  False
          UNSD_WAS  False
    ~ Column iso_alpha2 (new data)
+       + New values: 22 / 334 (6.59%)
          code      iso_alpha2
          UNSD_CAM  NaN
          UNSD_NAF  NaN
          UNSD_POL  NaN
          UNSD_SAF  NaN
          UNSD_WAS  NaN
    ~ Column iso_alpha3 (new data)
+       + New values: 22 / 334 (6.59%)
          code      iso_alpha3
          UNSD_CAM  NaN
          UNSD_NAF  NaN
          UNSD_POL  NaN
          UNSD_SAF  NaN
          UNSD_WAS  NaN
    ~ Column kansas_code (new data)
+       + New values: 22 / 334 (6.59%)
          code      kansas_code
          UNSD_CAM  NaN
          UNSD_NAF  NaN
          UNSD_POL  NaN
          UNSD_SAF  NaN
          UNSD_WAS  NaN
    ~ Column legacy_country_id (new data)
+       + New values: 22 / 334 (6.59%)
          code      legacy_country_id
          UNSD_CAM
          UNSD_NAF
          UNSD_POL
          UNSD_SAF
          UNSD_WAS
    ~ Column legacy_entity_id (new data)
+       + New values: 22 / 334 (6.59%)
          code      legacy_entity_id
          UNSD_CAM
          UNSD_NAF
          UNSD_POL
          UNSD_SAF
          UNSD_WAS
    ~ Column marc_code (new data)
+       + New values: 22 / 334 (6.59%)
          code      marc_code
          UNSD_CAM  NaN
          UNSD_NAF  NaN
          UNSD_POL  NaN
          UNSD_SAF  NaN
          UNSD_WAS  NaN
    ~ Column members (new data)
+       + New values: 22 / 334 (6.59%)
          code      members
          UNSD_CAM  ["BLZ", "CRI", "SLV", "GTM", "HND", "MEX", "NIC", "PAN"]
          UNSD_NAF  ["DZA", "EGY", "LBY", "MAR", "SDN", "TUN", "ESH"]
          UNSD_POL  ["ASM", "COK", "PYF", "NIU", "PCN", "WSM", "TKL", "TON", "TUV", "WLF"]
          UNSD_SAF  ["BWA", "LSO", "NAM", "ZAF", "SWZ"]
          UNSD_WAS  ["ARM", "AZE", "BHR", "CYP", "GEO", "IRQ", "ISR", "JOR", "KWT", "LBN", "OMN", "QAT", "SAU", "SYR", "TUR", "ARE", "YEM"]
    ~ Column name (new data)
+       + New values: 22 / 334 (6.59%)
          code      name
          UNSD_CAM  Central America (UNSD)
          UNSD_NAF  Northern Africa (UNSD)
          UNSD_POL  Polynesia (UNSD)
          UNSD_SAF  Southern Africa (UNSD)
          UNSD_WAS  Western Asia (UNSD)
    ~ Column ncd_code (new data)
+       + New values: 22 / 334 (6.59%)
          code      ncd_code
          UNSD_CAM  NaN
          UNSD_NAF  NaN
          UNSD_POL  NaN
          UNSD_SAF  NaN
          UNSD_WAS  NaN
    ~ Column penn_code (new data)
+       + New values: 22 / 334 (6.59%)
          code      penn_code
          UNSD_CAM  NaN
          UNSD_NAF  NaN
          UNSD_POL  NaN
          UNSD_SAF  NaN
          UNSD_WAS  NaN
    ~ Column region_type (new data)
+       + New values: 22 / 334 (6.59%)
          code      region_type
          UNSD_CAM  aggregate
          UNSD_NAF  aggregate
          UNSD_POL  aggregate
          UNSD_SAF  aggregate
          UNSD_WAS  aggregate
    ~ Column related (new data)
+       + New values: 22 / 334 (6.59%)
          code      related
          UNSD_CAM  NaN
          UNSD_NAF  NaN
          UNSD_POL  NaN
          UNSD_SAF  NaN
          UNSD_WAS  NaN
    ~ Column short_name (new data, changed data)
+       + New values: 22 / 334 (6.59%)
          code      short_name
          UNSD_CAM  Central America (UNSD)
          UNSD_NAF  Northern Africa (UNSD)
          UNSD_POL  Polynesia (UNSD)
          UNSD_SAF  Southern Africa (UNSD)
          UNSD_WAS  Western Asia (UNSD)
        ~ Changed values: 2 / 334 (0.60%)
          code  short_name -              short_name +
          BIH   Bosnia and Herzegovina    Bosnia and Herz.
          TCA   Turks and Caicos Islands  Turks and Caicos
    ~ Column successors (new data)
+       + New values: 22 / 334 (6.59%)
          code      successors
          UNSD_CAM  NaN
          UNSD_NAF  NaN
          UNSD_POL  NaN
          UNSD_SAF  NaN
          UNSD_WAS  NaN
    ~ Column unctad_code (new data)
+       + New values: 22 / 334 (6.59%)
          code      unctad_code
          UNSD_CAM  NaN
          UNSD_NAF  NaN
          UNSD_POL  NaN
          UNSD_SAF  NaN
          UNSD_WAS  NaN
    ~ Column wikidata_code (new data)
+       + New values: 22 / 334 (6.59%)
          code      wikidata_code
          UNSD_CAM  NaN
          UNSD_NAF  NaN
          UNSD_POL  NaN
          UNSD_SAF  NaN
          UNSD_WAS  NaN

Legend: +New  ~Modified  -Removed  =Identical

Hint: Run this locally with etl diff REMOTE data/ --include yourdataset --verbose --snippet
```

Automatically updated datasets matching _weekly_wildfires|excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk_ are not included

Edited: 2024-11-06 18:09:27 UTC Execution time: 894.19 seconds