owid / etl

A compute graph for loading and transforming OWID's data
https://docs.owid.io/projects/etl
MIT License

:bar_chart: Reduce memory footprint of surface temperature #2866

Closed Marigold closed 6 days ago

Marigold commented 1 week ago

Use the smaller `float32` type to reduce the memory footprint. See the data-diff below for the resulting differences (they show up around the fifth or sixth decimal place). If this doesn't work, I guess we can move the entire `_load_data_array` step to the snapshot.
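For illustration, here is a minimal standard-library sketch of the trade-off described above: storing values as 32-bit floats halves the bytes per value at the cost of a small rounding error. The `values` list is a hypothetical stand-in, not the real surface temperature data, and this is not the code changed in this PR (the contents of `_load_data_array` are not shown here).

```python
from array import array
import random

# Hypothetical stand-in for the temperature values; the real data loaded by
# _load_data_array is not shown in this PR.
random.seed(0)
values = [random.uniform(-40.0, 40.0) for _ in range(100_000)]

as_float64 = array("d", values)  # C double: 8 bytes per value
as_float32 = array("f", values)  # C float: 4 bytes per value, rounded on store

bytes64 = as_float64.itemsize * len(as_float64)
bytes32 = as_float32.itemsize * len(as_float32)
print(f"float64: {bytes64} bytes, float32: {bytes32} bytes")

# Largest absolute error introduced by storing the values as 32-bit floats
max_abs_err = max(abs(a - b) for a, b in zip(as_float64, as_float32))
print(f"max abs rounding error: {max_abs_err:.2e}")
```

A single rounding step like this is tiny for values of this magnitude; differences in a processed dataset can be larger because rounding errors may accumulate through later arithmetic.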

owidbot commented 1 week ago
Quick links (staging server): Site Admin Wizard

Login: ssh owid@staging-site-surf-temp-mem

chart-diff: ✅ No charts for review.
data-diff: ❌ Found differences

```diff
= Dataset garden/climate/2023-12-20/surface_temperature
    = Table surface_temperature
        ~ Column anomaly_above_0 (changed data)
            ~ Changed values: 14976 / 197535 (7.58%)
                country                            time        anomaly_above_0 -  anomaly_above_0 +
                Morocco                            1969-03-15  0.080667           0.080668
                Laos                               2012-09-15  0.210035           0.210032
                Cuba                               1940-11-15  0.107277           0.107275
                Heard Island and McDonald Islands  1997-10-15  0.095267           0.095271
                American Samoa                     1992-09-15  0.261999           0.261990
        ~ Column anomaly_below_0 (changed data)
            ~ Changed values: 21601 / 197535 (10.94%)
                country                            time        anomaly_below_0 -  anomaly_below_0 +
                Chile                              2013-03-15  -0.164452          -0.164454
                Luxembourg                         1982-12-15  -0.207335          -0.207333
                Myanmar                            2005-11-15  -0.003511          -0.003508
                United States Virgin Islands       1942-07-15  -0.364576          -0.364595
                Comoros                            1984-11-15  -0.193054          -0.193058
        ~ Column temperature_2m (changed data)
            ~ Changed values: 2927 / 197535 (1.48%)
                country                            time        temperature_2m -  temperature_2m +
                Belarus                            1982-12-15  0.364818          0.364824
                Latvia                             1989-11-15  0.613666          0.613674
                Andorra                            2023-02-15  -0.239236         -0.239227
                Moldova                            1952-01-15  0.520697          0.520703
                Heard Island and McDonald Islands  2009-05-15  0.567243          0.567261
        ~ Column temperature_anomaly (changed data)
            ~ Changed values: 36577 / 197535 (18.52%)
                country                            time        temperature_anomaly -  temperature_anomaly +
                Palestine                          2009-08-15  0.065126               0.065123
                Somalia                            1988-10-15  -0.147762              -0.147758
                Tajikistan                         1946-06-15  0.057418               0.057419
                Thailand                           1973-05-15  -0.653219              -0.653229
                Dominican Republic                 2017-04-15  0.008352               0.008354

Legend: +New  ~Modified  -Removed  =Identical
Hint: Run this locally with etl diff REMOTE data/ --include yourdataset --verbose --snippet
```

Automatically updated datasets matching _weekly_wildfires|excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk_ are not included

Edited: 2024-06-21 08:27:22 UTC Execution time: 13.96 seconds

Marigold commented 6 days ago

@veronikasamborska1994 could you have a look when you have time, please? Is the reduced precision a problem, or is it ok?
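
One rough way to frame the precision question is to measure how far the sampled values in the data-diff moved and compare that against a tolerance. The sketch below uses only the (old, new) pairs printed in this thread. The `TOLERANCE` value is a hypothetical threshold chosen for illustration, not a project standard, and the sample is far too small to bound the error across the full dataset.

```python
# (old, new) pairs copied from the data-diff snippet in this thread
samples = {
    "anomaly_above_0": [
        (0.080667, 0.080668), (0.210035, 0.210032), (0.107277, 0.107275),
        (0.095267, 0.095271), (0.261999, 0.261990),
    ],
    "anomaly_below_0": [
        (-0.164452, -0.164454), (-0.207335, -0.207333), (-0.003511, -0.003508),
        (-0.364576, -0.364595), (-0.193054, -0.193058),
    ],
    "temperature_2m": [
        (0.364818, 0.364824), (0.613666, 0.613674), (-0.239236, -0.239227),
        (0.520697, 0.520703), (0.567243, 0.567261),
    ],
    "temperature_anomaly": [
        (0.065126, 0.065123), (-0.147762, -0.147758), (0.057418, 0.057419),
        (-0.653219, -0.653229), (0.008352, 0.008354),
    ],
}

# Hypothetical acceptance threshold; pick one that suits how the data is used
TOLERANCE = 1e-4

# Largest absolute change per column among the sampled rows
max_diffs = {
    column: max(abs(old - new) for old, new in pairs)
    for column, pairs in samples.items()
}

for column, max_diff in max_diffs.items():
    status = "within" if max_diff < TOLERANCE else "exceeds"
    print(f"{column}: max abs diff = {max_diff:.1e} ({status} tolerance)")
```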