Closed JinIgarashi closed 2 weeks ago
Here is an update on where the zonal stats work ended up since you left. Joseph tried to generate zonal stats using the approach you described in the GitHub repository (https://github.com/UNDP-Data/geo-cellular-automata/blob/main/hreaibm.md).
This approach estimates the electricity access rate from the number of pixels that have an electricity access rate above 80%. The percentage is calculated by dividing the number of pixels with electricity by the total number of pixels. However, we discovered that this approach does not give us better figures. We think pixel-level population data is needed to estimate a better access rate; that is what the old admin data did, calculating the electricity access rate from population. But we don't have population data for the forecast period (2021-2030), so we need to find a different method to estimate the future population and the percentage electrified.
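The gap between the two estimates can be illustrated with a small sketch. The pixel values, population counts and function names below are made up for illustration; only the 80% threshold comes from the approach described above.

```python
# Hypothetical data: each pixel has an electricity access value (0.0-1.0)
# and, where available, a population count.
THRESHOLD = 0.8  # a pixel counts as electrified above 80% access


def pixel_count_rate(access):
    """Share of pixels whose access value exceeds the threshold."""
    if not access:
        return 0.0
    electrified = sum(1 for a in access if a > THRESHOLD)
    return electrified / len(access)


def population_weighted_rate(access, population):
    """Share of people living in pixels above the threshold."""
    total = sum(population)
    if total == 0:
        return 0.0
    electrified = sum(p for a, p in zip(access, population) if a > THRESHOLD)
    return electrified / total


# One dense electrified pixel and three sparsely populated dark pixels.
access = [0.95, 0.10, 0.20, 0.05]
population = [9000, 300, 500, 200]

print(pixel_count_rate(access))                      # 0.25
print(population_weighted_rate(access, population))  # 0.9
```

With these made-up numbers, counting pixels reports 25% access while weighting by population reports 90%, which is why the pixel-only figures can be misleading where people are concentrated in a few pixels.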
In conclusion, it is not feasible to include the new zonal stats this week (or even by the end of next week). We can deploy the new electricity dashboard tomorrow without the new zonal stats (keeping the existing zonal stats as they are). In the future, we will find time to create zonal stats for 2021-2030.
Additionally, I want to let you know that there is a new population dataset managed by OCHA. It is different from WorldPop. It has global vector population data from 2020 to 2023. As you can see in the preview at the URL below, this data has hexagon polygons wherever there is a settlement, and each polygon has a population count. We may be able to intersect this population data with our 1 km aggregated raster to calculate the population of each pixel. For the population from 2024 onwards, we may use this data to estimate population growth for each polygon (maybe using the 2022 and 2023 data). We could then create global population datasets from 2024 to 2030 to get the electricity access rate.
The proposed steps to regenerate zonal stats for forecast electricity access (2021-2030) are:
Kontur population data is available from 2021 to 2023.
Each year's population data has hexagon polygons. We compute the number of people with electricity access and the number of people without electricity access for each polygon, and add them as `pop_hrea_2021` and `pop_no_hrea_2021` to the 2021 population geopackage (repeat the same process until 2023).
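The per-polygon calculation could look roughly like the sketch below. It assumes each hexagon already carries a mean access value from the raster; the `hrea_mean` field, the hexagon ids and the function name are hypothetical, and in practice the geometry work would be done with a geospatial library rather than plain dictionaries.

```python
def add_access_columns(hexagons, year):
    """Add pop_hrea_<year> and pop_no_hrea_<year> to each hexagon record.

    Each record is a dict with 'population' and 'hrea_mean', where
    'hrea_mean' (hypothetical name) is the mean access value (0.0-1.0)
    of the raster pixels covered by the hexagon.
    """
    for hexagon in hexagons:
        with_access = hexagon["population"] * hexagon["hrea_mean"]
        hexagon[f"pop_hrea_{year}"] = with_access
        hexagon[f"pop_no_hrea_{year}"] = hexagon["population"] - with_access
    return hexagons


# Made-up hexagons standing in for rows of the 2021 population geopackage.
hexagons_2021 = [
    {"id": "hex-a", "population": 1200, "hrea_mean": 0.75},
    {"id": "hex-b", "population": 400, "hrea_mean": 0.0},
]
add_access_columns(hexagons_2021, 2021)

for hexagon in hexagons_2021:
    print(hexagon["id"], hexagon["pop_hrea_2021"], hexagon["pop_no_hrea_2021"])
```

Calling the same function with 2022 and 2023 would cover the "repeat the same process until 2023" part.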
Current admin data is stored at the following blob storage.
These geopackages store columns for zonal stats from 2012 to 2020 like below:
To minimise our work, we reuse this existing admin data with stats as much as possible. In addition, we may not need admin 3 and 4, since computing data for those levels may take much longer.
In step 1, we added population data to the Kontur geopackage. Now we can use `exactextract` to compute the electricity access rate for each admin polygon from the step 1 result. `exactextract` is described in hreaibm.md.
After this step, `hrea_2021` to `hrea_2023`, `pop_hrea_2021` to `pop_hrea_2023` and `pop_no_hrea_2021` to `pop_no_hrea_2023` should be added to the downloaded admin data (from admin 0 to 4).
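As a rough sketch of the arithmetic behind these columns, the hexagon values from step 1 can be summed per admin polygon and turned into a rate. This is not the `exactextract` workflow itself: it assumes each hexagon has already been assigned to a single admin unit (the `admin_id` field is hypothetical) and that `hrea_<year>` is stored as a fraction between 0 and 1, which should be checked against the existing admin data.

```python
from collections import defaultdict


def admin_access_stats(hexagons, year):
    """Sum hexagon populations per admin unit and derive the access rate."""
    totals = defaultdict(lambda: [0.0, 0.0])  # admin_id -> [with, without]
    for hexagon in hexagons:
        totals[hexagon["admin_id"]][0] += hexagon[f"pop_hrea_{year}"]
        totals[hexagon["admin_id"]][1] += hexagon[f"pop_no_hrea_{year}"]

    stats = {}
    for admin_id, (with_access, without_access) in totals.items():
        population = with_access + without_access
        stats[admin_id] = {
            f"pop_hrea_{year}": with_access,
            f"pop_no_hrea_{year}": without_access,
            # No rate can be computed for an unpopulated admin unit.
            f"hrea_{year}": with_access / population if population else None,
        }
    return stats


# Made-up hexagons already tagged with a hypothetical admin unit id.
hexagons = [
    {"admin_id": "A", "pop_hrea_2021": 900.0, "pop_no_hrea_2021": 300.0},
    {"admin_id": "A", "pop_hrea_2021": 100.0, "pop_no_hrea_2021": 700.0},
    {"admin_id": "B", "pop_hrea_2021": 0.0, "pop_no_hrea_2021": 400.0},
]
stats_2021 = admin_access_stats(hexagons, 2021)
print(stats_2021["A"]["hrea_2021"])  # 0.5
```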
There is no future population data from 2024 onwards. We need to do this in several steps:
This method may be used for forecasting population.
Add `pop_2024` to `pop_2030` to the geopackage.
Using the following forecast data combined with the estimated population, compute the number of people with electricity access and the number of people without electricity access for each polygon.
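One very simple way to fill `pop_2024` to `pop_2030` is to carry the 2022 to 2023 growth ratio forward for each polygon. The sketch below is only that constant-growth assumption, not the method from the referenced forecasting work, and the function name and sample numbers are made up.

```python
def forecast_population(pop_2022, pop_2023, first_year=2024, last_year=2030):
    """Extrapolate population with a constant yearly growth ratio.

    Assumption: the 2022 -> 2023 ratio holds for every later year.
    """
    if pop_2022 <= 0:
        growth = 1.0  # no usable baseline, so hold the 2023 value flat
    else:
        growth = pop_2023 / pop_2022
    return {
        f"pop_{year}": pop_2023 * growth ** (year - 2023)
        for year in range(first_year, last_year + 1)
    }


# Made-up polygon growing by 2% between 2022 and 2023.
columns = forecast_population(1000, 1020)
for name, value in columns.items():
    print(name, round(value, 1))
```

Multiplying each forecast population by the forecast access value for the same polygon and year would then give the with and without electricity counts, in the same way as for 2021 to 2023.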
Repeat the process from step 3 to merge the 2024 to 2030 data into the current admin data.
to do from admin 0 to admin 4
to do from admin 0 to admin 4
Upload `fgb`, `gpkg` and `pmtiles` to blob storage.
To do from admin 0 to admin 4.
We can upload all files to the https://undpgeohub.blob.core.windows.net/hrea container.
If all column names follow the existing admin data, we can minimize changes on the frontend.
Zonal stats for the Kontur population datasets (output of step 1 above) are:
The 2020 data below is for comparison with the existing 2020 data.
Zonal stats approach for vectors using geopandas
Population forecast algorithms are described in these papers. Maybe one of them can be used:
Projecting 1 km-grid population distributions from 2020 to 2100 globally under shared socioeconomic pathways
Projecting a Gridded Population of the World Using Ratio Methods of Trend Extrapolation
Steps 1-3 are fixed by #3850
I have uploaded new tiles with an identical structure to https://undpgeohub.blob.core.windows.net/hrea/admin/forecast. The fgb files have also been uploaded.
Follow the steps of "Generate admin level zonal stats" at https://github.com/UNDP-Data/geo-cellular-automata/blob/main/hreaibm.md
Upload PMTiles to the hrea container of blob storage
https://github.com/UNDP-Data/geohub/blob/743d8716107ec82b6f3e03e9cf172f65f817bfb7/sites/geohub/src/routes/(map)/dashboards/electricity/utils/adminLayer.ts#L128-L151
The dashboard currently uses static pbf tiles. We might need to change quite a lot of code to switch from static pbf to PMTiles. We also need to check whether there are any changes to column names.
In the line chart component, some code uses data from admin. This code may need to be modified.
https://github.com/UNDP-Data/geohub/blob/b3edab4cc0f9b5c91a5fee201b4b39bb455f1648/sites/geohub/src/routes/(map)/dashboards/electricity/components/Charts.svelte#L146-L180