travishathaway / osmprj

OpenStreetMap Project Manager 👷‍♀️👷‍♂️ 🌎
GNU General Public License v3.0
15 stars 2 forks source link

Add support for Global Human Settle Layer datasets #1

Open travishathaway opened 2 years ago

travishathaway commented 2 years ago

The global human settlement layer (GHSL) offers a global data set for population densities. Combining this with OSM data could provide for some pretty interesting analysis. I am creating this issue as a way to document how I go about adding support for this data in this project.

Acceptance Criteria

travishathaway commented 2 years ago

Here's the method I've come up so far for this (written as a bash script):

# Download the whole data set first
curl -O -L https://cidportal.jrc.ec.europa.eu/ftp/jrc-opendata/GHSL/GHS_POP_MT_GLOBE_R2019A/GHS_POP_E2015_GLOBE_R2019A_54009_250/V1-0/GHS_POP_E2015_GLOBE_R2019A_54009_250_V1_0.zip

# Unzip to preferred location

# Reproject data set to EPSG:3857
gdalwarp \
  -t_srs EPSG:3857 \
  GHS_POP_E2015_GLOBE_R2019A_54009_250_V1_0.tif \
  GHS_POP_tmp.tif

# Compress; the temp file we just created is about 45GB!!!
gdal_translate \
  -co compress=lzw \
  GHS_POP_tmp.tif \ 
  GHS_POP_E2015_GLOBE_R2019A_54009_250_V1_0-3857.tif

rm GHS_POP_tmp.tif

# Extract based on a country polygon
gdalwarp \ 
    -of GTiff \
    -cutline ../natural_earth_datasets/germany.geojson \
    -crop_to_cutline GHS_POP_E2015_GLOBE_R2019A_54009_250_V1_0-3857.tif \
    germany_population.tiff 

# Import into PostgreSQL
# This step can also be piped directly to the psql command
raster2pgsql \
  -c \
  -s 3857 \
  -t auto \
  -I \
  -M \
  germany_population.tiff \
  public.pop_data > germany_pop.sql

At this point, we are ready to begin querying our database.

The only thing that I don't like about the above solution is that when I reproject to a different SRS, the data set balloons to 45GB!!! (from only being about 500MB). I have no idea why this is happening, but it seems like the solution is just to compress it again and delete the temp file.