onthegomap / planetiler

Flexible tool to build planet-scale vector tilesets from OpenStreetMap data fast
Apache License 2.0

World landcover dataset #550

Open msbarry opened 1 year ago

msbarry commented 1 year ago

It would be nice to be able to generate a world landcover vector tileset. A few possible data sources:

Notes on how we might be able to generate:

Basically just gdal_sieve.py -> gdal_polygonize.py. At 500m resolution the data is reasonably small, so nothing fancy is needed. I did the smoothing in the Planetiler profile using Chaikin's algorithm.

Originally posted by @erik in https://github.com/onthegomap/planetiler/issues/505#issuecomment-1460949887
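
For readers unfamiliar with the smoothing step mentioned above, here is a minimal sketch of Chaikin's corner-cutting algorithm on a closed ring. It is not the code from the Planetiler profile referred to in the quote; the function name `chaikin_ring` and its parameters are assumptions made for illustration.

```python
def chaikin_ring(points, iterations=1):
    """Smooth a closed ring with Chaikin's corner cutting.

    `points` is a list of (x, y) tuples WITHOUT a repeated closing vertex.
    Each iteration replaces every edge (p0, p1) with two points located
    1/4 and 3/4 of the way along it, which cuts off every corner.
    """
    for _ in range(iterations):
        smoothed = []
        n = len(points)
        for i in range(n):
            x0, y0 = points[i]
            x1, y1 = points[(i + 1) % n]  # wrap around to close the ring
            smoothed.append((0.75 * x0 + 0.25 * x1, 0.75 * y0 + 0.25 * y1))
            smoothed.append((0.25 * x0 + 0.75 * x1, 0.25 * y0 + 0.75 * y1))
        points = smoothed
    return points


# A stair-stepped pixel outline loses its sharp corners after one pass.
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(chaikin_ring(square))
```

Each pass doubles the vertex count, so in practice one or two iterations followed by a simplification step is probably enough for tile output.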

wipfli commented 1 year ago

I am curious to see how sieve will perform. My worry is that if islands of forest, for example, get too small, the result will not show any forest even if, say, 50 percent of a given area is forest... Similarly for built-up areas, which are a bit fractal in nature too... But let's see!

erik commented 1 year ago

Islands of forests/urban areas (each smaller than the threshold size) should still work with this strategy in most cases. sieve will keep merging polygon areas with the largest neighbor until one is over the threshold size. See: https://github.com/OSGeo/gdal/blob/3f1d8bccf6a7023b83d671b6f4f1cbf26d953555/alg/gdalsievefilter.cpp#L457-L459
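
To make the merging behaviour described above concrete, here is a toy sieve on a small class grid. It is a simplified illustration, not GDAL's implementation: it uses 4-connectivity, relabels the whole grid after every merge, and always processes the smallest region first. The names `label_regions` and `sieve` are assumptions for this sketch.

```python
from collections import deque

NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # 4-connectivity


def label_regions(grid):
    """Label 4-connected regions of equal value; return (labels, sizes, values)."""
    rows, cols = len(grid), len(grid[0])
    labels = [[-1] * cols for _ in range(rows)]
    sizes, values = [], []
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue
            region = len(sizes)
            labels[r][c] = region
            queue, count = deque([(r, c)]), 0
            while queue:
                y, x = queue.popleft()
                count += 1
                for dy, dx in NEIGHBOURS:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and labels[ny][nx] == -1
                            and grid[ny][nx] == grid[r][c]):
                        labels[ny][nx] = region
                        queue.append((ny, nx))
            sizes.append(count)
            values.append(grid[r][c])
    return labels, sizes, values


def sieve(grid, threshold):
    """Merge regions smaller than `threshold` pixels into their largest neighbour."""
    grid = [row[:] for row in grid]
    rows, cols = len(grid), len(grid[0])
    while True:
        labels, sizes, values = label_regions(grid)
        small = [i for i, size in enumerate(sizes) if size < threshold]
        if not small or len(sizes) == 1:
            return grid
        target = min(small, key=lambda i: sizes[i])
        # Collect the regions that touch the target region.
        touching = set()
        for r in range(rows):
            for c in range(cols):
                if labels[r][c] != target:
                    continue
                for dy, dx in NEIGHBOURS:
                    ny, nx = r + dy, c + dx
                    if 0 <= ny < rows and 0 <= nx < cols and labels[ny][nx] != target:
                        touching.add(labels[ny][nx])
        largest = max(touching, key=lambda i: sizes[i])
        # Give the small region the value of its largest neighbour.
        for r in range(rows):
            for c in range(cols):
                if labels[r][c] == target:
                    grid[r][c] = values[largest]


grid = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 2],
    [0, 0, 2, 2],
]
print(sieve(grid, threshold=3))
```

In this example the single pixel of class 1 is absorbed by the surrounding class 0, while the three-pixel region of class 2 meets the threshold and is kept.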

wipfli commented 1 year ago

This is very badly explained and documented, but I was playing around a bit with using H3 cells to downsample high-resolution raster landcover data to low-resolution vector tiles.

image

On the left you have the forest landcover with H3, and on the right you have the forest landcover derived from OSM vector data. So not exactly the same methods, but anyway, I thought it would be fun to share here.

Update: The H3 landcover demo is now finished and looks like this:

image
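
The thread does not say how pixels were aggregated into H3 cells, so the following is only a sketch of one plausible rule: assign each coarse cell the most common class among the pixels that fall inside it. To stay self-contained it uses square blocks in place of hexagons; `cell_of` is a hypothetical stand-in for a real H3 index lookup, and `downsample_majority` is likewise an assumed name.

```python
from collections import Counter, defaultdict


def cell_of(row, col, cell_size):
    """Hypothetical stand-in for an H3 lookup: id of the square block holding a pixel."""
    return (row // cell_size, col // cell_size)


def downsample_majority(grid, cell_size):
    """Map each coarse cell id to the most common class among its pixels."""
    counts = defaultdict(Counter)
    for r, line in enumerate(grid):
        for c, value in enumerate(line):
            counts[cell_of(r, c, cell_size)][value] += 1
    return {cell: counter.most_common(1)[0][0] for cell, counter in counts.items()}


grid = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [2, 2, 0, 1],
    [2, 2, 1, 1],
]
print(downsample_majority(grid, cell_size=2))
```

Because every cell keeps its dominant class, an area that is mostly forest stays forest even when the individual forest patches are small, which is the concern raised earlier about sieving.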

boldtrn commented 1 year ago

Global landcover would be awesome! Something that immediately caught my attention here was that Australia appears all green, whereas in reality it should be mostly desert. I am not sure if this is an issue with the data?

msbarry commented 1 year ago

@wipfli would it be possible to run your same pipeline using this as an input: https://livingatlas.arcgis.com/landcover/ - it seems to have more reasonable values like desert in American southwest, and Australia. I believe this file will give you the whole dataset: https://lulctimeseries.blob.core.windows.net/lulctimeseriespublic/lc2022/lulc2022.zip

wipfli commented 1 year ago

Let me download that file...

wipfli commented 1 year ago

The lulc file looks like an amazing dataset with super high resolution (I think 10m per pixel). So this can effectively be used to show forests and even rivers on a map. What I am not sure about is whether it distinguishes dry regions from alpine regions. Both seem to be assigned the value 11:

Alps:

image

Spain and Maghreb:

image

wipfli commented 1 year ago

Maybe we need to combine landcover with climate zone data... https://zenodo.org/record/7324909#.ZFKLDtJBzOE

wipfli commented 1 year ago

The Sentinel-2 dataset has these classes (https://www.arcgis.com/home/item.html?id=fc92d38533d440078f17678ebc20e8e2):

1. Water

Areas where water was predominantly present throughout the year; may not cover areas with sporadic or ephemeral water; contains little to no sparse vegetation, no rock outcrop nor built up features like docks; examples: rivers, ponds, lakes, oceans, flooded salt plains.

2. Trees

Any significant clustering of tall (~15-m or higher) dense vegetation, typically with a closed or dense canopy; examples: wooded vegetation, clusters of dense tall vegetation within savannas, plantations, swamp or mangroves (dense/tall vegetation with ephemeral water or canopy too thick to detect water underneath).

4. Flooded vegetation

Areas of any type of vegetation with obvious intermixing of water throughout a majority of the year; seasonally flooded area that is a mix of grass/shrub/trees/bare ground; examples: flooded mangroves, emergent vegetation, rice paddies and other heavily irrigated and inundated agriculture.

5. Crops

Human planted/plotted cereals, grasses, and crops not at tree height; examples: corn, wheat, soy, fallow plots of structured land.

7. Built Area

Human made structures; major road and rail networks; large homogenous impervious surfaces including parking structures, office buildings and residential housing; examples: houses, dense villages / towns / cities, paved roads, asphalt.

8. Bare ground

Areas of rock or soil with very sparse to no vegetation for the entire year; large areas of sand and deserts with no to little vegetation; examples: exposed rock or soil, desert and sand dunes, dry salt flats/pans, dried lake beds, mines.

9. Snow/Ice

Large homogenous areas of permanent snow or ice, typically only in mountain areas or highest latitudes; examples: glaciers, permanent snowpack, snow fields.

10. Clouds

No land cover information due to persistent cloud cover.

11. Rangeland

Open areas covered in homogenous grasses with little to no taller vegetation; wild cereals and grasses with no obvious human plotting (i.e., not a plotted field); examples: natural meadows and fields with sparse to no tree cover, open savanna with few to no trees, parks/golf courses/lawns, pastures. Mix of small clusters of plants or single plants dispersed on a landscape that shows exposed soil or rock; scrub-filled clearings within dense forests that are clearly not taller than trees; examples: moderate to sparse cover of bushes, shrubs and tufts of grass, savannas with very sparse grasses, trees or other plants.

wipfli commented 1 year ago

Köppen climate classification seems to be the right key-word. Here is a 1 km resolution sample:

https://www.gloh2o.org/koppen/

image
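
As a rough sketch of the idea of combining landcover with climate zones, the snippet below splits rangeland pixels (value 11 in the class list above) by Köppen main group: B for arid climates and E for polar and alpine climates. It assumes both rasters are already aligned on the same grid and that the Köppen raster has been decoded to standard symbols such as "BWh" or "ET"; the output labels and the function name `refine_rangeland` are made up for illustration.

```python
RANGELAND = 11  # class value taken from the class list quoted in the thread


def refine_rangeland(landcover, koppen):
    """Return a grid of string labels, splitting rangeland by Köppen main group.

    `landcover` holds integer class values, `koppen` holds Köppen symbols
    (assumed to be pre-decoded strings) on the same grid.
    """
    refined = []
    for lc_row, zone_row in zip(landcover, koppen):
        row = []
        for lc, zone in zip(lc_row, zone_row):
            if lc != RANGELAND:
                row.append(str(lc))            # other classes pass through
            elif zone.startswith("B"):
                row.append("11_dry")           # hypothetical label: arid group
            elif zone.startswith("E"):
                row.append("11_alpine")        # hypothetical label: polar/alpine group
            else:
                row.append("11")
        refined.append(row)
    return refined


landcover = [[11, 11], [2, 11]]
koppen = [["BWh", "ET"], ["Cfb", "Cfb"]]
print(refine_rangeland(landcover, koppen))
```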

dBitech commented 1 year ago

I just did a quick processing run of a Sentinel-2 landcover dataset (2022).

I ran gdal_polygonize, then used GRASS v.generalize (Chaikin) and v.clean on the resulting dataset to remove the stepping artifacts. (I did all this from within QGIS to allow for quick observations.)

image

We see that this lines up cleanly with satellite imagery.

wipfli commented 1 year ago

I played around more with the Köppen dataset. I am preparing a demo now, but the main insight was that it is good to filter the raster images with a median filter before polygonizing them. That way, Douglas-Peucker has less work to do and creates fewer sharp-looking artifacts. I will share a link once I have the median filter demo...

Update: here is a link to the median filter demo: https://wipfli.github.io/median-filter-landcover

Source: https://github.com/wipfli/median-filter-landcover
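
For illustration, here is a minimal pure-Python median filter over a class grid. It is not taken from the linked repository; the function name `median_filter`, the square window, and the edge handling (clamping to the nearest valid pixel) are assumptions of this sketch.

```python
def median_filter(grid, radius=1):
    """Replace each pixel with the median of its (2*radius+1)^2 neighbourhood.

    Pixels outside the grid are substituted by the nearest edge pixel.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = []
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    rr = min(max(r + dr, 0), rows - 1)  # clamp row index
                    cc = min(max(c + dc, 0), cols - 1)  # clamp column index
                    window.append(grid[rr][cc])
            window.sort()
            out[r][c] = window[len(window) // 2]  # odd window size: middle element
    return out


# A lone pixel of class 1 disappears, so it never becomes a tiny polygon.
speckle = [[0] * 5 for _ in range(5)]
speckle[2][2] = 1
print(median_filter(speckle))
```

Isolated pixels vanish while straight boundaries between large regions stay where they are. For class codes that have no meaningful order, a majority (mode) filter may be a closer fit than a median.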

wipfli commented 1 year ago

I wrote down some notes about some challenges of making a world landcover vector tileset: https://gist.github.com/wipfli/6386f21fa19af978dc9e1f136a754024

I hope one day we can get there...

boldtrn commented 1 year ago

Looking at the post from @wipfli, I am wondering a bit about the usage. The example from the Köppen dataset seems to be almost perfect for z0-z6. Wouldn't you normally switch to OSM somewhere around z6-z8 anyway? You mention upscaling for z14 and above, and I am wondering if this level of detail is even needed?

rene-mueller commented 11 months ago

Maybe this repo helps: https://github.com/lukasmartinelli/naturalearthtiles or this: https://github.com/klokantech/naturalearthtiles

msbarry commented 11 months ago

There's also https://daylightmap.org/2023/11/27/urban.html - has anyone tried that yet?

wipfli commented 10 months ago

I think @bdon used the daylight landcover the other day.

I worked a bit more on the ESA Worldcover dataset. See https://github.com/wipfli/esa-worldcover-polygons

rene-mueller commented 10 months ago

@wipfli wow the esa worldcover is amazing. Thank you for that work. Is it possible to get the same tiles in mbtiles format?

wipfli commented 10 months ago

Thank you. Yes I think you can just change the output format of planetiler to mbtiles.

rene-mueller commented 10 months ago

@wipfli I have seen that maptiler offers a Landcover tileset. According to the copyright, they also use the ESA LandCover but from 2017. Maybe this will help you to improve your script. https://data.maptiler.com/downloads/dataset/landcover/#0.45/25.9/56.8

rene-mueller commented 10 months ago

@msbarry I think the result of the https://github.com/wipfli/esa-worldcover-polygons repo is quite good and detailed. Once https://github.com/wipfli/esa-worldcover-polygons/pull/1 is merged, it could be integrated into the rendering process. What do you think about these plans?

msbarry commented 10 months ago

Yeah, I think a worldcover dataset with bathymetry added would end up looking really good at lower zoom levels. I'd like to eventually switch onthegomap from Natural Earth shaded relief raster tiles to vector tiles like that.

rene-mueller commented 10 months ago

@msbarry I have a solution for bathymetry. Unfortunately the conversion is faulty. Can you possibly take a look at this? https://github.com/wipfli/esa-worldcover-polygons/issues/4