openkfw / mapme.protectedareas

Reproducible workflows in R for processing open geodata to create knowledge about KfW supported protected areas and conservation effectiveness.
GNU General Public License v3.0

Create routine for Analyzing Copernicus Global Land Cover data #26

Closed Jo-Schie closed 3 years ago

Jo-Schie commented 3 years ago

I think we should focus on this first. @Ohm-Np: Can you start sketching a routine for how this data could be analyzed and what the output would be?

Ohm-Np commented 3 years ago

First, we need to decide which classes to use for our purpose. Here is a brief overview of all 23 classes in the Copernicus Land Cover data:

| Map Code | Land Cover Class | Definition |
|---|---|---|
| 0 | No input data available | - |
| 111 | Closed forest, evergreen needle leaf | Tree canopy >70 %, almost all needle leaf trees remain green all year. Canopy is never without green foliage. |
| 113 | Closed forest, deciduous needle leaf | Tree canopy >70 %, consists of seasonal needle leaf tree communities with an annual cycle of leaf-on and leaf-off periods. |
| 112 | Closed forest, evergreen, broad leaf | Tree canopy >70 %, almost all broadleaf trees remain green year round. Canopy is never without green foliage. |
| 114 | Closed forest, deciduous broad leaf | Tree canopy >70 %, consists of seasonal broadleaf tree communities with an annual cycle of leaf-on and leaf-off periods. |
| 115 | Closed forest, mixed | Closed forest, mix of types |
| 116 | Closed forest, unknown | Closed forest, not matching any of the other definitions |
| 121 | Open forest, evergreen needle leaf | Top layer: trees 15-70 %; second layer: mix of shrubs and grassland. Almost all needle leaf trees remain green all year. Canopy is never without green foliage. |
| 123 | Open forest, deciduous needle leaf | Top layer: trees 15-70 %; second layer: mix of shrubs and grassland. Consists of seasonal needle leaf tree communities with an annual cycle of leaf-on and leaf-off periods. |
| 122 | Open forest, evergreen broad leaf | Top layer: trees 15-70 %; second layer: mix of shrubs and grassland. Almost all broadleaf trees remain green year round. Canopy is never without green foliage. |
| 124 | Open forest, deciduous broad leaf | Top layer: trees 15-70 %; second layer: mix of shrubs and grassland. Consists of seasonal broadleaf tree communities with an annual cycle of leaf-on and leaf-off periods. |
| 125 | Open forest, mixed | Open forest, mix of types |
| 126 | Open forest, unknown | Open forest, not matching any of the other definitions |
| 20 | Shrubs | Woody perennial plants with persistent, woody stems and without any defined main stem, less than 5 m tall. The shrub foliage can be either evergreen or deciduous. |
| 30 | Herbaceous vegetation | Plants without persistent stem or shoots above ground and lacking definite firm structure. Tree and shrub cover is less than 10 %. |
| 90 | Herbaceous wetland | Lands with a permanent mixture of water and herbaceous or woody vegetation. The vegetation can be present in either salt, brackish, or fresh water. |
| 100 | Moss and lichen | Moss and lichen |
| 60 | Bare / sparse vegetation | Lands with exposed soil, sand, or rocks that never have more than 10 % vegetated cover at any time of the year. |
| 40 | Cultivated and managed vegetation/agriculture (cropland) | Lands covered with temporary crops followed by harvest and a bare soil period (e.g., single and multiple cropping systems). Note that perennial woody crops will be classified as the appropriate forest or shrub land cover type. |
| 50 | Urban / built up | Land covered by buildings and other manmade structures |
| 70 | Snow and Ice | Lands under snow or ice cover throughout the year. |
| 80 | Permanent water bodies | Lakes, reservoirs, and rivers. Can be either fresh or salt-water bodies. |
| 200 | Open sea | Oceans, seas. Can be either fresh or saltwater bodies. |

Further information can be found here.
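For illustration only, the class table can be kept as a small lookup so that raster codes are translated into readable names. This is a Python sketch, not the project's R code. The names `LC_CLASSES` and `simplified_group` are hypothetical, and collapsing the forest subclasses into two groups is just one possible simplification, not something the data provider prescribes.

```python
# Discrete class codes and names copied from the table above.
LC_CLASSES = {
    0: "No input data available",
    111: "Closed forest, evergreen needle leaf",
    113: "Closed forest, deciduous needle leaf",
    112: "Closed forest, evergreen, broad leaf",
    114: "Closed forest, deciduous broad leaf",
    115: "Closed forest, mixed",
    116: "Closed forest, unknown",
    121: "Open forest, evergreen needle leaf",
    123: "Open forest, deciduous needle leaf",
    122: "Open forest, evergreen broad leaf",
    124: "Open forest, deciduous broad leaf",
    125: "Open forest, mixed",
    126: "Open forest, unknown",
    20: "Shrubs",
    30: "Herbaceous vegetation",
    90: "Herbaceous wetland",
    100: "Moss and lichen",
    60: "Bare / sparse vegetation",
    40: "Cultivated and managed vegetation/agriculture (cropland)",
    50: "Urban / built up",
    70: "Snow and Ice",
    80: "Permanent water bodies",
    200: "Open sea",
}


def simplified_group(code):
    """Collapse the twelve forest subclasses into two groups (hypothetical grouping)."""
    if 111 <= code <= 116:
        return "Closed forest"
    if 121 <= code <= 126:
        return "Open forest"
    # All non-forest classes keep their own name.
    return LC_CLASSES[code]
```

With this lookup, `simplified_group(112)` and `simplified_group(116)` both fall into the same "Closed forest" group, while `simplified_group(20)` stays "Shrubs".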

Jo-Schie commented 3 years ago

Well, I would guess we need all categories, right? Or are there simplified ones? If not, then we need the area of every class per polygon.

Ohm-Np commented 3 years ago

Yes, there are simplified ones too, but for those only one raster can be downloaded at a time, one per class. The discrete classification is the best fit among them. The rasters are currently provided as gridded tiles (20 x 20 degrees) for the years 2015 to 2019.

Ohm-Np commented 3 years ago

The routine for computing the area of the different land cover classes for a single polygon, using lc_classes: Workflow

To process all protected areas (PAs), however, we can get the results for a particular year in long-table format by running the function lc_area_per_polygon.

Here is the script
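The linked script is not reproduced in this thread. As a rough idea of what a long-table result could look like, here is a minimal Python sketch (the project's routines are in R). It assumes the class codes of the pixels inside each polygon have already been extracted, and that every pixel covers the same area, which is only an approximation for a longitude/latitude grid. The function name `lc_area_long`, its parameters, and the column names are hypothetical.

```python
from collections import Counter


def lc_area_long(pa_pixels, pixel_area_sqkm, year):
    """Return one row per (polygon, land cover class) in long-table form.

    pa_pixels: dict mapping a WDPAID to the class codes of the pixels
               that fall inside that polygon (assumed already extracted).
    pixel_area_sqkm: area of one pixel, assumed constant (a simplification).
    """
    rows = []
    for wdpaid, codes in pa_pixels.items():
        counts = Counter(codes)  # number of pixels per class code
        for code in sorted(counts):
            rows.append({
                "WDPAID": wdpaid,
                "year": year,
                "class_code": code,
                "area_sqkm": counts[code] * pixel_area_sqkm,
            })
    return rows


# Toy input: the pixel values here are made up for illustration.
toy_pixels = {2221: [20, 20, 30, 126], 34004: [112, 112, 80]}
table = lc_area_long(toy_pixels, pixel_area_sqkm=0.01, year=2015)
```

Each polygon contributes as many rows as it has distinct classes, so the toy input above yields five rows (three for 2221, two for 34004).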

Ohm-Np commented 3 years ago

Comparison of results with DOPA: 2015

For WDPAID: 2221

| Variables | DOPA (sqkm) | KfW (sqkm) | diff (DOPA-KfW) |
|---|---|---|---|
| Area | 1351.23 | 1351.25 | -0.02 |
| Shrubs | 1045.75 | 700.80 | +344.95 |
| Herbaceous vegetation | 74.66 | 27.92 | +46.74 |
| Cropland | 12.89 | 9.79 | +3.10 |
| Closed forest, evergreen, broad leaf | 3.87 | 1.28 | +2.59 |
| Closed forest, deciduous broad leaf | 6.45 | 0.22 | +6.23 |
| Closed forest, unknown | 25.86 | 40.28 | -14.42 |
| Open forest, deciduous broad leaf | 0.16 | 0.08 | +0.08 |
| Open forest, unknown | 181.68 | 582.78 | -401.10 |
| SUM | 1351.32 | 1363.15 | - |


Also, for WDPAID: 34004

| Variables | DOPA (sqkm) | KfW (sqkm) | diff (DOPA-KfW) |
|---|---|---|---|
| Area | 32833.89 | 32833.89 | 0.00 |
| Shrubs | 453.06 | 419.65 | +33.41 |
| Herbaceous vegetation | 307.11 | 420.64 | -113.53 |
| Cropland | 12.89 | 1.52 | +11.37 |
| Bare/Sparse vegetation | 1.14 | 1.77 | -0.63 |
| Permanent Water bodies | 81.22 | 90.52 | -9.30 |
| Herbaceous Wetland | 11.98 | 23.86 | -11.88 |
| Closed forest, evergreen, broad leaf | 31170.62 | 31198.07 | -27.45 |
| Closed forest, deciduous broad leaf | 40.61 | 0.07 | +40.54 |
| Closed forest, unknown | 420.58 | 491.85 | -71.27 |
| Open forest, evergreen broad leaf | 10.79 | 18.78 | -7.99 |
| Open forest, unknown | 336.34 | 235.06 | +101.28 |
| SUM | 32846.34 | 32901.83 | - |

We see very large differences in some of the classes. Since DOPA and we use the same raster files, and even the polygon areas are the same, I don't know what could explain such large differences in the results.

The only thing I noticed that looks erroneous is that, in both cases (DOPA and KfW), the sum of the land cover class areas does not match the polygon area, although the DOPA sum is much closer to it. This might be due to a different choice of CRS, but that still doesn't explain the differences in the areas of individual land cover classes.
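To make the check above concrete, the following Python sketch recomputes the per-class differences and the gap between the class sums and the polygon area for WDPAID 2221. All numbers are copied from the table above; the variable names are hypothetical and this is not part of the project's R routines.

```python
# Values for WDPAID 2221, copied from the comparison table:
# class name -> (DOPA sqkm, KfW sqkm)
polygon_area = {"DOPA": 1351.23, "KfW": 1351.25}
classes = {
    "Shrubs": (1045.75, 700.80),
    "Herbaceous vegetation": (74.66, 27.92),
    "Cropland": (12.89, 9.79),
    "Closed forest, evergreen, broad leaf": (3.87, 1.28),
    "Closed forest, deciduous broad leaf": (6.45, 0.22),
    "Closed forest, unknown": (25.86, 40.28),
    "Open forest, deciduous broad leaf": (0.16, 0.08),
    "Open forest, unknown": (181.68, 582.78),
}

# Per-class difference, DOPA minus KfW, rounded like the table.
diffs = {name: round(d - k, 2) for name, (d, k) in classes.items()}

# Sum of class areas per source, and how far each sum overshoots the polygon area.
dopa_sum = round(sum(d for d, _ in classes.values()), 2)
kfw_sum = round(sum(k for _, k in classes.values()), 2)
excess = {
    "DOPA": round(dopa_sum - polygon_area["DOPA"], 2),
    "KfW": round(kfw_sum - polygon_area["KfW"], 2),
}

print(dopa_sum, kfw_sum, excess)
```

For this polygon the DOPA class areas exceed the polygon area by 0.09 sqkm, while the KfW class areas exceed it by 11.90 sqkm. The largest per-class gaps are in "Shrubs" and "Open forest, unknown", and they have opposite signs.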

Jo-Schie commented 3 years ago

Hi @Ohm-Np. The comparison looks totally fine to me. I guess the differences are attributable to the polygon simplification that we apply. You can see its effect if you compare the original WDPA data with the simplified version. Nevertheless, I think this is okay as long as we describe it in the final documentation. Please also keep the two tables for documentation purposes. Did you also check how many PAs could be identified from our dataset?

Ohm-Np commented 3 years ago

> Did you also check how many PAs could be identified from our dataset?

I tried with the get_redlist_status and the result is:

Total number of polygons in our geopackage: 7495
Polygons for which data are found in DOPA: 2891

I am now trying the other functions too and will post the results here.

Updates:

| Variables | Polygons available in DOPA |
|---|---|
| Redlist status | 2891 |
| Redlist Species List | 2891 |
| WDPA Level Centroid | 5533 |
| Water Stats | 2823 |
| Land Cover Copernicus | 2718 |
| Land Cover Change ESA | 2718 |
| Multiple Indicators | 2826 |

Regarding ecoregion statistics from DOPA: they do not provide the area of intersection between polygon and ecoregion, but instead generate variables (normalized indicators) on the ecoregion label. So they do not help us compare against the results of our script teow_intersection.

Variables to download:

Ohm-Np commented 3 years ago

Total KfW PA Polygons: 7495
Unique WDPAIDs: 7450

Comparing KfW PA polygons with DOPA Excel sheet

KfW WDPAIDs available in DOPA excel sheet: 2259
KfW WDPAIDs not available in DOPA: 5191
The CSV file containing the list of these 5191 polygons is stored in the datalake as: datalake/mapme.protectedareas/processing/wdpa_kfw/polygons_in_kfw_not_in_dopa_excel.csv

Comparing KfW PA polygons with WDPAIDs from DOPA functions

| DOPA function | KfW WDPAIDs available | KfW WDPAIDs not available | CSV file |
|---|---|---|---|
| wdpa_level_centroid | 5533 | 1917 | polygons_in_kfw_not_in_centroid.csv |
| redlist | 2891 | 4559 | polygons_in_kfw_not_in_redlist.csv |
| water_stats | 2823 | 4627 | polygons_in_kfw_not_in_water.csv |
| land_cover | 2718 | 4732 | polygons_in_kfw_not_in_landcover.csv |
| multiple_indicators | 2826 | 4624 | polygons_in_kfw_not_in_multiple.csv |
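The available / not available counts above amount to a set comparison on unique WDPAIDs, so each pair adds up to the 7450 unique IDs. A minimal Python sketch of that comparison follows, including writing the missing IDs as CSV text. The function names and the toy IDs are hypothetical, and the project's own routines are in R.

```python
import csv
import io


def split_by_availability(kfw_ids, dopa_ids):
    """Split the unique KfW WDPAIDs into those found in DOPA and those missing."""
    kfw_unique = set(kfw_ids)  # duplicates collapse (7495 polygons -> 7450 IDs in the thread)
    available = kfw_unique & set(dopa_ids)
    missing = kfw_unique - available
    return sorted(available), sorted(missing)


def missing_ids_csv(missing):
    """Render the missing WDPAIDs as CSV text with a single WDPAID column."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["WDPAID"])
    writer.writerows([wdpaid] for wdpaid in missing)
    return buf.getvalue()


# Toy IDs, made up for illustration only.
available, missing = split_by_availability([1, 2, 2, 3], [2, 9])
csv_text = missing_ids_csv(missing)
```

Because the comparison runs on unique IDs, `len(available) + len(missing)` always equals the number of unique KfW WDPAIDs, which is a quick sanity check for counts like 2259 + 5191 = 7450.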

Jo-Schie commented 3 years ago

> The CSV file containing the list of these 5191 polygons are stored in datalake as: datalake/mapme.protectedareas/processing/wdpa_kfw/polygons_in_kfw_not_in_dopa_excel.csv

By any chance, is this CSV file still available somewhere? I wanted to send it to the DOPA people now, but I could not download it earlier...

Ohm-Np commented 3 years ago

> By any chance, is this CSV file still available somewhere? I wanted to send it to the DOPA people now, but I could not download it earlier...

Sadly, I don't have this CSV anymore, but once the DOPA variables are processed, I can re-create the files.

Ohm-Np commented 3 years ago

> By any chance, is this CSV file still available somewhere? I wanted to send it to the DOPA people now, but I could not download it earlier...

Hi @Jo-Schie, I re-created the CSV files; they are stored in the datalake at: _datalake/mapme.protectedareas/processing/doparest/

Jo-Schie commented 3 years ago

Thanks om!!!

Jo-Schie commented 3 years ago

Hi @Ohm-Np. Can we close this issue? It seems to me that the routines are created and working, correct?

Ohm-Np commented 3 years ago

Yes, we can close this issue already. Everything is up to date.