Open Jo-Schie opened 2 years ago
This is definitely something we need to document somewhere. I suggest doing it where we talk about the different engines (so this is slightly related to #69). Depending on the engine and the structure of the data (as indicated by your sketches), you can get different results. In my current understanding, `terra::extract` and `exactextractr::exact_extract` can be configured to take into account only the proportion of a raster cell that is covered by a polygon. `terra::zonal` will thus result in relatively crude estimates. I think we should address this properly when implementing the engines more thoroughly, and include a sensitivity analysis of some kind that shows users the differences in the estimates.
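To make the difference between the two approaches concrete, here is a minimal sketch in plain Python. It is not how `terra` or `exactextractr` are implemented; it only assumes a rectangular AOI on a regular grid of square cells, and all names (`cell_fractions`, `whole_cell_sum`, `weighted_sum`) are hypothetical. It compares counting every touched cell in full against weighting each cell by the fraction the AOI covers.

```python
def overlap_1d(a0, a1, b0, b1):
    """Length of the overlap between the intervals [a0, a1] and [b0, b1]."""
    return max(0.0, min(a1, b1) - max(a0, b0))


def cell_fractions(aoi, cell, ncols, nrows):
    """Map (row, col) to the fraction of that cell covered by the AOI.

    aoi is (xmin, ymin, xmax, ymax) in meters; cell is the cell edge length.
    Only cells with a non-zero overlap are returned.
    """
    xmin, ymin, xmax, ymax = aoi
    fractions = {}
    for row in range(nrows):
        for col in range(ncols):
            w = overlap_1d(xmin, xmax, col * cell, (col + 1) * cell)
            h = overlap_1d(ymin, ymax, row * cell, (row + 1) * cell)
            if w > 0 and h > 0:
                fractions[(row, col)] = (w * h) / (cell * cell)
    return fractions


# Hypothetical example: a 1200 m x 800 m AOI on a 500 m grid.
aoi = (250.0, 100.0, 1450.0, 900.0)
cell = 500.0
cell_area = cell * cell

fractions = cell_fractions(aoi, cell, ncols=4, nrows=3)

# Every touched cell counted in full (the crude estimate).
whole_cell_sum = len(fractions) * cell_area

# Each cell weighted by its covered fraction (the coverage-aware estimate).
weighted_sum = sum(f * cell_area for f in fractions.values())

print(f"AOI area:          {1200.0 * 800.0:,.0f} m2")
print(f"whole-cell sum:    {whole_cell_sum:,.0f} m2")
print(f"weighted-cell sum: {weighted_sum:,.0f} m2")
```

In this toy setup the weighted sum recovers the AOI area, while the whole-cell sum counts six full cells and overshoots. A sensitivity analysis in the documentation could report the same two numbers per engine.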
Agreed, @goergen95. In the meantime, I would nevertheless suggest that we just open a new chapter in the documentation and call it "Technical details" or something similar. Later we can rename it to "Technical details and engine choice" or similar. Is that okay with you? I can open a branch and make a suggestion. I am not sure the other engines really differ, because they would need to compute an intersection of some kind, which probably does not happen, but I'd be happily surprised if they do.
I also just noticed that within the WDPA and our portfolio we encounter areas that are smaller than 1 sq km, so this issue matters and users should be aware of it.
An example of a workflow diagram:
That's great. Can we use this figure @Ohm-Np ?
Yes, I have prepared these diagrams for a few other variables too.
Another idea I have, slightly related to this, is to let users decide which projection the package should use for their analysis. Maybe I'll open a dedicated issue for this.
I just noticed that it would be very desirable to document somewhere how the "internals" of the package's area calculations work, maybe also adding a little graph.
Background: @Ohm-Np explained to me that the area calculation in the package is done with `crop` -> `mask` -> `cellsize` -> `zonal`. The accuracy of this approach depends on the size of the AOI and the resolution of the input raster. This is because the raster cells used for `zonal` intersect the AOI but may also cover areas outside the borders of the AOI. Those areas are included, so there will always be an overestimation when areas are calculated (at least if you calculate the sum of areas).
An extreme edge case could be a very small AOI (say 1 hectare) combined with a very low-resolution input raster (say 500 x 500 meters). The input raster would be cropped and masked, and the cell sizes would be calculated. You might then end up with, e.g., 4 cells that intersect the AOI and together cover 1,000 x 1,000 meters, whereas the AOI is only 100 x 100 meters.
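The arithmetic of this edge case can be checked with a short sketch. It is plain Python rather than the package's `terra` calls, and it assumes, as described above, that every cell the AOI touches is counted in full. The AOI position (straddling a grid corner) and the function name `touched_cell_area` are made up for illustration.

```python
import math


def touched_cell_area(xmin, ymin, xmax, ymax, cell):
    """Total area of all square grid cells that a rectangular AOI overlaps.

    Assumes the grid origin is at (0, 0) and every touched cell counts in full.
    """
    cols = math.ceil(xmax / cell) - math.floor(xmin / cell)
    rows = math.ceil(ymax / cell) - math.floor(ymin / cell)
    return cols * rows * cell * cell


# A 100 m x 100 m AOI (1 hectare) centered on a corner of the 500 m grid,
# so that it touches four cells.
aoi = (450.0, 450.0, 550.0, 550.0)
cell = 500.0

aoi_area = (aoi[2] - aoi[0]) * (aoi[3] - aoi[1])
raster_area = touched_cell_area(*aoi, cell)
overestimation = raster_area / aoi_area

print(f"AOI area:    {aoi_area:,.0f} m2")
print(f"raster area: {raster_area:,.0f} m2")
print(f"factor:      {overestimation:.0f}x")
```

Under these assumptions the four touched cells cover 1 sq km, a hundred times the 1 hectare AOI. If the same AOI fell entirely inside a single cell, the factor would still be 25.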
I don't know if this is relevant at the current stage, because area sizes are AFAIK only calculated for forest area, mangrove area and land-cover area, and all of them have fairly high resolutions between 30 and 100 meters. For our use case of fairly large protected areas, the estimates will not deviate much. Nevertheless, it could be good to show this to users so they understand why some of the calculations might give results that are larger than the original AOI (even if the differences are small). Someone might use this package for small AOIs and have trouble understanding the results.
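How quickly the deviation shrinks with AOI size can be sketched in the same toy setting: a square, axis-aligned AOI on a 30 m grid, with every touched cell counted in full. The half-cell offset and the function name `relative_overestimate` are assumptions chosen for illustration; real AOIs are irregular and will behave somewhat differently.

```python
import math


def relative_overestimate(side, cell, offset=0.5):
    """Relative overestimation for a square AOI with the given side length.

    The AOI starts `offset` cells away from a grid line in both directions,
    and every touched cell is counted in full.
    """
    x0 = offset * cell
    x1 = x0 + side
    n = math.ceil(x1 / cell) - math.floor(x0 / cell)  # touched cells per axis
    touched_area = (n * cell) ** 2
    return touched_area / side**2 - 1.0


results = {
    side: relative_overestimate(side, cell=30.0)
    for side in (100.0, 1_000.0, 10_000.0)
}

for side, rel in results.items():
    print(f"AOI side {side:>8,.0f} m: overestimation {rel:.2%}")
```

In this sketch the overestimation drops from roughly 44% for a 100 m AOI to about 4% at 1 km and about 0.4% at 10 km, which matches the intuition that large protected areas are barely affected while small AOIs can be.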
Small illustration:
Not sure where this would fit best in the documentation, or what you think about this issue, @goergen95.