isciences / exactextractr

R package for fast and accurate raster zonal statistics
https://isciences.gitlab.io/exactextractr/

Specifying minimum coverage threshold for polygon? #108

Open philament opened 3 months ago

philament commented 3 months ago

Thanks for this really nice package. Here is a small feature request: I think it would be useful to have a simple way of specifying a minimum coverage threshold that must be met for a function to return a valid value. For example, I would like to calculate the raster means for a set of polygons, but only if at least, say, 80% of the area of a given polygon has non-NA values; otherwise, the summary mean for this polygon should be NA. So maybe an argument such as min_cov_threshold = 0.8 or similar?

Anyway, I guess I could always use a user-defined function for this somehow, but it might be more efficient to have something like this implemented with the predefined summary operations?

Also, if I have overlooked this possibility in the documentation, or you have advice on how to implement this in a simple/efficient manner, I am all ears.

dbaston commented 3 months ago

For now, the cleanest way to do this is probably with a user-defined function.
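As a rough sketch of the logic such a function would need: in exactextractr a summary function receives the cell values for one polygon together with their coverage fractions, and the snippet below works on those two inputs. It is written in plain Python purely for illustration and does not call the package. The function name is hypothetical, `min_cov_threshold` is the name proposed in the request above, and the sketch assumes all cells have the same area and that the polygon lies fully inside the raster.

```python
import math

def mean_with_min_coverage(values, coverage_fractions, min_cov_threshold=0.8):
    """Coverage-weighted mean, or None if too little of the polygon has data.

    values: cell values for one polygon (NaN or None marks missing data).
    coverage_fractions: fraction of each cell covered by the polygon.
    """
    total = sum(coverage_fractions)
    # Keep only the cells that actually carry a value.
    valid = [(v, c) for v, c in zip(values, coverage_fractions)
             if v is not None and not math.isnan(v)]
    covered = sum(c for _, c in valid)
    # Share of the polygon's covered area that has non-missing values.
    if covered == 0 or covered / total < min_cov_threshold:
        return None
    return sum(v * c for v, c in valid) / covered

# About 89% of the covered area has data, so the mean is returned.
ok = mean_with_min_coverage([2.0, 4.0, float("nan")], [1.0, 1.0, 0.25])
# Only 50% of the covered area has data, so the result is None.
too_sparse = mean_with_min_coverage([2.0, float("nan")], [1.0, 1.0])
```

In R the same logic would return NA instead of None.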

With the Python and CLI versions of exactextract, you can provide arguments like default_value to individual operations directly. This would let you compute count both with and without NA values, from which you could work out the NA percentage in a post-processing step and decide whether to retain the result.

This is shown in: https://github.com/isciences/exactextract/blob/afb834bc9f13d0b55952fe91e4cbc7c840c7d10f/python/tests/test_exact_extract.py#L475
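The post-processing step described above might look roughly like the following. This is only a sketch: it does not call exactextract, and the row layout and the key names `mean`, `count_valid` (count ignoring NA cells) and `count_all` (count with NA cells replaced by a default value) are assumptions made for illustration.

```python
def mask_low_coverage(rows, min_cov_threshold=0.8):
    """Set 'mean' to None for polygons whose share of non-NA cells is too low.

    rows: one dict per polygon with hypothetical keys
          'mean', 'count_valid' and 'count_all'.
    """
    result = []
    for row in rows:
        total = row["count_all"]
        # Fraction of the polygon's covered cells that hold real data.
        valid_fraction = row["count_valid"] / total if total else 0.0
        kept = dict(row)  # copy so the input rows are left untouched
        if valid_fraction < min_cov_threshold:
            kept["mean"] = None
        result.append(kept)
    return result

rows = [
    {"mean": 5.0, "count_valid": 9.0, "count_all": 10.0},  # 90% valid, kept
    {"mean": 7.0, "count_valid": 3.0, "count_all": 10.0},  # 30% valid, masked
]
masked = mask_low_coverage(rows)
```

Because both counts are coverage-weighted, their ratio approximates the fraction of the polygon's area with data, under the assumption that cells are of roughly equal area.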

Eventually I intend to unify the R and Python versions, making this syntax available in R as well.