Open zackarno opened 3 days ago
@zackarno I'll take a look into this! On first look though, it's not unexpected to see some NULL values (especially at the adm2 level). This happens in cases where the admin polygon is too small to have any pixel centroids contained within it.
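To illustrate why this happens, here is a minimal sketch of centroid-based zonal stats. It is a toy, not the pipeline's actual code: the function name, the `y = 0` grid origin, and the use of an axis-aligned box as a stand-in for an admin polygon are all assumptions made for the example.

```python
def centroid_zonal_mean(grid, cell_size, bounds):
    """Mean of the cells whose centroid falls inside a rectangular zone.

    grid: list of rows; row 0 starts at y = 0 (simplified origin, an assumption).
    bounds: (xmin, ymin, xmax, ymax), a box standing in for an admin polygon.
    """
    xmin, ymin, xmax, ymax = bounds
    selected = []
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            # Centroid of the cell at row r, column c.
            cx = (c + 0.5) * cell_size
            cy = (r + 0.5) * cell_size
            if xmin <= cx <= xmax and ymin <= cy <= ymax:
                selected.append(value)
    # No centroid inside the zone means there is nothing to average.
    return sum(selected) / len(selected) if selected else None


grid = [[1.0, 2.0], [3.0, 4.0]]  # 2x2 raster with 10-unit cells
large_admin = (0, 0, 20, 20)     # contains all four centroids
small_admin = (6, 6, 9, 9)       # sits inside one cell but misses its centroid (5, 5)

print(centroid_zonal_mean(grid, 10, large_admin))  # 2.5
print(centroid_zonal_mean(grid, 10, small_admin))  # None
```

The small zone clearly overlaps a pixel, but because the pixel's centroid lies outside it, the result is empty, which would surface as NULL in the stats table.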
yea that makes sense. So we'd either need to adjust the raster stats method or develop guidance on how we should deal with this in downstream analysis.
For the Floodscan use case we are publishing datasets at the admin 2 level, so it's not ideal to have to exclude admins from the datasets.
> So we'd either need to adjust the raster stats method or develop guidance on how we should deal with this in downstream analysis.
Yeah @zackarno I think our options would be to:
1. Adjust the raster stats method (e.g. with `exactextract`)
2. Develop guidance for downstream analysis, such as: "Note that some administrative boundaries may not have summary statistics available. This happens when administrative polygons are sufficiently small relative to the size of the input raster dataset. In this case, we'd recommend performing your analysis across a larger spatial scale. For example, if you find values missing for a particular Admin 2 boundary, you may want to instead consider performing your analysis at the Admin 1 level."
I think we should do 1. eventually, but not prioritize at the moment and for now go with 2.
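For reference, option 1 amounts to weighting each pixel by the fraction of its area that the polygon covers, rather than testing centroids. The sketch below shows that idea only. It does not use the `exactextract` API; the function name and the rectangular zone are assumptions made to keep the example self-contained.

```python
def coverage_weighted_mean(grid, cell_size, bounds):
    """Mean weighted by the fraction of each cell covered by a rectangular zone.

    grid: list of rows; row 0 starts at y = 0 (simplified origin, an assumption).
    bounds: (xmin, ymin, xmax, ymax), a box standing in for an admin polygon.
    """
    xmin, ymin, xmax, ymax = bounds
    weighted_sum = 0.0
    total_weight = 0.0
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            x0, y0 = c * cell_size, r * cell_size
            # Overlap of the zone with this cell along each axis (0 if disjoint).
            overlap_x = max(0.0, min(xmax, x0 + cell_size) - max(xmin, x0))
            overlap_y = max(0.0, min(ymax, y0 + cell_size) - max(ymin, y0))
            fraction = (overlap_x * overlap_y) / (cell_size ** 2)
            weighted_sum += fraction * value
            total_weight += fraction
    # Only zones entirely outside the raster end up with no value.
    return weighted_sum / total_weight if total_weight > 0 else None


grid = [[1.0, 2.0], [3.0, 4.0]]  # 2x2 raster with 10-unit cells
small_admin = (6, 6, 9, 9)       # covers 9% of the first cell, no centroid inside

print(coverage_weighted_mean(grid, 10, small_admin))  # 1.0
```

With this kind of weighting a tiny admin still gets a value from whichever pixels it touches, so no units would have to be dropped from the published dataset.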
yeah perhaps `exactextract` can be used in a future iteration/version. I think what you wrote sounds pretty good, but let's leave this open until a decision is made.
There are some additional complexities coming to mind, and one is the fact that we will need to use both `NA` and `Inf` values in the outputs for different reasons. For example, if all values in the historical record are `0` or there is `0` variance, we need to use something like `NA`, but we will also have an RP threshold above which values will be `Inf` ..... still trying to think of the best way to do this all, given that we want users with Excel-only skills to be able to work easily with the data, and this column specifically, in a quantitative way (i.e. we can't mix in strings, etc.)
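One way to picture the two cases is a small helper that always returns a float. This is only a sketch: the function name, the `max_rp` parameter, and the assumption that `rp_value` has already been computed upstream are all hypothetical, and it leaves open how `NaN` and `Inf` should be serialized for spreadsheet users.

```python
import math
import statistics


def encode_return_period(history, rp_value, max_rp):
    """Return a numeric-only RP value.

    history: historical values for the admin unit.
    rp_value: return period computed upstream (hypothetical input).
    max_rp: threshold above which the RP is reported as infinite (hypothetical).
    """
    # An all-zero record is a special case of zero variance: RP is undefined.
    if len(history) < 2 or statistics.pvariance(history) == 0:
        return math.nan
    # Beyond the threshold the RP is reported as unbounded.
    if rp_value > max_rp:
        return math.inf
    return float(rp_value)


print(encode_return_period([0, 0, 0], 5, 100))        # nan
print(encode_return_period([0.0, 0.1, 0.4], 500, 100))  # inf
print(encode_return_period([0.0, 0.1, 0.4], 5, 100))    # 5.0
```

Keeping both markers as floats means the column stays numeric, but the two cases remain distinguishable.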
There appear to be NA/NULL values in the zonal stats.
Here is the SQL query to see them:
They are all from `MOZ` and `NGA`. A lot of occurrences over the dates:
but only 14 pcodes in total with this issue on `mean`. I assume it's the same for other stats except `count` and `sum`.
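For anyone wanting to reproduce this kind of check, here is a rough sketch using an in-memory SQLite table. This is not the query referenced above: the table name `zonal_stats`, its columns, and the toy rows (including the pcodes) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; the real table and column names may differ.
conn.execute(
    "CREATE TABLE zonal_stats (iso3 TEXT, pcode TEXT, valid_date TEXT, mean REAL)"
)
# Made-up toy rows, only to exercise the query.
conn.executemany(
    "INSERT INTO zonal_stats VALUES (?, ?, ?, ?)",
    [
        ("MOZ", "MZ0101", "2024-01-01", None),
        ("MOZ", "MZ0101", "2024-01-02", None),
        ("MOZ", "MZ0102", "2024-01-01", 0.2),
        ("NGA", "NG0201", "2024-01-01", None),
    ],
)

# Count NULL rows and distinct affected pcodes per country.
rows = conn.execute(
    """
    SELECT iso3,
           COUNT(*) AS null_rows,
           COUNT(DISTINCT pcode) AS affected_pcodes
    FROM zonal_stats
    WHERE mean IS NULL
    GROUP BY iso3
    ORDER BY iso3
    """
).fetchall()

print(rows)  # [('MOZ', 2, 1), ('NGA', 1, 1)]
```

Separating `null_rows` from `affected_pcodes` shows the pattern described above: many NULL rows across dates can come from only a handful of admin units.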