isciences / exactextractr

R package for fast and accurate raster zonal statistics
https://isciences.gitlab.io/exactextractr/
281 stars 26 forks source link

NetCDF time stab #44

Closed r-barnes closed 3 years ago

r-barnes commented 3 years ago

I'm loading a NetCDF file containing temperatures for a 12 month period (1 per month) with:

my_brick = brick("my_file.nc", var="temperature")

And have a shapefile my_shape.

I'd like to get the average value of temperature across space and time for each region in the shapefile.

Using

out = exact_extract(my_brick, my_shape, stack_apply=FALSE, fun=function(values, coverage){
   length(values)
})

gives 12 - so presumably one value for each month in the time period... but that seems to imply a spatial reduction.

Using

out = exact_extract(my_brick, my_shape, stack_apply=TRUE, fun=function(values, coverage){
   length(values)
})

gives numbers of values which are not divisible by 12, so it feels like something has gone wrong since if it was just handing me a data frame of all the spatial values for each month then the number of values should be evenly divisible by 12.

Could you clarify how to get this spatial+temporal average?

dbaston commented 3 years ago

With stack_apply = TRUE, you're processing each layer in the brick independently, and values is a vector. So length(values) should correspond to the number of features in my_shape.

With stack_apply = FALSE you're processing all layers together, and values is a data frame with a column for each layer in the brick. R being R, length(values) provides the number of columns in values, which is 12, and nrow(values) should be the number of features in my_shape.

If I understand right, you're looking for a single temperature for each polygon ("annual mean temperature") ? If that's the case, you want stack_apply = FALSE with a summary function like

function(values, coverage) {
  # values is a 12-column data frame, rowMeans takes the mean of these 12 columns
  weighted.mean(rowMeans(values), coverage)
}

This would treat each month as having same weight/number of days. If capturing those differences are important you'd need something more sophisticated.

dbaston commented 3 years ago

Did this work for you, @r-barnes ?

r-barnes commented 3 years ago

Yes, it worked quite well, thank you!

Apologies for the delay, I was writing up a more complete explanation here: https://gis.stackexchange.com/a/382046/3924