Open victofs opened 4 years ago
Are all of the results NaN, or just some of them? If a polygon doesn't intersect any defined cell in your raster, the mean will be 0/0 = NaN.
I can provide some feedback (for this closed and old issue) since I experimented the same today with exactextractr_0.9.0
(same with exactextractr_0.7.2
and exactextractr_0.8.2
) installed from conda (r-exactextract). Trying to extract some values from a GLDAS product, I got the following values:
> v <- read_sf('./some/shape.shp')
> r <- rast('./some/raster.tif')
> exact_extract(x = r,
y = v[8,],
include_cols = "gauge_id",
progress = F)
[[1]]
gauge_id value coverage_fraction
1 1041002 0 0.13134970
2 1041002 NaN 0.45708704
3 1041002 NaN 0.11274476
4 1041002 NaN 0.03988117
5 1041002 NaN 0.13633704
6 1041002 NaN 0.02293241
While in other R installation (outside conda environment with exactextractr_0.9.0
) for the same task I got:
[[1]]
gauge_id value coverage_fraction
1 1041002 0.08750 0.13134970
2 1041002 0.13000 0.45708704
3 1041002 0.03750 0.11274476
4 1041002 0.04250 0.03988117
5 1041002 0.07000 0.13633704
6 1041002 0.00875 0.02293241
Session info where the task fails:
R version 4.1.3 (2022-03-10)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Big Sur/Monterey 10.16
Matrix products: default
BLAS/LAPACK: /Users/aldotapia/miniforge3/envs/hidrocl/lib/libopenblasp-r0.3.20.dylib
locale:
[1] C/UTF-8/C/C/C/C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] terra_1.5-21 sf_1.0-7 exactextractr_0.9.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.9 raster_3.5-21 magrittr_2.0.3 units_0.8-0
[5] lattice_0.20-45 rlang_1.0.4 fansi_1.0.3 tools_4.1.3
[9] grid_4.1.3 KernSmooth_2.23-20 utf8_1.2.2 cli_3.3.0
[13] e1071_1.7-11 DBI_1.1.3 class_7.3-20 tibble_3.1.8
[17] lifecycle_1.0.1 vctrs_0.4.1 codetools_0.2-18 glue_1.6.2
[21] sp_1.5-0 proxy_0.4-27 compiler_4.1.3 pillar_1.8.1
[25] classInt_0.4-7 pkgconfig_2.0.3
Session info where the task runs ok:
R version 4.2.0 (2022-04-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.5
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] exactextractr_0.9.0 terra_1.5-42 sf_1.0-7
loaded via a namespace (and not attached):
[1] Rcpp_1.0.9 rstudioapi_0.13 raster_3.5-21 magrittr_2.0.3 units_0.8-0
[6] tidyselect_1.1.2 lattice_0.20-45 R6_2.5.1 rlang_1.0.4 fansi_1.0.3
[11] dplyr_1.0.9 tools_4.2.0 grid_4.2.0 utf8_1.2.2 KernSmooth_2.23-20
[16] cli_3.3.0 e1071_1.7-11 DBI_1.1.3 ellipsis_0.3.2 class_7.3-20
[21] assertthat_0.2.1 tibble_3.1.7 lifecycle_1.0.1 crayon_1.5.1 purrr_0.3.4
[26] vctrs_0.4.1 codetools_0.2-18 glue_1.6.2 sp_1.5-0 proxy_0.4-27
[31] compiler_4.2.0 pillar_1.7.0 generics_0.1.3 classInt_0.4-7 pkgconfig_2.0.3
I tried to update R to 4.2.0 or 4.2.1 in my conda env without success, so far I'm using a scale factor as workaround and works alright for my needs:
exact_extract(x = r*100,
y = v[8,],
include_cols = "gauge_id",
progress = F)
[[1]]
gauge_id value coverage_fraction
1 1041002 8.750 0.13134970
2 1041002 13.000 0.45708704
3 1041002 3.750 0.11274476
4 1041002 4.250 0.03988117
5 1041002 7.000 0.13633704
6 1041002 0.875 0.02293241
A bit more on context:
The scale factor inside the R function wasn't enough. I'm processing files in Python and exporting them to R for using the exact_extract computing weighted stats and the percent of valid pixels in the analysis.
So I applied a scale factor while doing the process in Python and saved the file in uint8 data type, everything works great so far. Now the output takes all the valid pixels when using floating point rasters some zones took 0 pixels (NaN output) or 30 - 50%, which is wrong knowing the data.
Any hints or ideas of this behavior?
Thanks for documenting this. Is my hunch right that any of the following modifications (applied by itself) would "fix" this?
r <- raster('./some/raster.tif')
x = r*1
y = v[8:9, ]
Yes, you're right. I suppose it's related with raster's statistics, r*1
made the trick.
I tested it with 6 dates and +400 polygons, everything looks fine so far (with r
and with r*1
):
I need to run it on +7000 dates, so if I face another issue related with this function, I'm gonna provide more feedback.
Thank you!
r*1
is just a trick to force terra
to load everything into memory. It appears that either there is a problem with the way I'm requesting cell values from an out-of-memory raster with terra
, or there's a problem within terra
itself. Having a reproducible case would help figure out a solution rather than a workaround.
Sure, you can access to data in this drive folder
I also saved the result and sessionInfo in workspace.RData. Also, this example is the extraction of one of the +400 vectors used. Some of them have the same issue.
The script used is:
setwd('test_path')
v <- 'HidroCL_boundaries.shp'
r <- 'snow_gldas_A20000102.tif'
result <- exactextractr::exact_extract(x = terra::rast(r),
y = sf::read_sf(v)[8,],
include_cols = "gauge_id",
progress = F)
session_info <- sessionInfo()
save.image('~/hidrocl_test/workspace.RData')
I am trying to use exact_extract function where the shapefile is the one used in the Basic Usage section of the readme file
I have the following RasterLayer:
Then i run the exact_extract function
brazil$mean <- (brazil, my_raster, 'mean')
which returns me NaN values, as shown below
Am i doing something wrong importing the raster or this is a known issue in the package?