Closed Tai-Rocha closed 2 years ago
Hi @Tai-Rocha, you must extract statistics by batches. https://r-spatial.github.io/rgee/articles/rgee03.html#reduceregion-vs--reduceregions-vs--for-loop
library(sf)
library(rgee)
library(raster)
library(foreach)
library(rgeeExtra)
ee_Initialize()
# Toy geometry ------------------------------------------------------------
dat <- structure(list(ID = 758432:758443,
lat = c(24.875, 24.875, 24.625, 24.625, 24.875, 24.875, 24.625, 24.625, 24.375, 24.375, 24.125, 24.125),
lon = c(72.875, 72.625, 72.625, 72.875, 72.375, 72.125, 72.125, 72.375, 72.375, 72.125, 72.125, 72.375)),
class = "data.frame", row.names = c(NA, -12L))
dat_rast <- rasterFromXYZ(dat[, c('lon', 'lat', 'ID')], crs = '+proj=longlat +datum=WGS84 +no_defs')
dat_poly <- rasterToPolygons(dat_rast, fun=NULL, na.rm=TRUE, dissolve=FALSE)
startDate = ee$Date('2020-01-01');
endDate = ee$Date('2020-01-10');
# Load the dataset --------------------------------------------------------
ImageCollection = ee$ImageCollection('NASA/NEX-GDDP') %>%
ee$ImageCollection$filter(ee$Filter$date(startDate, endDate))
image <- ImageCollection[[1]]
# Statistics in batches ---------------------------------------------------
statistics <- foreach (index=seq_along(dat_poly), .combine = rbind) %do% {
ee_poly <- dat_poly[index,] %>%
st_as_sfc() %>%
sf_as_ee()
ee_extract(image, ee_poly, ee$Reducer$mean(), sf = TRUE)
}
statistics
@csaybar Thank you so much for the quick reply! I used the batches and unfortunately, after 12 hours running, I got the same error...
Hi @Tai-Rocha here you have a reproducible example, but remember that it is necessary to simplify the geomeries when our dataset has multiple polygons, because they send the request to the "api earth engine" in json format and if the polygons have multiple geometries, the request is blocked, because the size of the json is huge, the best practice is to use st_simplify (sf)
or ms_simplify (rmapshaper).
library(rgee)
library(sf)
library(tidyverse)
library(rmapshaper)
library(pryr)
ee_Initialize()
# 1. Reading vector layer ----------------------------------------
peru <- st_read(
"https://github.com/ambarja/gpkg-pe/raw/main/peru_level3.gpkg"
)
peru_simply <- peru %>%
ms_simplify(keep = 0.001,keep_shapes = TRUE)
object_size(peru)
object_size(peru_simply)
# 2. sf object to featurecollection -------------------------------
peru_ee <- peru_simply %>%
sf_as_ee()
# 3. ImageCollection of interest ----------------------------------
im <- ee$Image('WWF/HydroSHEDS/03VFDEM')$
select('b1')$
rename("elevation")
# 4. Extract information using a zonal statistic -------------------
ee_mean_data <- function(x, y, by = 1000,scale = 1000) {
y_len <- y$size()$getInfo()
for (i in seq(1, y_len, by)) {
index <- i - 1
print(sprintf("Extracting information [%s/%s]...", index, y_len))
ee_value_layer <- ee$FeatureCollection(y) %>%
ee$FeatureCollection$toList(by, index) %>%
ee$FeatureCollection()
if (i == 1) {
dataset <- ee_extract(
x = x,
fun = ee$Reducer$mean(),
y = ee_value_layer,
scale = scale,
sf = T
)
} else {
db_local <- ee_extract(
x = x,
y = ee_value_layer,
fun = ee$Reducer$mean(),
scale = scale,
sf = T
)
dataset <- rbind(dataset, db_local)
}
}
return(dataset)
}
statistics <- ee_mean_data(x = im,y = peru_ee)
> statistics <- ee_mean_data(x = im,y = peru_ee)
[1] "Extracting information [0/1874]..."
[1] "Extracting information [1000/1874]..."
> statistics
Simple feature collection with 1874 features and 5 fields
Geometry type: GEOMETRY
Dimension: XY
Bounding box: xmin: -81.2978 ymin: -18.31278 xmax: -68.65446 ymax: -0.038606
Geodetic CRS: WGS 84
First 10 features:
NOMBDEP NOMBDIST NOMBPROV UBIGEO elevation geometry
1 CAJAMARCA GUZMANGO CONTUMAZA 060504 2789.700467 MULTIPOLYGON (((-78.90743 -...
2 CAJAMARCA EL PRADO SAN MIGUEL 061105 2846.581242 MULTIPOLYGON (((-79.00477 -...
3 CAJAMARCA NIEPOS SAN MIGUEL 061109 2269.360544 MULTIPOLYGON (((-79.20812 -...
4 CAJAMARCA SAN GREGORIO SAN MIGUEL 061110 943.728828 MULTIPOLYGON (((-79.07362 -...
5 CAJAMARCA SAN LUIS SAN PABLO 061203 1329.366966 MULTIPOLYGON (((-78.87211 -...
6 LAMBAYEQUE ETEN CHICLAYO 140103 19.698623 MULTIPOLYGON (((-79.83994 -...
7 LAMBAYEQUE LA VICTORIA CHICLAYO 140106 19.517739 MULTIPOLYGON (((-79.82947 -...
8 LAMBAYEQUE LAGUNAS CHICLAYO 140107 44.651568 MULTIPOLYGON (((-79.64658 -...
9 LAMBAYEQUE MONSEFU CHICLAYO 140108 14.909555 MULTIPOLYGON (((-79.77548 -...
10 LAMBAYEQUE SANTA ROSA CHICLAYO 140114 5.920391 MULTIPOLYGON (((-79.88307 -...
I will close this issue since it is not directly related to rgee. https://groups.google.com/g/google-earth-engine-developers would be a better place to issues related to the client-server model. Thanks @ambarja :rocket:
Thank you @csaybar and @ambarja ! I tried the suggestion of @ambarja but I'm still receiving the error. Now I am almost sure that the easier way is cut the geometry and do the statistics for small polygons (with fewer squares) ...
Hi everyone, I followed this reproducible example to get some statistics (mean, stdDEv, min and max values) for each square of polygon based on NASEDEM elevation data (30-meter resolution).
My issue is: I have millions of squares and I need to get statistics (mean, stdDEv, max , min values, Morans I) for each square. Also, the NASADEM elevation raster is a high resolution. When I try for a polygon with a few squares it works like this example. However, not work for polygons with many squares. I've been trying many ways (reduceRegion, map). But every time I have an error related to memory when I try to export, like
or
Does anyone know how can I get the statistics for large data (polygons with many squares) using GEE API and rgee??
Thanks in advance!