r-spatial / rgee

Google Earth Engine for R
https://r-spatial.github.io/rgee/

EEException: Computation timed out #212

Closed ifeanyi588 closed 2 years ago

ifeanyi588 commented 2 years ago

I wrote the following function to pull data from the Google Earth Engine (GEE) server. I wrote it this way because the GEE server often complains about the size of the shapefile I am using. As you can see, the code chops the shapefile into smaller chunks, runs a query against GEE for each chunk, then recombines the results in R and returns them. I have also included the function call.

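library(rgee)        # ee, ee_extract(); assumes ee_Initialize() has already been called
library(sf)          # st_read()
library(data.table)  # rbindlist(), as.data.table()

gee_pullbigdata <- function(email = "ifeanyi.edochie@gmail.com",
                            shp_dsn,
                            shp_layer,
                            gee_name = 'NOAA/VIIRS/DNB/MONTHLY_V1/VCMCFG',
                            gee_datestart = "2019-01-01",
                            gee_dateend = "2019-12-31",
                            gee_band = "avg_rad",
                            gee_chunksize = 4000,
                            gee_stat = "mean",
                            gee_scale = 1000){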

shp_dt <- st_read(dsn = shp_dsn, layer = shp_layer)

shp_names <- colnames(shp_dt)

gee_map <- ee$ImageCollection(gee_name)$
  filterDate(gee_datestart, gee_dateend)$
  map(function(x) x$select(gee_band))

# cut the dataset into multiple parts of at most gee_chunksize features each

shp_list <- split(shp_dt, (seq_len(nrow(shp_dt)) - 1) %/% gee_chunksize)

# a simple function to extract data from gee_map for one chunk

counter <- 0

# round to k decimal places; defined once here so every gee_stat branch can use it
specify_decimal <- function(x, k) trimws(format(round(x, k), nsmall = k))

#### compute mean
if (gee_stat %in% "mean"){
  extract_chunk <- function(X){

    counter <<- counter + 1
    print(paste0("GEE Collection query ",counter, " of ", length(shp_list), " initiated"))
    y <- ee_extract(x = gee_map,
                    y = X,
                    fun = ee$Reducer$mean(),
                    scale = gee_scale)
    print(paste0("Query complete, GEE job ",
                 specify_decimal((counter * 100)/length(shp_list), 2), "% completed!"))

    return(y)
  }

  dt <- lapply(X = shp_list,
               FUN = extract_chunk)

  dt <- rbindlist(dt)

}

#### compute min
if (gee_stat %in% "min"){
  extract_chunk <- function(X){
    counter <<- counter + 1
    print(paste0("GEE Collection query ",counter, " of ", length(shp_list), " initiated"))
    y <- ee_extract(x = gee_map,
                    y = X,
                    fun = ee$Reducer$min(),
                    scale = gee_scale)
    print(paste0("Query complete, GEE job ",
                 specify_decimal((counter * 100)/length(shp_list), 2), "% completed!"))

    return(y)
  }

  dt <- lapply(X = shp_list,
               FUN = extract_chunk)

  dt <- rbindlist(dt)

}

#### compute max
if (gee_stat %in% "max"){
  extract_chunk <- function(X){
    counter <<- counter + 1
    print(paste0("GEE Collection query ",counter, " of ", length(shp_list), " initiated"))
    y <- ee_extract(x = gee_map,
                    y = X,
                    fun = ee$Reducer$max(),
                    scale = gee_scale)
    print(paste0("Query complete, GEE job ",
                 specify_decimal((counter * 100)/length(shp_list), 2), "% completed!"))

    return(y)
  }

  dt <- lapply(X = shp_list,
               FUN = extract_chunk)

  dt <- rbindlist(dt)

}

#### compute median
if (gee_stat %in% "median"){
  extract_chunk <- function(X){
    counter <<- counter + 1
    print(paste0("GEE Collection query ",counter, " of ", length(shp_list), " initiated"))

    y <- ee_extract(x = gee_map,
                    y = X,
                    fun = ee$Reducer$median(),
                    scale = gee_scale)
    print(paste0("Query complete, GEE job ",
                 specify_decimal((counter * 100)/length(shp_list), 2), "% completed!"))

    return(y)
  }

  dt <- lapply(X = shp_list,
               FUN = extract_chunk)

  dt <- rbindlist(dt)

}

#### compute StdDev
if (gee_stat %in% "stdDev"){
  extract_chunk <- function(X){
    counter <<- counter + 1
    print(paste0("GEE Collection query ",counter, " of ", length(shp_list), " initiated"))

    y <- ee_extract(x = gee_map,
                    y = X,
                    fun = ee$Reducer$stdDev(),
                    scale = gee_scale)
    print(paste0("Query complete, GEE job ",
                 specify_decimal((counter * 100)/length(shp_list), 2), "% completed!"))

    return(y)
  }

  dt <- lapply(X = shp_list,
               FUN = extract_chunk)

  dt <- rbindlist(dt)

}

dt <- as.data.table(dt)

return(dt)

}
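
The five gee_stat branches above differ only in the reducer passed to ee_extract(). A minimal sketch of one way to collapse them, assuming a hypothetical helper pick_reducer() and lookup table reducer_factories (neither is part of rgee). The ee$Reducer calls are only evaluated when a factory is invoked, so an initialized Earth Engine session is assumed at that point, not when the table is defined:

```r
# Hypothetical lookup table: one factory per supported statistic.
# Each factory is only evaluated when called, so `ee` (from rgee, after
# ee_Initialize()) is assumed to exist at call time, not at definition time.
reducer_factories <- list(
  mean   = function() ee$Reducer$mean(),
  min    = function() ee$Reducer$min(),
  max    = function() ee$Reducer$max(),
  median = function() ee$Reducer$median(),
  stdDev = function() ee$Reducer$stdDev()
)

# Hypothetical helper: validate gee_stat and return the matching factory.
pick_reducer <- function(gee_stat, factories = reducer_factories) {
  if (!is.character(gee_stat) || length(gee_stat) != 1 ||
      !gee_stat %in% names(factories)) {
    stop("gee_stat must be one of: ", paste(names(factories), collapse = ", "))
  }
  factories[[gee_stat]]
}
```

Inside a single extract_chunk, fun = pick_reducer(gee_stat)() could then stand in for the five copies, and an unsupported gee_stat would stop with a clear message before any query is sent.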

The function call is as follows:

gee_pullbigdata(shp_dsn = "tests/testdata",
                shp_layer = "gin_poppoly",
                gee_name = "COPERNICUS/S5P/NRTI/L3_NO2",
                gee_datestart = "2018-07-31",
                gee_dateend = "2018-08-31",
                gee_band = "tropospheric_NO2_column_number_density",
                gee_chunksize = 50,
                gee_stat = "mean",
                gee_scale = 1113)

gin_poppoly can be found as a GEE asset with the following address: "users/ifeanyiedochie/gin_poppoly"
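
To illustrate how gee_chunksize controls the number of queries, here is a small base-R sketch of the same integer-division chunking, applied to a plain data frame standing in for the shapefile. The 120-row data frame and the helper name chunk_rows are hypothetical, chosen only so the example runs without Earth Engine:

```r
# Hypothetical helper: split a data frame into consecutive chunks of at
# most `chunk_size` rows, using the same integer-division idea as the
# split() call in gee_pullbigdata().
chunk_rows <- function(df, chunk_size) {
  group <- (seq_len(nrow(df)) - 1) %/% chunk_size
  split(df, group)
}

# Stand-in for the polygon layer: 120 rows, chunked 50 at a time.
polys <- data.frame(id = 1:120)
chunks <- chunk_rows(polys, chunk_size = 50)

length(chunks)        # 3 chunks, i.e. 3 queries would be sent
sapply(chunks, nrow)  # 50, 50 and 20 rows
```

A smaller chunk size means more, lighter queries; the last chunk simply holds whatever rows are left over.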

coverton-usgs commented 2 years ago

I have a computationally similar process (nowhere near as eloquently written) that uses reduceRegions in a for loop over each polygon in a lengthy shapefile and calls getInfo() to extract a table of results, which is saved locally. (I recognize that this is probably the least efficient approach possible.) So far it has completed a little over 1500 iterations of the loop, but it has crashed on 3 individual shapes with the same error. I have yet to confirm why. However, on reviewing the 3 polygons, I see that each is a rather complex shape (rivers or large interconnected water delivery infrastructure) with "holes" where the shape surrounds islands, agricultural fields, etc.

Is it possible to tell whether some of your chunks are being processed but others are not?
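
One way to check this is to wrap each query in tryCatch() and record the error instead of letting it stop the loop. A minimal sketch, assuming a hypothetical helper run_chunks(); its query argument stands in for a call such as ee_extract() and is replaced by a toy function here so the example runs without Earth Engine:

```r
# Hypothetical helper: apply `query` to every chunk of a named list (such
# as the one split() returns), keeping results for the chunks that succeed
# and the error message for those that fail.
run_chunks <- function(chunks, query) {
  results <- list()
  failures <- character(0)
  for (name in names(chunks)) {
    outcome <- tryCatch(
      list(ok = TRUE, value = query(chunks[[name]])),
      error = function(e) list(ok = FALSE, value = conditionMessage(e))
    )
    if (outcome$ok) {
      results[[name]] <- outcome$value
    } else {
      failures[name] <- outcome$value
    }
  }
  list(results = results, failures = failures)
}

# Toy stand-in for a GEE query: fails on one "complex" chunk.
toy_query <- function(x) {
  if (x$complex) stop("Computation timed out.")
  x$value * 2
}

chunks <- list(
  a = list(value = 1, complex = FALSE),
  b = list(value = 2, complex = TRUE),
  c = list(value = 3, complex = FALSE)
)

out <- run_chunks(chunks, toy_query)
names(out$results)   # chunks that completed: "a" and "c"
out$failures         # chunk "b" with its error message
```

The names in out$failures identify which chunks to inspect or retry, for example with a smaller chunk size.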

Also, apparently geometry can get too complex for GEE? I am not sure I understand how, but that is the supposition made here: https://gis.stackexchange.com/questions/309355/computation-timed-out-when-trying-to-get-the-size-of-an-imagecollection-in-goo

csaybar commented 2 years ago

Hi @coverton-usgs, sorry for the late reply.

EEException: Computation timed out errors happen on the backend side. I highly recommend you take a look at the Coding Best Practices and Debugging guides.