rstudio / reticulate

R Interface to Python
https://rstudio.github.io/reticulate
Apache License 2.0

Extracting data from Google Earth Engine #1636

Open TaniaBarychka opened 1 month ago

TaniaBarychka commented 1 month ago

Hello,

I'm using reticulate to extract datasets from the Dynamic World collection in Google Earth Engine.

I've been accessing the GEE datasets successfully for a number of weeks. Last Friday, 12th July 2024, I got the following error at the point of downloading the dataset:

Error in py_call_impl(callable, call_args$unnamed, call_args$named) : 
  requests.exceptions.SSLError: HTTPSConnectionPool(host='earthengine.googleapis.com', port=443): Max retries exceeded with url: /v1/projects/earthengine-legacy/value:compute?prettyPrint=false&alt=json (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:992)'))).

I'm appending the code and the outputs below. Please be as detailed as possible about any Python-related or command-line checks, as I'm not familiar with Python. I mostly use the command line for version control.

Thank you!! Tania


R version 4.4.0 (2024-04-24 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 22631)

> 
> Sys.which("python")   # system default
                              python 
"C:\\PROGRA~1\\PYTHON~1\\python.exe" 
> Sys.which("python3")  # is a V3 installed?
python3 
     "" 
> np <- reticulate::import("numpy", convert = FALSE)
> reticulate::py_available()
[1] TRUE
> reticulate::py_discover_config()
python:         C:/Program Files/Python311/python.exe
libpython:      C:/Program Files/Python311/python311.dll
pythonhome:     C:/Program Files/Python311
version:        3.11.0 (main, Oct 24 2022, 18:26:48) [MSC v.1933 64 bit (AMD64)]
Architecture:   64bit
numpy:          C:/Users/barychkat/AppData/Roaming/Python/Python311/site-packages/numpy
numpy_version:  1.26.4

NOTE: Python version was forced by RETICULATE_PYTHON_FALLBACK
> ee_check() # Check non-R dependencies
◉  Python version
✔ [Ok] C:/Program Files/Python311/python.exe v3.11
◉  Python packages:
✔ [Ok] numpy
✔ [Ok] earthengine-api
> ee_clean_pyenv() # Remove reticulate system variables
> ee_Authenticate()
 ✔ Initializing Google Earth Engine:To authorize access needed by Earth Engine, open the following URL in a web browser and follow the instructions. If the web browser does not start automatically, please manually browse the URL below.

The authorization workflow will generate a code, which you should paste in the box below.
✔ Initializing Google Earth Engine:  DONE!
credentials are cached in the path: C:\Users\barychkat/.config/earthengine/
Successfully saved authorization token.
> ee_Initialize()
── rgee 1.1.7 ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── earthengine-api 0.1.370 ── 
 ✔ user: not_defined 
 ✔ Initializing Google Earth Engine:  DONE!
 ✔ Earth Engine account: users/taniabarychka 
 ✔ Python Path: C:/Program Files/Python311/python.exe 
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── 

> head(points_on_radius)
  radius       lon      lat
1   5000 -80.38550 42.62731
2   5000 -80.36972 42.62578
3   5000 -80.35503 42.62128
4   5000 -80.34241 42.61412
5   5000 -80.33273 42.60479
6   5000 -80.32666 42.59393

> min_date
[1] "2022-04-05"
> max_date
[1] "2023-12-14"

>   geometries_list <- apply(points_on_radius, 1, function(x) ee$Geometry$Point(x[2], x[3])) #longitude, Latitude
> # Define start and end dates
>   startDate <- as.character(min_date)
>   endDate <- as.character(max_date)
> # Define probability bands
>   probabilityBands <- c(
+     'water', 'trees', 'grass', 'flooded_vegetation', 'crops', 'shrub_and_scrub',
+     'built', 'bare', 'snow_and_ice'
+     )
> # Initialize an empty list to store extracted features
>   extracted_features <- list()
> # Loop through each geometry
>   for (geometry in geometries_list) {
+   # Filter the DynamicWorld Image Collection
+     dw <- ee$ImageCollection('GOOGLE/DYNAMICWORLD/V1')$filterDate(startDate, endDate)$filterBounds(geometry)
+   
+   # Map function to compute mean for each image
+     computeMean <- function(image) {
+       probabilityDict <- image$select(probabilityBands)$reduceRegion(
+         reducer = ee$Reducer$mean(),
+         geometry = geometry,
+         scale = 10
+         )
+     
+     
+     # Convert to feature and set date property
+       feature <- ee$Feature(NULL, probabilityDict)$set('date', image$date()$format('yyyy-MM-dd'))
+       return(feature)
+       }
+   
+   # Use base R's lapply() to apply computeMean to each image in the collection
+     dwTimeSeries <- lapply(dw$getInfo()$features, function(x) computeMean(ee$Image(x$id)))
+     
+     # Convert the list of features to an ee.FeatureCollection
+     dwTimeSeries <- ee$FeatureCollection(dwTimeSeries)
+     
+     # Filter out null values
+     dwTimeSeries <- dwTimeSeries$filter(ee$Filter$notNull(probabilityBands))
+     
+     #print(dwTimeSeries$getInfo())
+     
+     # Extract features from the FeatureCollection
+     extracted_features <- rbind(extracted_features, dwTimeSeries$getInfo())
+   
+   #print(extracted_features)
+     }
Error in py_call_impl(callable, call_args$unnamed, call_args$named) : 
  requests.exceptions.SSLError: HTTPSConnectionPool(host='earthengine.googleapis.com', port=443): Max retries exceeded with url: /v1/projects/earthengine-legacy/value:compute?prettyPrint=false&alt=json (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:992)')))
Run `reticulate::py_last_error()` for details.

t-kalinowski commented 1 month ago

If it was working before and suddenly stopped, then most likely something changed with the web API. It might be worthwhile to take a look at https://developers.google.com/earth-engine/changelog and see if anything jumps out.

Also, you might want to ask at https://github.com/r-spatial/rgee, which is where the error appears to be originating from.
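
One way to tell a local TLS problem apart from an API-side change is to attempt a plain TLS handshake with the host named in the error, outside of rgee and reticulate. The following is a minimal sketch using only the Python standard library; the helper name `check_tls` and its return format are made up for illustration and are not part of reticulate, rgee, or the Earth Engine API.

```python
import socket
import ssl


def check_tls(host, port=443, timeout=10.0):
    """Attempt a TLS handshake with host:port and report the outcome.

    Hypothetical helper for illustration only. Returns a dict with
    "ok" (bool) and "detail" (negotiated TLS version, or the error text).
    """
    context = ssl.create_default_context()
    try:
        # Open a TCP connection, then upgrade it to TLS.
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return {"ok": True, "detail": tls.version()}
    except OSError as exc:
        # ssl.SSLError, timeouts, and DNS failures are all OSError subclasses.
        return {"ok": False, "detail": f"{type(exc).__name__}: {exc}"}


# Example call for the host from the error message:
# print(ssl.OPENSSL_VERSION)
# print(check_tls("earthengine.googleapis.com"))
```

If the handshake succeeds here but the rgee call still fails, the cause is more likely to lie in the local environment or in the request itself than in basic connectivity.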

TaniaBarychka commented 1 month ago

Thank you for the quick response! In the end, the errors turned out to be caused by a lack of storage space on my C: drive. I've freed up some space and the code is running again.
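
For anyone who hits the same error, free disk space can be checked quickly from Python before digging into SSL settings. This is a rough sketch using the standard library's `shutil.disk_usage`; the helper name `free_space_report` and the 5 GiB threshold are arbitrary assumptions, and nothing in this thread establishes how much free space is actually needed or why a full drive surfaced as an SSL error.

```python
import shutil


def free_space_report(path, min_free_gib=5.0):
    """Summarise disk usage for the drive that contains `path`.

    Hypothetical helper; `min_free_gib` is an arbitrary illustrative threshold.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    gib = 2 ** 30
    free_gib = usage.free / gib
    return {
        "path": path,
        "total_gib": round(usage.total / gib, 1),
        "free_gib": round(free_gib, 1),
        "low": free_gib < min_free_gib,
    }


# Example call for the C: drive on Windows:
# print(free_space_report("C:\\"))
```

From an R session, the same standard-library function should be reachable through `reticulate::import("shutil")`.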