r-spatial / rgee

Google Earth Engine for R
https://r-spatial.github.io/rgee/

Not-very-big .tif files get split up into several upon download #371

Open devanmcg opened 1 month ago

devanmcg commented 1 month ago
library(rgee)

# Initialize the Earth Engine module.
ee_Initialize()

# Print metadata for a DEM dataset.
print(ee$Image('USGS/SRTMGL1_003')$getInfo())

Yes, I can initialize (although I use ee$Initialize(project='ee-xxx')), and the metadata comes through:

[1] "Image"

$bands
$bands[[1]]
$bands[[1]]$id
[1] "elevation" 

etc etc etc...

Python (reticulate) configuration:

library(reticulate)
py_config()

Results as follows:

libpython:      C:/Users/.../AppData/Local/r-miniconda/envs/rgee/python38.dll
pythonhome:     C:/Users/.../AppData/Local/r-miniconda/envs/rgee
version:        3.8.20 | packaged by conda-forge | (default, Sep 30 2024, 17:44:03) [MSC v.1929 64 bit (AMD64)]
Architecture:   64bit
numpy:          C:/Users/.../AppData/Local/r-miniconda/envs/rgee/Lib/site-packages/numpy
numpy_version:  1.24.4
ee:             C:\Users\...\AppData\Local\R-MINI~1\envs\rgee\lib\site-packages\ee\__init__.p

NOTE: Python version was forced by RETICULATE_PYTHON

Description

Both ee_as_rast() and ee_imagecollection_to_local() return multiple .tif files that seem much smaller than necessary. Sure, ee_as_rast() warns us, NOTE: To avoid memory excess problems, ee_as_rast will not build Raster objects for large images., but the script below returns 6 files ranging from 143 KB to 853 KB, which seems well below the threshold for "large images."

Is there any way to ensure/force files that would only be a few MB total to download as a single .tif?

What I Did

pacman::p_load(tidyverse, sf, rgee, rgeeExtra)

ee$Initialize(project='ee-xxx')
extra_Initialize()

spectra_ee_image <- 
  ee$ImageCollection("LANDSAT/LC08/C02/T1_L2")$
  filterBounds(CGREC_PBG) %>% # bbox: xmin = -99.49029, ymin = 46.71877,  xmax =  -99.42742, ymax = 46.77674 
  ee$ImageCollection$Extra_closest(passes[i],  7, "day") %>%  # Gets a LANDSAT image close to a Sentinel2 pass
  ee$ImageCollection$Extra_preprocess() %>%
  ee$Image$Extra_maskClouds(prob = 75,buffer = 300,cdi = -0.5) %>%
  ee$Image$Extra_spectralIndex(c("NBR", 'CSI', "NDVI", 'NDMI', 'MSAVI', 'SAVI')) %>%
  ee$ImageCollection$toBands() 
ImageDate <- spectra_ee_image$bandNames()$getInfo()[1] %>% str_sub(13, 20)
FileDate <- paste(str_sub(ImageDate, 1, 4), str_sub(ImageDate, 5, 6), str_sub(ImageDate, 7, 8), sep = '_')
names(spectra_ee_image) <- c("NBR", 'CSI', "NDVI", 'NDMI', 'MSAVI', 'SAVI')

ee_as_rast(
  image = spectra_ee_image,
  region = CGREC_NorthPBG, # even smaller bbox; just 260 ha
  dsn = paste0('./landsat/CGREC_North_', FileDate),
  timePrefix = FALSE,
  via = "drive",
  scale = 30
)

Returns:

- region parameters
 sfg      : POLYGON ((-99.46931 46.7657 .... .77187, -99.47904 46.77252)) 
 CRS      : GEOGCRS["WGS 84",
    DATUM["World Geodetic System 1984",
        ELLIPSOID["WGS 84",6378137,298.257223563, ..... 
 geodesic : TRUE 
 evenOdd  : TRUE 

- download parameters (Google Drive)
 Image ID    : CGREC_North_2017_03_22 
 Google user : ndef 
 Folder name : rgee_backup 
 Date        : 2024_10_17_14_43_11 
Polling for task <id: WPHBUP5AHDS5TKXGHXZAVKZC, time: 0s>.
Polling for task <id: WPHBUP5AHDS5TKXGHXZAVKZC, time: 5s>.
Polling for task <id: WPHBUP5AHDS5TKXGHXZAVKZC, time: 10s>.
State: COMPLETED
Moving image from Google Drive to Local ... Please wait  
NOTE: To avoid memory excess problems, ee_as_rast will not build Raster objects for large images.
[1] "./landsat/CGREC_North_2017_03_22-0001.tif" "./landsat/CGREC_North_2017_03_22-0002.tif"
[3] "./landsat/CGREC_North_2017_03_22-0003.tif" "./landsat/CGREC_North_2017_03_22-0004.tif"
[5] "./landsat/CGREC_North_2017_03_22-0005.tif" "./landsat/CGREC_North_2017_03_22-0006.tif" 

The result is similar when exporting the image collection (spectra_ee_ic, built the same way as above but without ee$ImageCollection$toBands()):

    ee_imagecollection_to_local(spectra_ee_ic, 
                                region = CGREC_SouthPBG, 
                                dsn = paste0('./landsat/CGREC_South_', ImageDate),
                                timePrefix = FALSE,
                                via = "drive", 
                                scale = 30)
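
Until there is a way to get a single file straight from the export, the pieces can probably be inspected and combined locally. The following is a minimal sketch, assuming the terra package is installed. combine_pieces() is a hypothetical helper, not part of rgee. It does not assume the six files are spatial tiles: nothing in the output above says whether they are tiles or one file per band, so the helper reports each piece's layer count and dimensions, stacks the pieces as layers if they share one grid, and mosaics them otherwise.

```r
library(terra)

# Hypothetical helper (not part of rgee): combine the .tif pieces returned
# by ee_as_rast() into a single GeoTIFF.
combine_pieces <- function(tif_files, out_file) {
  pieces <- lapply(tif_files, rast)

  # Report the layer count and dimensions of each piece.
  for (i in seq_along(pieces)) {
    p <- pieces[[i]]
    message(basename(tif_files[i]), ": ", nlyr(p), " layer(s), ",
            nrow(p), " x ", ncol(p), " cells")
  }

  # TRUE when every piece has the same extent, CRS, and row/column count.
  same_grid <- all(vapply(
    pieces,
    function(p) compareGeom(pieces[[1]], p, stopOnError = FALSE),
    logical(1)
  ))

  combined <- if (length(pieces) == 1) {
    pieces[[1]]
  } else if (same_grid) {
    rast(tif_files)       # identical grids: stack the files as layers
  } else {
    merge(sprc(pieces))   # different extents: mosaic as spatial tiles
  }

  writeRaster(combined, out_file, overwrite = TRUE)
  invisible(out_file)
}

# Usage with the file names returned above (output name is an assumption):
# tif_files <- list.files("./landsat",
#                         pattern = "^CGREC_North_2017_03_22-\\d{4}\\.tif$",
#                         full.names = TRUE)
# combine_pieces(tif_files, "./landsat/CGREC_North_2017_03_22.tif")
```

This only works around the symptom after the download; it does not explain why the export is split in the first place.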