ShiruiH / EOWater

A fast way to retrieve water surface area time-series from Sentinel-2 and Landsat in GEE
GNU General Public License v3.0

Prepare scripts for uploading tiles to GEE Assets #7

Open ShiruiH opened 3 months ago

ShiruiH commented 3 months ago

Upload S2 and Landsat tiles to GEE Assets

kvos commented 3 months ago

Good idea! I skipped reading from cloud bucket data in my PR, and the notebook reads local CSV files, but here is how we connect and read the files (it's based on a snippet that you sent me a while back).

First, connect to the GCP project. This requires gcloud to be installed, and users need to run gcloud auth application-default login to create Application Default Credentials (ADC).

# library to load cloud bucket
import gcsfs

# create token (based on https://stackoverflow.com/questions/53472429/how-to-get-a-gcp-bearer-token-programmatically-with-python)
import google.auth
import google.auth.transport.requests
creds, project = google.auth.default()
# creds.valid is False, and creds.token is None
# Need to refresh credentials to populate those
auth_req = google.auth.transport.requests.Request()
creds.refresh(auth_req)

Once that is done, read the CSV files from the cloud bucket into dataframes:

# pandas is needed to parse the CSV files
import pandas as pd

# mount cloud bucket
fs = gcsfs.GCSFileSystem(project='nsw-dpe-gee-tst', token=creds.token)
location = 'ofs-live-test/test2'
bucket_name = '%s/historical' % location
# keep only the CSV files in the bucket folder
list_files = [f for f in fs.ls(bucket_name) if f.endswith('.csv')]
# read each CSV into a dataframe, keyed by its path in the bucket
dfs = {}
for file in list_files:
    with fs.open(file, 'r') as f:
        dfs[file] = pd.read_csv(f, parse_dates=['system_time_utc'])

ShiruiH commented 3 months ago

@kvos Thanks for the snippet. I will integrate it into the scripts.

For uploading from a local machine (or maybe a VM) to GCS buckets, we can use gsutil.
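As a rough illustration of the gsutil route, here is a minimal Python sketch that builds and optionally runs a recursive, parallel copy. The function names, the local folder and the bucket path are hypothetical placeholders, not taken from the repository; only the gsutil -m cp -r invocation itself is standard.

```python
import shlex
import subprocess
from pathlib import Path


def build_gsutil_upload_cmd(local_dir, bucket_path):
    """Return the gsutil argument list that copies local_dir into bucket_path.

    bucket_path is given without the gs:// scheme, e.g. 'my-bucket/masks'
    (a hypothetical name used only for illustration).
    """
    # -m runs the transfer in parallel; cp -r copies the folder recursively.
    destination = "gs://" + bucket_path.strip("/")
    return ["gsutil", "-m", "cp", "-r", str(Path(local_dir)), destination]


def upload_tiles(local_dir, bucket_path, dry_run=True):
    """Print the command (dry run) or execute it; gsutil must be installed."""
    cmd = build_gsutil_upload_cmd(local_dir, bucket_path)
    if dry_run:
        print(shlex.join(cmd))
    else:
        # check=True raises CalledProcessError if gsutil exits with an error
        subprocess.run(cmd, check=True)
    return cmd
```

Calling upload_tiles('tiles', 'my-bucket/masks') only prints the command; pass dry_run=False to actually run it once gcloud authentication is in place.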

ShiruiH commented 2 months ago

Scripts added:

  1. Download_original_tiles_Landsat.js: download Landsat tiles
  2. Download_original_tiles_S2.js: download Sentinel-2 tiles
  3. 02_Upload_polygon_mask_to_bucket.ipynb: upload processed tiles from 01_Create_polygon_mask.ipynb to GCS bucket. This script also contains an optional cell to delete all images in the bucket folder for re-upload
  4. 03_Upload_bucket_to_EE_asset.ipynb: upload polygon masks from bucket to GEE Assets with their specified properties
  5. (optional) 04_Reset_EE_asset_collection.ipynb: delete the asset for re-upload
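The notebooks themselves are not shown in this thread, so the following is only a sketch of one possible way to attach properties when ingesting a bucket image into GEE Assets: building an image manifest that can be passed to earthengine upload image --manifest. The asset ID, the bucket URI and the property names are hypothetical, and notebook 03 may well do this differently.

```python
import json


def build_image_manifest(asset_id, gcs_uri, properties):
    """Build an Earth Engine image manifest for a single GeoTIFF in a bucket.

    asset_id, gcs_uri and properties are illustrative inputs (assumptions),
    not values used by the repository.
    """
    if not gcs_uri.startswith("gs://"):
        raise ValueError("gcs_uri must start with gs://")
    return {
        "name": asset_id,
        # one tileset with one source file
        "tilesets": [{"sources": [{"uris": [gcs_uri]}]}],
        # custom properties stored on the resulting asset
        "properties": dict(properties),
    }


def write_manifest(manifest, path):
    """Write the manifest to a JSON file for the earthengine CLI."""
    with open(path, "w") as f:
        json.dump(manifest, f, indent=2)
```

A manifest written this way could then be submitted with earthengine upload image --manifest followed by the path to the JSON file, one file per tile.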

Hi @kvos

I've tested the above-mentioned scripts on my local machine. Please go ahead and test them on your side and let me know if any amendments are needed. Cheers!

kvos commented 2 months ago

@ShiruiH, I tested the Landsat and Sentinel-2 images generated from GEE with notebook 01 and it's all working. I compared the rasters against the vector geometries and the masks from USGS EarthExplorer and they are the same.

ShiruiH commented 2 months ago

Thanks @kvos! I will close this issue then.