ShiruiH / EOWater

A fast way to retrieve water surface area time-series from Sentinel-2 and Landsat in GEE
GNU General Public License v3.0

EOWater: efficient cloud computing of water surface area time-series from Sentinel-2 and Landsat in GEE

This repository presents an efficient GEE-based solution for mapping water surface area time-series in waterbodies from Landsat and Sentinel-2 imagery.

While many solutions exist to map waterbodies, this toolkit presents the following advantages:


The development of this tool was supported through funding from the Australian Government (Murray-Darling Basin Authority).

The tool was developed jointly by the NSW Department of Climate Change, Energy, the Environment and Water and NGIS.

The toolkit is also catalogued in the SEED Portal as the remote-sensing-earth-observation-water-toolkit.

The tool is currently maintained by @ShiruiH and @kvos.


Installation

To use this tool you will need access to a Google Earth Engine (GEE) project. You can create one at https://signup.earthengine.google.com/. Then go to https://cloud.google.com/sdk/docs/install and install the gcloud CLI. Once installed, it launches automatically and lets you authenticate with your GEE account (or personal Gmail account).

To install the Python environment, install Anaconda (https://www.anaconda.com/download/). Then open the Anaconda Prompt and type in the following commands:

conda create -n eowater
conda activate eowater
conda install -c conda-forge geopandas -y
conda install -c conda-forge earthengine-api scikit-image rasterio matplotlib notebook folium -y

Then type jupyter lab and navigate to the notebooks in this repository.

Usage

1. Download Sentinel-2 and Landsat original tiles

Two downloading options are available:

2. Create polygon masks

01_Create_polygon_mask.ipynb: notebook to generate the polygon masks for Landsat and Sentinel-2 tiles using a waterbodies boundaries vector layer. As an example, a set of input polygon boundaries for the NSW Northern Basin is provided in the repo in waterbodies_boundaries.geojson, courtesy of the Natural Resources Access Regulator (NRAR).

The inputs for this script are the Sentinel-2 and Landsat tiles downloaded in Step 1. For each tile, the script creates a .tif mask in which every polygon is assigned a different integer value, so that individual waterbodies can be distinguished at the raster level.
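To illustrate the idea of a labelled mask, here is a minimal, standard-library-only sketch. It is not the notebook's code: the function names, the tiny pixel grid and the polygon ids (`dam_A`, `dam_B`) are hypothetical, and a real workflow would rasterize onto the georeferenced grid of each tile rather than use a pure-Python loop.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, ring: Sequence[Point]) -> bool:
    """Ray-casting test: True if (x, y) lies inside the polygon ring."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def label_mask(
    polygons: Dict[str, Sequence[Point]], width: int, height: int
) -> Tuple[List[List[int]], Dict[str, int]]:
    """Return (mask, labels).

    mask[row][col] holds the integer label of the polygon covering the
    pixel centre, or 0 for background. labels maps polygon id -> integer.
    """
    labels = {poly_id: value for value, poly_id in enumerate(polygons, start=1)}
    mask = [[0] * width for _ in range(height)]
    for poly_id, ring in polygons.items():
        for row in range(height):
            for col in range(width):
                if point_in_polygon(col + 0.5, row + 0.5, ring):
                    mask[row][col] = labels[poly_id]
    return mask, labels


# Hypothetical polygons expressed directly in pixel coordinates.
polygons = {
    "dam_A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "dam_B": [(3, 1), (5, 1), (5, 4), (3, 4)],
}
mask, labels = label_mask(polygons, width=6, height=4)
for row in mask:
    print(row)
print(labels)
```

The `labels` dictionary plays the same role as the label table saved alongside the masks: it relates each polygon id to the integer stored in the raster.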


3. Upload masks to GEE Assets

Once the polygon masks have been generated in Python, they need to be uploaded as cloud assets into GEE. You can follow the instructions below to perform this step. Two options are available, suited to different usage scenarios.

If only a few images need to be uploaded (e.g., fewer than 3), the Option #1 manual process is recommended. This method avoids the need to set up Cloud Storage access authentication.

For uploading a large number of images, the Option #2 automated process is more efficient.

Option #1 manual process

1. Go to https://code.earthengine.google.com/, sign in and select your cloud project (in this example `nsw-dpe-gee-tst`).

2. Click on NEW > GeoTIFF Image Upload. Select your file in /outputs (e.g., `outputs/Sentinel2_tiles_mask/T55JGH_20231213T001111_B02.tif`).

3. Once uploaded, click on the asset to open its details.

4. Click on Edit, then on the PROPERTIES tab and Add property. Add a property called Tile with value 55JGH (or the name of your tile). This property is needed later on.

5. Repeat for the Landsat tiles, but add two properties, PATH and ROW, with their respective values (for example, tile 090081 corresponds to path 090 and row 081).

6. Once all the individual tiles have been uploaded, click on NEW > Image Collection and create an image collection for Sentinel-2 (name it `Base_Sentinel2_tiles`) and one for Landsat (name it `Base_Landsat_tiles`).

7. Then drag and drop all the individual tiles into their respective image collection (Sentinel-2 or Landsat). In the original example, the image collection holds 17 tiles.

8. Finally, upload the image labels which were saved in [/outputs](/outputs). Click on NEW > CSV file and select the file `outputs/labels.csv` (or the Landsat one; they are the same). Call the asset `Base_labels`.

You should get a table that relates each unique polygon id to an integer value.
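When many tiles are involved, the property values from steps 4 and 5 can be derived from the tile names instead of being typed by hand. The sketch below is only an illustration: both helper functions are hypothetical, the Sentinel-2 pattern is inferred from the single example file name above, and storing PATH and ROW as integers is an assumption (use strings if the GEE scripts expect them).

```python
import re
from pathlib import PurePosixPath


def sentinel2_properties(filename: str) -> dict:
    """Derive the Tile property from a name like T55JGH_20231213T001111_B02.tif."""
    stem = PurePosixPath(filename).stem
    # Assumed pattern: 'T', two digits, three capital letters, then an underscore.
    match = re.match(r"T(\d{2}[A-Z]{3})_", stem)
    if match is None:
        raise ValueError(f"no Sentinel-2 tile name found in {filename!r}")
    return {"Tile": match.group(1)}


def landsat_properties(path_row: str) -> dict:
    """Split a six-digit path/row string such as 090081 into PATH and ROW."""
    if not re.fullmatch(r"\d{6}", path_row):
        raise ValueError(f"expected six digits, got {path_row!r}")
    # Assumption: the properties are stored as integers (90 and 81).
    return {"PATH": int(path_row[:3]), "ROW": int(path_row[3:])}


print(sentinel2_properties("outputs/Sentinel2_tiles_mask/T55JGH_20231213T001111_B02.tif"))
print(landsat_properties("090081"))
```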

Option #2 automated process

1. Upload polygon masks to Google Cloud Storage (GCS) Buckets.

    (1) Install the [`gcloud`](https://cloud.google.com/sdk/docs/install) CLI.

    (2) The easiest way is to use [02_Upload_polygon_mask_to_bucket.ipynb](02_Upload_polygon_mask_to_bucket.ipynb) to upload the polygon masks to a Google Cloud Bucket. __OR__, if you are familiar with it, use [`gcloud storage`](https://cloud.google.com/storage/docs/discover-object-storage-gcloud) directly:

    ```sh
    # Log in to gcloud; make sure you have the necessary permissions to access the GCS Buckets
    gcloud auth login
    # Copy recursively (-r) without overwriting existing objects (-n)
    gcloud storage cp -r -n [LOCAL_PATH] gs://[BUCKET_NAME]/[DESTINATION_PATH]
    ```

2. Ingest polygon masks from Buckets into GEE Assets using [Image Manifest Upload](https://developers.google.com/earth-engine/guides/image_manifest).

    (1) [Install the Earth Engine Python client](https://developers.google.com/earth-engine/guides/python_install).

    (2) [03_Upload_bucket_to_EE_asset.ipynb](./03_Upload_bucket_to_EE_asset.ipynb): create an ImageCollection, `Base_Sentinel2_tiles` and/or `Base_Landsat_tiles`. Ingest the polygon masks into the ImageCollection with the specified properties for each polygon mask.

3. Upload the image labels which were saved in [/outputs](/outputs). Click on NEW > CSV file and select the file `outputs/labels.csv` (or the Landsat one; they are the same). Call the asset `Base_labels`.

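For readers unfamiliar with image manifests, the sketch below builds the kind of JSON document that the Image Manifest Upload guide describes: an asset name, the Cloud Storage source and the asset properties. It is an illustration rather than the notebook's code; `build_manifest`, the project id `my-project`, the bucket `my-bucket` and the asset id are placeholders.

```python
import json


def build_manifest(asset_id: str, gcs_uri: str, properties: dict) -> dict:
    """Assemble a minimal single-file Earth Engine image manifest."""
    return {
        "name": asset_id,
        "tilesets": [{"sources": [{"uris": [gcs_uri]}]}],
        "properties": properties,
    }


# Placeholder project, bucket and asset id; replace with your own values.
manifest = build_manifest(
    asset_id="projects/my-project/assets/Base_Sentinel2_tiles/T55JGH",
    gcs_uri="gs://my-bucket/Sentinel2_tiles_mask/T55JGH_20231213T001111_B02.tif",
    properties={"Tile": "55JGH"},
)
print(json.dumps(manifest, indent=2))
```

A manifest saved to a file can be submitted with the Earth Engine command line tool (`earthengine upload image --manifest manifest.json`). How the notebook itself submits the ingestion is not shown here.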


(Optional) If polygon masks in GEE Assets need to be removed and re-uploaded, use 04_Reset_EE_asset_collection.ipynb to batch remove all tiles.

:warning: Check that you have these 3 assets uploaded: `Base_Sentinel2_tiles`, `Base_Landsat_tiles` and `Base_labels`.

Now you are all set up to map water surface area time-series in GEE!

4. Run GEE scripts in Code Editor

The scripts are found in GEE_scripts and can be copied into the Code Editor and run there. They will output a set of CSV files with the time-series of water surface area for each polygon. The following scripts are available:

  1. WSA_monitoring_S2.js: map water surface area on Sentinel-2 images.
  2. WSA_monitoring_L9.js: map water surface area on Landsat 9 images.
  3. WSA_monitoring_L8.js: map water surface area on Landsat 8 images.
  4. WSA_monitoring_L7.js: map water surface area on Landsat 7 images.
  5. WSA_monitoring_L5.js: map water surface area on Landsat 5 images.
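Conceptually, once every pixel carries a polygon label, a per-polygon water surface area can be obtained by counting the water pixels under each label and multiplying by the pixel area. The sketch below illustrates that bookkeeping only. It is not taken from the GEE scripts: the function name is hypothetical, the boolean water mask is assumed to come from some classification step not described here, and a 10 m pixel size is assumed for the example.

```python
from typing import Dict, List


def area_per_label(
    label_mask: List[List[int]],
    water_mask: List[List[bool]],
    pixel_size_m: float,
) -> Dict[int, float]:
    """Return the water area in square metres for each non-zero label."""
    counts: Dict[int, int] = {}
    for label_row, water_row in zip(label_mask, water_mask):
        for label, is_water in zip(label_row, water_row):
            if label == 0:
                continue  # background pixel, not part of any polygon
            counts.setdefault(label, 0)
            if is_water:
                counts[label] += 1
    pixel_area = pixel_size_m ** 2
    return {label: n * pixel_area for label, n in counts.items()}


# Tiny made-up example: two polygons (labels 1 and 2) on a 2 x 3 grid.
example_labels = [[1, 1, 0],
                  [2, 2, 2]]
example_water = [[True, False, True],
                 [True, True, False]]
areas = area_per_label(example_labels, example_water, pixel_size_m=10.0)
print(areas)
```

Repeating this for every image date gives one area value per polygon per date, which is the shape of the time-series written to the CSV files.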

The tileList in the scripts needs to include only the available tiles in Base_Sentinel2_tiles or Base_Landsat_tiles.
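A quick consistency check before running a script can catch tiles that are listed but were never uploaded. The helper below is a hypothetical sketch. The list of available tiles is hard-coded here with made-up names (apart from 55JGH), but it could be read from the collection with the Earth Engine Python client as noted in the comment.

```python
from typing import Iterable, List, Tuple


def check_tile_list(
    tile_list: Iterable[str], available_tiles: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Split tile_list into (usable, missing) relative to the uploaded tiles."""
    requested = list(tile_list)
    available = set(available_tiles)
    usable = [tile for tile in requested if tile in available]
    missing = sorted(tile for tile in requested if tile not in available)
    return usable, missing


# With the Earth Engine client authenticated, the available tiles could be
# fetched along these lines (the asset path is a placeholder):
#   ee.ImageCollection("projects/my-project/assets/Base_Sentinel2_tiles")
#     .aggregate_array("Tile").getInfo()
available = ["55JGH", "55JFH"]          # made-up example values
requested = ["55JGH", "56JKL", "55JFH"]  # made-up example values

usable, missing = check_tile_list(requested, available)
print("usable:", usable)
print("missing from the collection:", missing)
```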

(Optional) Additionally, there is a script, WSA_scheduled_cloud_function.js, that can be set up as a Cloud Function to process Sentinel-2, Landsat 9 and Landsat 8 imagery as a cron job.

5. Postprocess water surface areas

05_Postprocess_timeseries.ipynb: notebook to postprocess the time-series of water surface area generated in GEE and includes the following steps:

This can be a useful tool to monitor water resources in a catchment.

Contributing and Issues

Having a problem? Post an issue in the Issues page (please do not email).

If you are willing to contribute, check out our todo list in the Projects page.

  1. Fork the repository. A fork is a copy on which you can make your changes.
  2. Create a new branch on your fork
  3. Commit your changes and push them to your branch
  4. When the branch is ready to be merged, create a Pull Request (how to make a clean pull request explained here)

References and Datasets

This section provides a list of references on this topic.

  1. Public talk presenting the water monitoring tool by Mustak Shaikh and Kilian Vos: VIMEO recording (from minute 19:10)
  2. Methodology intro by Shirui Hao: EOWater methodology on YouTube