alan-turing-institute / environmental-ds-book

A computational notebook community for open environmental data science 🌎
https://edsbook.org
Creative Commons Attribution 4.0 International

[NBI] Automatic sea ice segmentation in synthetic aperture radar images #258

Open louisavz opened 1 month ago

louisavz commented 1 month ago

What is the notebook about?

This notebook introduces the use of unsupervised learning on synthetic aperture radar (SAR) data for sea ice classification. In particular, we are interested in ice floes, which are contiguous pieces of sea ice. The size of the floes within a region can provide important climatic information, and we aim to quantify this characteristic. The target geographical area is the Weddell Sea of West Antarctica. The example here will focus only on the Weddell Sea, although the approach should scale to other regions of interest given an appropriate training dataset.
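As a rough illustration of what quantifying floe size could involve, the sketch below measures the pixel area of each contiguous ice region in a binary ice/water mask. It is not the notebook's method: the function name `floe_areas`, the toy mask, and the choice of 4-connectivity are all assumptions made for this example, and the mask is assumed to come from whatever segmentation step precedes it.

```python
from collections import deque

import numpy as np


def floe_areas(ice_mask: np.ndarray) -> list[int]:
    """Return the pixel area of each 4-connected ice region, largest first."""
    mask = np.asarray(ice_mask, dtype=bool)
    seen = np.zeros_like(mask, dtype=bool)
    rows, cols = mask.shape
    areas = []
    for r0, c0 in zip(*np.nonzero(mask)):
        if seen[r0, c0]:
            continue
        # Breadth-first flood fill starting from an unvisited ice pixel.
        seen[r0, c0] = True
        queue = deque([(r0, c0)])
        area = 0
        while queue:
            r, c = queue.popleft()
            area += 1
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and mask[nr, nc] and not seen[nr, nc]:
                    seen[nr, nc] = True
                    queue.append((nr, nc))
        areas.append(area)
    return sorted(areas, reverse=True)


# Toy mask (1 = ice, 0 = open water); values are invented for illustration.
toy_mask = np.array([
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 1, 0, 0, 0],
])
areas = floe_areas(toy_mask)
```

Pixel counts would still need converting to physical area using the image's ground resolution. In practice, `scipy.ndimage.label` provides the same connected-component labelling far more efficiently; the explicit loop here only keeps the sketch dependency-light.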

Packages used in this notebook will include core Python functionality, as well as

This notebook will examine:

This notebook will not examine:

Data Science Component

Submission type

Programming language

Checklist:

Additional information

Paper in preparation regarding existing publication(s).

acocac commented 1 month ago

@louisavz thanks for the submission 🙌

The notebook idea and outline sound great!

Can you provide further details of the target geographical areas? You mention you won't use the preprocessing in the SNAP tool. I wondered if you've considered open-source alternatives, if they exist. Also, I suggest avoiding expensive training procedures in your notebook. EDS Book uses open infrastructure, including Binder, so most existing notebooks load pre-trained models for inference.
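To make the "load a pre-trained model for inference" pattern concrete, here is a minimal sketch in which the "model" is just a saved array of cluster centres that is reloaded and applied to a new tile, with no training step. Everything here is hypothetical: the file name `kmeans_centres.npy`, the function `predict_labels`, and the backscatter values are placeholders, and the thread does not say which unsupervised model the notebook will use.

```python
import tempfile
from pathlib import Path

import numpy as np


def predict_labels(image: np.ndarray, centres: np.ndarray) -> np.ndarray:
    """Assign each pixel to the nearest stored cluster centre (no training)."""
    # Distance from every pixel to every centre: shape (rows, cols, n_clusters).
    distances = np.abs(image[..., np.newaxis] - centres)
    return distances.argmin(axis=-1)


# Stand-in for a model file shipped with a notebook (hypothetical name and values).
model_path = Path(tempfile.mkdtemp()) / "kmeans_centres.npy"
np.save(model_path, np.array([-22.0, -10.0]))  # invented dB values: cluster 0, cluster 1

# Inference only: load the stored parameters and label a small example tile.
centres = np.load(model_path)
tile = np.array([[-21.0, -9.5], [-12.0, -25.0]])
labels = predict_labels(tile, centres)
```

Because only a small parameter file is loaded and applied, this kind of cell stays cheap enough for shared infrastructure such as Binder.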

louisavz commented 1 month ago

@acocac Thank you for the feedback and comments.

The target geographical area is the Weddell Sea of West Antarctica. The example here will focus only on the Weddell Sea, although the approach should scale to other regions of interest given an appropriate training dataset.

SNAP is open-source and I'm using a script from BAS to do the preprocessing. That step is run on the BAS HPC, so I thought I would skip it and provide example data that has already been corrected and calibrated. Perhaps I could provide a reference link to the toolkit and the script (if it is open-sourced)? I can include the steps used for preprocessing, and one can use the SNAP UI tool if they have a Windows machine.

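You are absolutely right. I will provide the pre-trained models for inference rather than training the model here. Should I revise the above from

  • Applying models to training data

to

  • Walk through of the model
  • Apply example test set to a pretrained model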
I have also changed "models" to "model", as I will focus on only one unsupervised model to keep this short and concise. I hope this helps; please let me know if I can provide more details!

acocac commented 1 month ago

@acocac Thank you for the feedback and comments.

You're welcome. The purpose of the notebook idea stage is to provide feedback to consider in your first working version of the notebook.

The target geographical area is the Weddell Sea of West Antarctica. The example here will focus only on the Weddell Sea, although the approach should scale to other regions of interest given an appropriate training dataset.

This is great. Please remember to mention this in the context section.

SNAP is open-source and I'm using a script from BAS to do the preprocessing. That step is run on the BAS HPC, so I thought I would skip it and provide example data that has already been corrected and calibrated. Perhaps I could provide a reference link to the toolkit and the script (if it is open-sourced)? I can include the steps used for preprocessing, and one can use the SNAP UI tool if they have a Windows machine.

Sharing the key steps would be beneficial for the reader/user of your notebook. It's great that SNAP is open-source. I'm also aware of other emerging programmatic, scalable alternatives such as eo_tools and xarray-sentinel that use modern open-source Python tools like GDAL/Rasterio, Xarray, Dask, and GeoPandas. However, it seems SNAP is the tool most used by EO researchers.

You are absolutely right. I will provide the pre-trained models for inference rather than training the model here. Should I revise the above from

  • Applying models to training data

to

  • Walk through of the model
  • Apply example test set to a pretrained model

Thanks for adding the step. Pre-trained models usually work well. You can indeed register them in scivision if relevant.

I have also changed "models" to "model", as I will focus on only one unsupervised model to keep this short and concise. I hope this helps; please let me know if I can provide more details!

This looks good to me. Looking forward to hearing about the outcome of your publication.

louisavz commented 1 month ago

@acocac Thank you for the feedback :). I have updated the description with your suggestions.

We recently had a co-working session at BAS exploring the use of Dask for tiling large images. I would love to use the modern open-source Python tools you mentioned above more consistently.
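For readers unfamiliar with tiling, the idea can be sketched in plain NumPy as below. This is not Dask code and not what was explored in the session; the helper `iter_tiles`, the tile size, and the per-tile mean subtraction are placeholders chosen for illustration. Dask's chunked arrays apply the same split-process-reassemble pattern lazily and in parallel.

```python
import numpy as np


def iter_tiles(image: np.ndarray, tile_size: int):
    """Yield (row, col, tile) for non-overlapping tiles; edge tiles may be smaller."""
    rows, cols = image.shape[:2]
    for r in range(0, rows, tile_size):
        for c in range(0, cols, tile_size):
            yield r, c, image[r:r + tile_size, c:c + tile_size]


# Small stand-in for a large SAR scene (shape chosen so edge tiles are partial).
scene = np.arange(5 * 7, dtype=np.float32).reshape(5, 7)
output = np.empty_like(scene)
tile_shapes = []

for r, c, tile in iter_tiles(scene, tile_size=3):
    tile_shapes.append(tile.shape)
    # Placeholder per-tile processing: subtract the tile mean.
    output[r:r + tile.shape[0], c:c + tile.shape[1]] = tile - tile.mean()
```

Each tile is a view into the original array, so only the output buffer is extra memory. Processing that depends on neighbouring pixels would need overlapping tiles, which this sketch does not handle.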