ESIPFed / gsoc

Project ideas and mentor guidance for ESIP members to participate in Google Summer of Code.
Apache License 2.0

Configurable, Science Aware Tool for Labelling Clouds #27

Open BenGalewsky opened 4 years ago

BenGalewsky commented 4 years ago

ESIP Member Organization

National Center for Supercomputing Applications (NCSA)


Ben Galewsky

Project Idea

Engineer a Jupyter notebook that allows researchers to build custom tools for labelling clouds in imagery from Earth-observing systems such as the Terra satellite.

Information for students

Just the general information for students. See the ESIP Student Guide.


Clouds are the greatest source of climate change uncertainty. Twenty years of Terra satellite data give us an incredible opportunity to train sophisticated models to further understand climate processes. In order to develop these models, we need an extensive set of labelled data based on the various instruments housed on the Terra and Aqua satellites. Labelling clouds can be difficult, especially against a snowy backdrop.

This task can be improved by considering images captured in different wavelengths, or from spatial shifts. Further improvements can be gained by considering the physics of the underlying elements.
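As a rough illustration of how a second wavelength can help, the sketch below computes a normalized difference between a visible band and a shortwave-infrared band. It relies on the general observation that snow tends to be bright in the visible and dark in the shortwave infrared, while many clouds stay bright in both. The function names, the reflectance values, and the 0.4 threshold are assumptions made for this example, not part of the project. Plain Python lists are used to keep it dependency-free; real imagery would more likely be held in array types.

```python
def normalized_difference(band_a, band_b):
    """Per-pixel (a - b) / (a + b) for two equally sized 2D grids."""
    result = []
    for row_a, row_b in zip(band_a, band_b):
        out_row = []
        for a, b in zip(row_a, row_b):
            total = a + b
            # Guard against division by zero for completely dark pixels.
            out_row.append((a - b) / total if total != 0 else 0.0)
        result.append(out_row)
    return result


def snow_candidates(visible, shortwave_ir, threshold=0.4):
    """Flag pixels whose index exceeds the (illustrative) threshold."""
    index = normalized_difference(visible, shortwave_ir)
    return [[value > threshold for value in row] for row in index]


# Hypothetical reflectances for a 2 x 2 scene (made-up numbers).
visible = [[0.8, 0.7], [0.8, 0.1]]
shortwave_ir = [[0.1, 0.6], [0.7, 0.1]]

mask = snow_candidates(visible, shortwave_ir)
print(mask)
```

Only the top-left pixel is bright in the visible and dark in the shortwave infrared, so it is the only one flagged as likely snow. The other bright pixels remain candidates for a cloud label.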

We believe that the best way to create labelling tools is to provide a highly customizable platform that researchers can program in the familiar Python and Jupyter Notebook environments. We are working with the GeoViews Python library to build just such a framework.

There is plenty of data available from the Terra and Aqua satellites, and this approach could be generalized to many other areas of science.

The general idea is to think through the user experience to give researchers powerful tools to label the clouds. There are multiple images available for a given location on the earth. Users should be able to flip through, overlay, or display them side-by-side. You might want to implement a "Magic Wand" style selection to create a polygon on one layer. Then users could adjust the polygon given pixels found on other layers. Assume that scientists will want to create numeric tools to give more hints based on their knowledge of the physical processes.
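One possible reading of the "Magic Wand" idea is region growing from a clicked pixel, followed by a refinement step that consults a second layer. The sketch below is a minimal version of that under stated assumptions: `magic_wand`, `refine_with_layer`, the tolerance, and the sample grids are all hypothetical and are not taken from the project's code. It returns a set of pixels; turning that set into a polygon outline would be a separate step not shown here.

```python
from collections import deque


def magic_wand(image, seed, tolerance):
    """Return the (row, col) pixels 4-connected to `seed` whose value
    is within `tolerance` of the seed pixel's value."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    selected = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in selected:
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    selected.add((nr, nc))
                    queue.append((nr, nc))
    return selected


def refine_with_layer(selection, other_layer, min_value):
    """Keep only the selected pixels that a second layer also supports."""
    return {(r, c) for r, c in selection if other_layer[r][c] >= min_value}


# Two hypothetical layers covering the same 3 x 3 area (made-up values).
layer_a = [
    [0.9, 0.9, 0.2],
    [0.8, 0.9, 0.2],
    [0.1, 0.2, 0.9],
]
layer_b = [
    [0.7, 0.1, 0.0],
    [0.6, 0.8, 0.0],
    [0.0, 0.0, 0.9],
]

selection = magic_wand(layer_a, seed=(0, 0), tolerance=0.15)
refined = refine_with_layer(selection, layer_b, min_value=0.5)
print(sorted(selection))
print(sorted(refined))
```

The bright pixel at (2, 2) is left out of the selection because it is not connected to the seed through similar pixels. The refinement step then drops (0, 1), where the second layer disagrees. A scientist's own numeric rule could replace the simple `min_value` test.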

Technical stuff

Python, Jupyter Notebooks, Conda

Helpful experience, but not required: Geographic Information Systems (GIS), JavaScript

First steps

Try out our demo from the ESIP Winter Meeting on BinderHub

Watch this talk on the importance and challenges of labelling cloud data

Example Images

Here are some examples of images of the same area captured with different instruments or wavelengths.

Learn a bit more about the instruments:

snack0verflow commented 4 years ago

Hi @BenGalewsky, I just wanted to know if there are any evaluative tasks that I need to do, apart from the ones listed here. I am asking in the context of GSoC 2020. I have some experience in Deep Learning and would love to work on this project. :))

snack0verflow commented 4 years ago

Also, could you please suggest some active repositories to which I can start contributing? Thanks :)

0xamogh commented 4 years ago

@BenGalewsky Hey! The work going on in this organisation seems really interesting. I had a question regarding the above idea. Does it involve exploring and training on data, or is it more along the lines of building tools that help others do ML?

BenGalewsky commented 4 years ago

The focus of this project is to develop powerful labelling tools to create the training sets needed by climate science. If you want to use them for developing your own models feel free, but that is not part of this project.

punitgalav commented 4 years ago

Hi @BenGalewsky, I am Punit Galav, currently pursuing a Masters in Geoinformatics and Natural Resources Engineering at IIT Bombay. I have all the relevant skills required for this project and have studied GIS, remote sensing of natural resources, scattering models, and machine learning. I understand the importance of developing such a framework and am looking forward to contributing to this project.