ECMWFCode4Earth / challenges_2019

Have a look at the challenges proposed for the 2019 edition of ECMWF's Summer of Weather Code.

Challenge #12 - Machine learning for predicting extreme weather hazards #14

Closed jwagemann closed 3 years ago

jwagemann commented 5 years ago

Challenge 12

Machine learning for predicting extreme weather hazards

Goal: To use ECMWF/Copernicus open datasets to evaluate machine learning (ML) techniques for better predicting one specific kind of extreme weather event, e.g. drought or hurricanes, and to provide templates for future ML work


Mentors: @cvitolo, @StephanSiemen, @jwagemann

Skills required:
- Data Science
- Experience in building machine learning algorithms
- Knowledge of meteorological and climate data and formats desirable
- Knowledge of extreme weather hazards desirable

Challenge description

This challenge is of an explorative nature. The aim of this challenge is to have a better understanding of the feasibility, accuracy and challenges of using ECMWF/Copernicus open datasets to better predict extreme weather events.

Possible datasets available:

One open dataset available from ECMWF / Copernicus is the climate reanalysis product ERA5. It extends back to 1979, has global coverage and an hourly temporal resolution. We also have data on fire risk, air quality and floods.

Possible approach

A possible approach could be:

Depending on the extreme weather hazard chosen and the algorithm, there are different possible outcomes, e.g.

Since this challenge is very explorative, we would like to have detailed documentation of each step taken. It would also be valuable to have a detailed description of how datasets should be prepared. We would like to get a better understanding of the current challenges and limitations that machine learning with weather / climate data entails.

Potential questions that can be explored:

lkugler commented 5 years ago

Hi, what a nice challenge! We'd apply for this one but to prepare a detailed plan we would need some details concerning the observational data for wildfires and floods, which is new to us, unlike ERA5. For example: what kind of database/dataset is it, how is one given access to it, how many years of data are there? Thanks!

cvitolo commented 5 years ago

Glad to hear you are interested!

If you are interested in wildfires, fire radiative power has been observed from satellites since 2003 (see CAMS GFAS https://apps.ecmwf.int/datasets/data/cams-gfas/). Burned areas are available from a third-party data provider (GFED4 https://www.globalfiredata.org/data.html).

We also have a fire danger forecasting system that provides valuable information.

Data are available in GRIB or NetCDF. You can access the ECMWF archive using the Web API ( https://confluence.ecmwf.int/plugins/servlet/mobile?contentId=22907869#content/view/22907869 ).
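
As a rough illustration of the Web API route mentioned above, here is a minimal Python sketch using the `ecmwf-api-client` package (`ecmwfapi`). The helper names `build_gfas_request` and `download` are hypothetical, and the request values (dataset name, `param`, `stream`, `type`, `format` and so on) are assumptions that should be checked against the request shown on the dataset's web page. An ECMWF account and an API key file are needed for the download itself.

```python
from datetime import date


def build_gfas_request(start, end, target):
    """Build a request dictionary for daily GFAS fire radiative power.

    start, end: ISO dates (YYYY-MM-DD). target: output file name.
    All keyword values below are assumptions for illustration; copy the
    exact request from the dataset web interface before relying on them.
    """
    first = date.fromisoformat(start)
    last = date.fromisoformat(end)
    if first > last:
        raise ValueError("start date must not be after end date")
    return {
        "class": "mc",
        "dataset": "cams_gfas",   # assumed dataset identifier
        "date": f"{first.isoformat()}/to/{last.isoformat()}",
        "expver": "0001",
        "levtype": "sfc",
        "param": "99.210",        # assumed code for wildfire radiative power
        "step": "0-24",
        "stream": "gfas",
        "time": "00:00:00",
        "type": "ga",
        "format": "netcdf",       # assumed; omit to receive GRIB
        "target": target,
    }


def download(request):
    """Send the request to the ECMWF Web API (needs credentials and network)."""
    # Imported here so the request can be built without the client installed.
    from ecmwfapi import ECMWFDataServer

    server = ECMWFDataServer()
    server.retrieve(request)


if __name__ == "__main__":
    request = build_gfas_request("2018-07-01", "2018-07-31", "gfas_frp.nc")
    print(request)
    # download(request)  # uncomment once an API key is configured
```

Building the request separately from the download makes it easy to inspect or log what will be asked of the archive before any data transfer starts.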


jwagemann commented 5 years ago

Hi, to add information on the flood data: it can be provided from the Global Flood Awareness System (GloFAS). Here is an overview of the data that is available on request. We will be able to make the data available and provide FTP access to it. The data is available in NetCDF.

masterflorin commented 5 years ago

Hi! Wow, and congrats to ECMWF for organizing a summer of code. I took part in a similar competition several years ago (GSoC) under the OSGeo umbrella. I enjoyed it a lot and it gave me a good start in terms of open source software development. 👍

I wonder, is there any comprehensive repository or database that links to various climate/weather datasets? I'd be interested in attempting a solution that uses multiple data sources, something like data fusion, but I have hardly any experience doing that. I also need to get re-acquainted with NetCDF and Sentinel data, as I haven't touched them recently. 🔨

One of the things that I appreciate in your above posted message is the nature of the challenge, there are so many ways to go about it which makes it so intriguing.

jwagemann commented 5 years ago

Hi @masterflorin , thanks for your interest in this challenge. Open climate/meteorological datasets are available via

One great dataset is the ERA5 reanalysis, available from the Copernicus Climate Data Store. Let us know if you have specific questions about the data. We are happy to help.

See links to Flood and Fire data in the thread above. HTH, Julia

masterflorin commented 5 years ago

Thank you for that lightning-fast response @jwagemann.

Great! I'll check those out.

-FlorinC

tommylees112 commented 5 years ago

Hi Julia, we would be very interested in exploring the possibility of predicting drought. One thing I have questions about is the problem of defining drought, because there are many variables that drive and respond to drought conditions. Would this be something you would like us to explore in the application form? Thank you so much for setting these up; we are extremely excited about getting involved! Tommy

jwagemann commented 5 years ago

Hi @tommylees112 , yes please. Drought prediction with machine learning is exactly in the scope of this challenge. We are looking forward to your proposal.

tommylees112 commented 5 years ago

Hi @jwagemann

The description states:

Set up the machine learning model and evaluate the results, e.g. based on a set of extreme weather hazards in the past (e.g. ECMWF has a database with past extreme weather events)

Is it possible to see the database with past extreme weather events or at least view the metadata so we can see the kind of information about each past event that you have?

Thanks so much for your help! Tommy

jwagemann commented 5 years ago

Hi @tommylees112 , unfortunately it is not possible to make the database publicly available. However, we have:

  1. a severe weather events database
  2. a severe event catalogue, and
  3. a fire events database

Regarding 1: This database holds some information on the geographical area of the event, the nature of the event (e.g. data outage, excess of thresholds, etc.) and the severity of the event (e.g. severe).

Regarding 2: This is a collection of past severe events, e.g. the heatwave in Europe in 2018. It is a more detailed description with analysis of data in order to better replicate and understand the event. From this catalogue, we could pick some drought events from the past (as an example, but also floods, etc.) to set up and validate the machine learning model.

Regarding 3: If fire data is of interest, we have a collection of severe fire events since 2018. Here, we also have the geographical area of the event and a description of the event, as well as the threshold surpassed.
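
To make the validation idea in point 2 more concrete, here is a minimal sketch of how model output could be scored against a list of past events. It assumes the predictions and the catalogue have already been reduced to yes/no flags per time step or region, which is not something the databases above provide directly. The function name `contingency_scores` and the sample data are hypothetical; the scores are the standard probability of detection (POD), false alarm ratio (FAR) and critical success index (CSI).

```python
import math


def contingency_scores(predicted, observed):
    """Score binary event predictions against observed events.

    predicted, observed: equal-length sequences of booleans, one entry
    per time step or region (True means an event was predicted/observed).
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")

    hits = sum(1 for p, o in zip(predicted, observed) if p and o)
    misses = sum(1 for p, o in zip(predicted, observed) if not p and o)
    false_alarms = sum(1 for p, o in zip(predicted, observed) if p and not o)

    def ratio(numerator, denominator):
        # Scores are undefined when their denominator is zero.
        return numerator / denominator if denominator else math.nan

    return {
        "hits": hits,
        "misses": misses,
        "false_alarms": false_alarms,
        "pod": ratio(hits, hits + misses),
        "far": ratio(false_alarms, hits + false_alarms),
        "csi": ratio(hits, hits + misses + false_alarms),
    }


if __name__ == "__main__":
    # Made-up flags for six periods, purely for illustration.
    predicted = [True, True, False, False, True, False]
    observed = [True, False, False, True, True, False]
    print(contingency_scores(predicted, observed))
```

Because extreme events are rare, scores like these are usually more informative than plain accuracy, which rewards a model for predicting "no event" everywhere.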

I hope this helps. Julia

jwagemann commented 5 years ago

REMINDER: The deadline to register and submit your proposal is this coming Sunday, 21 April at 23:59 GMT!

The application process is a 2-step process:

Applications without a submitted proposal will not be taken into consideration! We are looking forward to your proposal!

ppalmes commented 5 years ago

Assuming the proposal is accepted, is it possible to publish a paper out of this work? If yes, are there any requirements?

jwagemann commented 5 years ago

Hi @ppalmes , yes, of course. We even encourage it, as this challenge is very explorative and more research is needed on this topic. The only requirement is to use open data from Copernicus / ECMWF. HTH, Julia

melioristic commented 5 years ago

Dear Assignees

I came to know about this challenge today. Could I have some guidance on flood forecasting? I want to do catchment-level flood forecasting and plan to use Sentinel-2 data and DEMs to model it. What kind of data is available as flood labels? I will try to use four Sentinel-2 bands [B2, B3, B4 and B8], then a DEM and finally radar precipitation data [from some source] to make flood predictions using machine learning.

In short, what kind of flood data can I have access to, not being a participant of the challenge? As the competition deadline has passed, is it possible to take on this project as my Master's thesis? I am pursuing an MSc in Environmental Engineering at ETH Zurich, Switzerland.

P.S. ETH provides an excellent opportunity to do a Master's thesis in association with organisations/institutions.

Yusuf-Oluwatoki commented 5 years ago

Hi @jwagemann,

I am very interested in this research project and I would like to know if it is still open for proposals. I have just concluded my Master's essay on modeling extreme climate events for impact studies. As a further step, I want to use state-of-the-art techniques to validate results from statistical models such as GEV and GPD. Let me know if the challenge is still open or whether I can proceed with some personal research related to this challenge.

jwagemann commented 5 years ago

Hi @gapton76 and @melioristic , thanks a lot for your interest in this ESoWC challenge. Unfortunately, the applications for this year are now closed and we have already identified three projects that will work on predicting drought, fire and floods with different machine learning approaches. Please watch this space and follow us on Twitter to stay up to date with all future announcements.

You can specifically follow the machine learning projects on GitHub (drought, fire and flood) and also get in touch with the teams to discuss their work.

Cheers, Julia

melioristic commented 5 years ago

@jwagemann Thanks for the information. Will surely get in touch with the team and see what I can learn from them and how I can contribute to this field.