ECMWFCode4Earth / challenges_2023

Discover the ECMWF Code for Earth 2023 challenges

Challenge 22 - Discovering hidden patterns on Climate Data Store #6

Open EsperanzaCuartero opened 1 year ago


Challenge 22 - Discovering hidden patterns on Climate Data Store

Stream 2 - Machine Learning for Earth Science

Goal

The CDS produces a wide range of transactional records and operational logs. These contain a lot of hidden information that could give valuable insight for understanding and predicting system patterns, user behaviours and preferences, and early warnings, among other things. That insight could lead to improvements in the system and more dynamic configuration (QoS).

The aim of the project is to explore what ML/AI can do to reveal this information and how the results could later be applied to CADS operations.

Mentors and skills


Note: Only nationals from European Union (EU) Member States and countries associated with EU’s Space Programme (currently Iceland and Norway) are eligible to participate (see Terms and Conditions).


Challenge description

At present, the information obtained about users is based on very generic indicators and graphs. Deeper exploration of the data and logs is done case by case, when particular issues or requests need to be addressed.

The number and volume of transactions and data are now such that these case-by-case operations are becoming increasingly complicated.

Data/System to use

Transactional information (user requests) for the Climate and Atmosphere Data Stores is stored in a Postgres database. Operational information from the system components is recorded in various logs.

Both sources of information are indexed in Splunk in near real time. The information can be exploited directly in Splunk or exported for use in other environments.
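
As a rough illustration of the export route, the sketch below aggregates exported request records into one feature row per user. It assumes the records have been exported to CSV; the column names (`user_id`, `dataset`, `submitted_at`, `bytes`) and the sample rows are hypothetical and do not reflect the actual CDS schema.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical export: column names and rows are assumptions for illustration,
# not the real CDS/Splunk schema.
SAMPLE_EXPORT = """user_id,dataset,submitted_at,bytes
u1,dataset-a,2023-03-01T08:15:00,1200
u1,dataset-a,2023-03-01T09:40:00,800
u2,dataset-c,2023-03-01T23:05:00,5000
u1,dataset-b,2023-03-02T08:55:00,400
"""


def user_features(csv_text):
    """Aggregate exported request rows into one feature dict per user."""
    acc = defaultdict(
        lambda: {"requests": 0, "bytes": 0, "datasets": set(), "hours": []}
    )
    for row in csv.DictReader(io.StringIO(csv_text)):
        a = acc[row["user_id"]]
        a["requests"] += 1
        a["bytes"] += int(row["bytes"])
        a["datasets"].add(row["dataset"])
        a["hours"].append(datetime.fromisoformat(row["submitted_at"]).hour)
    return {
        user: {
            "requests": a["requests"],
            "total_bytes": a["bytes"],
            "distinct_datasets": len(a["datasets"]),
            # Naive arithmetic mean; it ignores the wrap-around at midnight.
            "mean_hour": sum(a["hours"]) / len(a["hours"]),
        }
        for user, a in acc.items()
    }


features = user_features(SAMPLE_EXPORT)
```

For the sample rows, `features["u1"]` reports three requests across two distinct datasets. Feature tables of this kind are one possible input for the models discussed under Solution.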

Solution

Applying ML/AI models to the data collected by the system will allow the extraction of hidden knowledge about, for example, user patterns and cause-and-effect relationships behind issues.

This knowledge will allow us to better understand and tune the system, put in place more dynamic configuration (QoS), implement new features, inform users, and organise the catalogue structure, among other things.
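
One possible starting point for finding user patterns is unsupervised clustering. The sketch below is a plain k-means (Lloyd's algorithm) in pure Python, grouping users by two hypothetical features: requests per day and mean request size. The feature choice, the toy values and the deterministic seeding are assumptions for illustration only; the challenge does not prescribe a model.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical per-user features: (requests per day, mean request size in GB).
users = {
    "u1": (2.0, 1.0),
    "u2": (3.0, 1.5),
    "u3": (40.0, 8.0),
    "u4": (38.0, 7.5),
    "u5": (2.5, 1.2),
}
labels, centroids = kmeans(list(users.values()), k=2)
clusters = dict(zip(users, labels))
```

On these toy points the three light users (`u1`, `u2`, `u5`) end up in one cluster and the two heavy users (`u3`, `u4`) in the other. With real data the features would normally be scaled first, and a library implementation such as scikit-learn's `KMeans` would be a more robust choice than this hand-rolled version.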

Ideas for the implementation