ECMWFCode4Earth / challenges_2024

Discover the ECMWF Code for Earth 2024 challenges

Challenge 12 - Data visualisation for the CDS and ADS #5

Open RubenRT7 opened 4 months ago

RubenRT7 commented 4 months ago

Challenge 12 - Data visualisation for the CDS and ADS

Stream 1 - Data Visualization and visual narratives for Earth Sciences applications

Goal

To produce a number of operational web applications that allow users to visualise and explore the datasets on the CDS and ADS without downloading large volumes of data.

Mentors and skills


Challenge description

The Climate Data Store (CDS) and Atmosphere Data Store (ADS) are undergoing modernisation and are scheduled for production release in September 2024 (a beta release is expected by March 2024). The modernised data stores aim to improve the accessibility of the data via improved data visualisation and exploratory tools, for example the soon to be published Climate-Pulse.

We would like exploratory web applications for as many of the datasets in the CDS and ADS as possible, so that users can visualise the data before downloading large quantities. The exploratory applications should follow consistent design principles, e.g. for the layout and interactivity tools. However, every dataset is unique and has its own special requirements and features to highlight, therefore a degree of fine-tuning will be required for each dataset being represented.

We can provide some template JS react applications and the framework to deploy on the web. The successful participants should enhance this template, and then develop a unique application deployment for each dataset. The deployment should consider the resources required for operational purposes, e.g. any processing of data required for visualisation, storage required for processed data, and regular updates to data.
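As a rough illustration of the resource question above, the following Python sketch estimates the uncompressed storage needed for pre-processed visualisation data on a global regular latitude-longitude grid. The function name, the 0.25° resolution, the daily time step, and the 4-byte values are all assumptions chosen for the example, not figures supplied for this challenge.

```python
def grid_storage_bytes(resolution_deg, n_steps, n_variables=1, bytes_per_value=4):
    """Estimate uncompressed storage for a global regular lat-lon grid.

    All parameters are illustrative assumptions, not challenge requirements.
    """
    n_lat = int(round(180 / resolution_deg)) + 1  # rows include both poles
    n_lon = int(round(360 / resolution_deg))      # longitude wraps around
    return n_lat * n_lon * n_steps * n_variables * bytes_per_value


if __name__ == "__main__":
    # One variable, one field per day for a year, on a 0.25 degree grid.
    one_year = grid_storage_bytes(0.25, n_steps=365)
    print(f"{one_year / 1e9:.2f} GB per variable per year (uncompressed)")
```

An estimate like this only bounds the raw array size; compression, tiling, or reduced precision for display would change the real footprint.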

The applications will have a frontend written in React (JavaScript) and will be deployed via Helm and Kubernetes, so experience with these technologies will be an advantage for candidates' applications. Any backend processing code will be expected to be written in Python.

danghieutrung commented 3 months ago

Hi! We are a team currently working on a climate project, and we have some free time in the summer for this challenge. We are crafting a proposal and we have a few questions:

The modernised data stores aim to improve the accessibility of the data via improved data visualisation and exploratory tools, for example the soon to be published Climate-Pulse.

  1. Is integration of the JS app into Climate-Pulse a part of this challenge? If yes, will the app be readily available to integrate into the Climate-Pulse system, or will the integration require some extra work?

We can provide some template JS react applications and the framework to deploy on the web.

  2. Can we know in advance some of the features already implemented in the template? We assume it contains basic features like a dropdown bar, zooming, etc.

However, every dataset is unique and has its own special requirements and features to highlight, therefore a degree of fine-tuning will be required for each dataset being represented.

  3. Is our understanding correct that we will manually implement unique features for every dataset? If so, is that feasible given the length of the project (4 months) and the number of datasets? Or should there be shared features for datasets of the same group (datasets with the same tags, e.g. Health, Global, Reanalysis)? Can you give us some examples of a unique feature for a particular dataset?

  4. Are all datasets stored either in tabular formats (CSV, XLSX, etc.) or as netCDF files? Are there any other types of dataset files?

We may have a few more questions in the future. Thank you!

cguz commented 3 months ago

Hi @danghieutrung,

My name is Cesar and I'm reaching out along with @mairitikk.

We're currently looking for a team to join, and your project particularly aligns with our skills and interests. We believe we could make valuable contributions and would be excited to discuss how we could fit in.

Would you be open to adding two more members to your team?

Thank you!

JamesVarndell commented 2 months ago

Hi @danghieutrung,

Many thanks for your questions! I will do my best to answer them below.

  1. Is integration of the JS app into Climate-Pulse a part of this challenge? If yes, will the app be readily available to integrate into the Climate-Pulse system, or will the integration require some extra work?

Climate Pulse is a standalone web application which visualises near-real-time ERA5 reanalysis data. This application was mentioned in the project description as an example of the kind of visualisations we expect from this project, but is not itself part of this project.

  1. Can we know in advance some of the features already implemented in the template? We assume it contains basic features like a dropdown bar, zooming, etc.

We tend to use open-source ReactJS frameworks like SemanticUI or MUI with our own Copernicus themes. We also use tools like OpenLayers for interactive maps, and React-Globe-GL for spinning globes. But we are also flexible and happy to consider other React components.

  1. Is our understanding correct that we will manually implement unique features for every dataset? If so, is that feasible given the length of the project (4 months) and the number of datasets? Or should there be shared features for datasets of the same group (datasets with the same tags, e.g. Health, Global, Reanalysis)? Can you give us some examples of a unique feature for a particular dataset?

Good question! We don't expect it to be possible to create data viewers for every dataset in the CDS and ADS, so we would provide a priority list indicating which datasets most need viewers. In terms of unique features for certain datasets, this could be as simple as extra dropdowns for datasets with multiple height levels, or it could mean a different visualisation method for datasets which are only provided over a small area (e.g. some datasets show very high-resolution data over individual European cities). However, there are also lots of global datasets which could likely be grouped into a single viewer "template" - so some datasets will be more similar to each other than others.
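To make the idea of shared viewer "templates" with per-dataset extras more concrete, here is a minimal Python sketch of how a backend might choose a template and controls from dataset metadata. The `DatasetInfo` fields, the template names, and the 10-degree threshold are hypothetical and invented for illustration; they are not part of the actual CDS/ADS templates.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DatasetInfo:
    """Hypothetical metadata record; field names are illustrative only."""
    name: str
    levels: Optional[List[str]] = None  # e.g. height or pressure levels
    bbox: Optional[Tuple[float, float, float, float]] = None  # west, south, east, north


def viewer_config(ds: DatasetInfo, regional_span_deg: float = 10.0) -> dict:
    """Pick a viewer template and the extra controls a dataset needs."""
    controls = ["variable", "date"]
    if ds.levels:
        controls.append("level")  # extra dropdown for multi-level datasets

    template = "global-map"
    if ds.bbox is not None:
        west, south, east, north = ds.bbox
        # Small-area datasets (e.g. a single city) get a zoomed-in viewer.
        if (east - west) <= regional_span_deg and (north - south) <= regional_span_deg:
            template = "regional-map"

    return {"dataset": ds.name, "template": template, "controls": controls}


if __name__ == "__main__":
    print(viewer_config(DatasetInfo("example-multi-level", levels=["500", "850"])))
    print(viewer_config(DatasetInfo("example-city", bbox=(2.2, 48.8, 2.5, 48.9))))
```

Keeping the decision in one function means most global datasets fall through to the default template, and only the exceptions need explicit handling.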

  1. Are all datasets stored either in tabular formats (CSV, XLSX, etc.) or as netCDF files? Are there any other types of dataset files?

The great majority of the data we will ask you to visualise will be in netCDF or GRIB format, with some in CSV format.
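Since a backend would need to route each file to an appropriate reader, the sketch below shows one way to guess the format from a file's leading bytes using only the Python standard library. It relies on the well-known signatures (GRIB messages begin with `GRIB`, classic netCDF files with `CDF`, and netCDF-4 files use the HDF5 signature); the function name and the returned labels are assumptions made for this example.

```python
from pathlib import Path

# netCDF-4 files are HDF5 containers, so they carry the HDF5 signature.
# Note: not every HDF5 file is netCDF-4; this is only a first-pass guess.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def detect_format(path) -> str:
    """Guess the data format from the leading bytes, then the file suffix."""
    path = Path(path)
    with path.open("rb") as f:
        head = f.read(8)
    if head.startswith(b"GRIB"):
        return "grib"
    if head.startswith(b"CDF") or head.startswith(HDF5_SIGNATURE):
        return "netcdf"
    if path.suffix.lower() == ".csv":
        return "csv"  # CSV has no signature, so fall back to the suffix
    return "unknown"
```

In practice the label returned here would select a reader library for that format; which libraries to use is left open.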

I hope that's helpful - please get in touch if you have any more questions.

Many thanks, James Varndell

danghieutrung commented 2 months ago

Hi @JamesVarndell!

Thank you so much for the detailed response.

Unfortunately, I have checked the terms and agreements and found that I am not eligible for the program.

Many thanks and best of luck with the project!