Open jwagemann opened 2 years ago
Caveat: I have not seen a wave weather dataset yet, so I do not know if the proposal I am making is even possible. Having a look at one (is there a standard somewhere?) may make me change my mind, and what I describe here is a very rough sketch.
So I think a Fourier decomposition of the wave spectrum would be of use for this problem, and it would make a reasonable basis for storing 2D data over time. You could have a couple of 2D Cartesian matrices, where one stores the "ground swell frequency and amplitude" and another stores the "wind waves frequency and amplitude". The grid would be chosen arbitrarily, and I guess that is a separate discussion that must be had, just like the fetch and duration of the "sea state". Once you have these values stored over a field, one could think about how to display them:
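To make the idea above a bit more concrete, here is a minimal sketch in Python with NumPy. It assumes each grid point has a regularly sampled surface elevation record, and it splits swell from wind waves at a fixed 0.1 Hz boundary; that boundary, the 0.03 Hz lower limit, and the names `band_peak` and `decompose_grid` are all illustrative assumptions, not something taken from a real wave product.

```python
import numpy as np

SPLIT_HZ = 0.1  # assumed swell / wind-wave boundary (10 s period)

def band_peak(elevation, dt, f_lo, f_hi):
    """Return (frequency, amplitude) of the strongest Fourier component in [f_lo, f_hi)."""
    n = elevation.size
    freqs = np.fft.rfftfreq(n, d=dt)
    # Scale so that a pure sine of amplitude A yields a magnitude of A in its bin.
    amps = 2.0 * np.abs(np.fft.rfft(elevation)) / n
    mask = (freqs >= f_lo) & (freqs < f_hi)
    idx = np.argmax(np.where(mask, amps, 0.0))
    return freqs[idx], amps[idx]

def decompose_grid(records, dt):
    """records has shape (ny, nx, nt); returns two (ny, nx, 2) arrays of (freq, amp)."""
    ny, nx, _ = records.shape
    swell = np.zeros((ny, nx, 2))
    wind = np.zeros((ny, nx, 2))
    for j in range(ny):
        for i in range(nx):
            swell[j, i] = band_peak(records[j, i], dt, 0.03, SPLIT_HZ)
            wind[j, i] = band_peak(records[j, i], dt, SPLIT_HZ, 0.5)
    return swell, wind

# Toy field: 2 x 3 grid points, 1000 s of data sampled at 1 Hz.
t = np.arange(1000) * 1.0
records = np.empty((2, 3, t.size))
for j in range(2):
    for i in range(3):
        swell_amp = 1.0 + 0.5 * i  # swell grows eastwards in this synthetic example
        records[j, i] = (swell_amp * np.sin(2 * np.pi * 0.08 * t)
                         + 0.5 * np.sin(2 * np.pi * 0.25 * t))

swell, wind = decompose_grid(records, dt=1.0)
```

Here `swell[..., 0]` and `swell[..., 1]` are the two matrices (frequency and amplitude) for the swell band, and `wind` holds the same for wind waves. Picking only the single strongest component per band is a deliberate simplification; a real sea state would need more than one number per band.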
One way would be to have two heat maps, one for each wave type, where the low end of the colour scale would be the minimum "trough" amplitude and the high end the maximum "crest" amplitude. There would then be a discrete grading in between those values. You could display both of these separately, or you could blend them in a heat-map mix mode, with what I would hope to be some very interesting displays.
Adding more wave modes to this display model is just a matter of adding more matrices.
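A rough sketch of the discrete grading and the blend, again in Python with NumPy. The choice of five levels, the mapping of swell to the red channel and wind waves to the blue channel, and the function names `grade` and `blend` are assumptions made for illustration only.

```python
import numpy as np

def grade(amplitude, n_levels=5):
    """Quantize a field into n_levels discrete steps, scaled to 0..1 between its min and max."""
    lo, hi = amplitude.min(), amplitude.max()
    if hi == lo:
        # A constant field has no range to grade; map it to the lowest level.
        return np.zeros_like(amplitude, dtype=float)
    levels = np.floor((amplitude - lo) / (hi - lo) * n_levels)
    levels = np.clip(levels, 0, n_levels - 1)  # the maximum falls into the top level
    return levels / (n_levels - 1)

def blend(swell_amp, wind_amp, n_levels=5):
    """Return an (ny, nx, 3) RGB image: swell drives red, wind waves drive blue."""
    rgb = np.zeros(swell_amp.shape + (3,))
    rgb[..., 0] = grade(swell_amp, n_levels)
    rgb[..., 2] = grade(wind_amp, n_levels)
    return rgb

swell_amp = np.array([[0.0, 1.0], [2.0, 4.0]])
wind_amp = np.ones((2, 2))
rgb = blend(swell_amp, wind_amp)
```

The resulting array holds floats in the 0 to 1 range, which is the form that image functions such as matplotlib's `imshow` accept for RGB data. Adding a third wave mode would amount to filling the green channel from a third matrix.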
Bivariate maps might be of use in this context if you can quantize the metrics. This goes for many other challenges as well, I think.
I have a mock example coded here for R (Python should be able to handle things similarly, given some wrangling): https://bluegreenlabs.org/post/map-building-3/
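For the Python side, the quantization step of a bivariate map could look roughly like the sketch below. This is not a port of the R example linked above; it is an independent illustration, and the quantile binning and the name `bivariate_classes` are assumptions.

```python
import numpy as np

def bivariate_classes(a, b, n_bins=3):
    """Assign each cell a class in 0..n_bins**2 - 1 from quantile bins of two variables."""
    def quantile_bin(x):
        # Interior quantile edges, e.g. the 1/3 and 2/3 quantiles for n_bins=3.
        edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
        return np.digitize(x, edges)  # bin index 0..n_bins-1 per cell
    return quantile_bin(a) * n_bins + quantile_bin(b)

# Two toy fields on a 3 x 3 grid, e.g. wave height and wave period.
a = np.arange(9, dtype=float).reshape(3, 3)
b = a.T
classes = bivariate_classes(a, b)
```

Each class index can then be looked up in a small n_bins by n_bins colour table to draw the map and its square legend.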
An image is worth a thousand words ;)
The convention is to present a coloured radar chart for the wave spectrum at each location. We can move the mouse cursor between grid points and a chart will show up. But it would be difficult to show how the spectra change spatially.
Let's limit this visualization to a route, e.g. a polyline. This route must be predetermined, either by sketching it interactively or by importing it from a GeoJSON file. The radial contour graphs of the spectra (at the grid points) are generated accordingly.
Then when we hover the mouse along this polyline (the cursor should be "snapped" to the line), the contour at the present point is displayed, and the contour for the previous grid point we just passed appears dimly in the background (i.e. with alpha transparency reduced).
(Yes, there are line graphs with this kind of transparency - all lines can appear dimly in the background. But I think wave spectra graphs will clutter up the figures if we show them all.)
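The snapping logic described above can be sketched in plain Python. The sketch assumes planar coordinates (not longitude/latitude on a sphere), assumes the user travels from the start of the route towards its end, and takes the positions of the grid points as distances along the route; `snap_to_polyline`, `spectra_to_draw`, and the 0.3 alpha value are hypothetical names and settings.

```python
import bisect
import math

def snap_to_polyline(point, polyline):
    """Return (snapped_xy, distance_along_route) for the closest point on the polyline."""
    px, py = point
    best = (math.inf, None, 0.0)
    travelled = 0.0
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        dx, dy = x1 - x0, y1 - y0
        seg_len2 = dx * dx + dy * dy
        seg_len = math.sqrt(seg_len2)
        # Projection parameter of the cursor onto this segment, clamped to [0, 1].
        t = 0.0 if seg_len2 == 0 else max(
            0.0, min(1.0, ((px - x0) * dx + (py - y0) * dy) / seg_len2))
        sx, sy = x0 + t * dx, y0 + t * dy
        d = math.hypot(px - sx, py - sy)
        if d < best[0]:
            best = (d, (sx, sy), travelled + t * seg_len)
        travelled += seg_len
    return best[1], best[2]

def spectra_to_draw(along, station_positions, dim_alpha=0.3):
    """Return [(station_index, alpha), ...]: the last station passed, plus the one before it dimmed."""
    current = max(0, bisect.bisect_right(station_positions, along) - 1)
    layers = [(current, 1.0)]
    if current > 0:
        layers.insert(0, (current - 1, dim_alpha))  # drawn first, so it sits in the background
    return layers

route = [(0, 0), (10, 0), (10, 10)]
stations = [0, 5, 10, 15, 20]  # distances of the grid points along the route
snapped, along = snap_to_polyline((12, 5), route)
layers = spectra_to_draw(along, stations)
```

A plotting front end would call these two functions on every mouse-move event and redraw only the radial charts listed in `layers`.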
Just grabbing another one from the bag of tools. One way to deal with spectra is textural ordination, basically a dimensionality reduction of the (2D) frequency domain using a PCA where you map the first three PCs to RGB colours. This is basically a trivariate map of the dominant factors of the frequency domain.
This is used, for example, on the 2D FFT components of regridded or moving-window data to map the texture of vegetation (see https://github.com/bluegreen-labs/foto). At least for the frequency domain, this allows you to capture spectra in fewer dimensions. Dropping one PC and swapping it for wave direction, or adding an alpha level, could provide a way to deal with complex multi-dimensional data in the 2D map domain.
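As a rough illustration of the PCA-to-RGB step for wave spectra, here is a NumPy sketch. It is not the foto implementation; it simply flattens each grid point's 2D spectrum into a feature vector and runs a PCA via SVD. The array layout `(ny, nx, n_freq, n_dir)` and the name `spectra_to_rgb` are assumptions.

```python
import numpy as np

def spectra_to_rgb(spectra):
    """Map (ny, nx, n_freq, n_dir) spectra to an (ny, nx, 3) RGB image from the first 3 PCs."""
    ny, nx = spectra.shape[:2]
    X = spectra.reshape(ny * nx, -1).astype(float)  # one row per grid point
    X -= X.mean(axis=0)                             # centre each spectral bin
    # PCA via SVD: the columns of U scaled by S are the principal component scores.
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :3] * S[:3]
    # Rescale each component independently to the 0..1 range of a colour channel.
    lo, hi = scores.min(axis=0), scores.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    rgb = (scores - lo) / span
    return rgb.reshape(ny, nx, 3)

rng = np.random.default_rng(0)
spectra = rng.random((4, 5, 6, 8))  # synthetic stand-in for real 2D wave spectra
rgb = spectra_to_rgb(spectra)
```

Swapping one PC for wave direction, as suggested above, would mean overwriting one of the three channels with a normalized direction field before display.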
@jwagemann would it be possible to link explicitly to products (CDS/ADS and their documentation)? Not everyone is familiar with all data products, and finding the documentation or toy data is a barrier to entry.
I like the PC (principal component) approach quite a bit too, as long as we have some way of keeping track of which frequencies and amplitudes we have extracted at each point, and perhaps of keeping a record of the data we used so that we can also generate a radar plot from each recorded datapoint. So in the end we could display a sensitive map, as per your PC example, where if you hover over a region a radar plot pops up as nguyenquangchien describes?
Sorry if I digress a bit into the guts of the subject, but from previous experience, one can focus much better on the "How" to display data once you have researched the "What" data you will get. For those interested, I found this compilation of "Common Data Formats Used by Marine Scientists", and... of course there are a lot! (even though many of them are generic and barely database related) https://marinedataliteracy.org/basics/formats.htm. Do we know if we have already settled on one in particular, or if there are some that have clear advantages over the others? Will we need to be able to digest all of these and store them in a single database with a sane(r) data structure? Has this work already been done? A subset that stands out to me is the formats based on GIS and NetCDF, as I think they are very well suited to the task and kinda future proof...
As you may have guessed, I am new to this weather hackathon thing :)
Most data from CDS / ADS come as netCDF or GRIB files. Sometimes results are zipped, but that generally isn't an issue. I have little experience with tools such as the CDS Toolbox, as I prefer to do my work offline (not in the ECMWF cloud) because most of my problems require comparatively small datasets.
There are different products which include ocean wave data: https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels?tab=form https://cds.climate.copernicus.eu/cdsapp#!/dataset/sis-ocean-wave-indicators?tab=overview
But these don't provide the spectra, or components to define them (as far as I could see, from a quick look).
A good list, but I think JSON/GeoJSON deserves a place here. GeoJSON may be better than GML. The former is more suitable for programming while still being language-agnostic; the latter is a bit outdated. For binary files, netCDF should replace GRIB, but HDF should also be considered.
HDF is the most complete solution, and it handles compression really easily. I think it would be the best choice server side. As far as I know, it is at least able to seamlessly read netCDF4 files and I think it can also write them, so we could export whole datasets to other applications that need this format. Incidentally, netCDF has a GRIB conversion tool and I don't seem to be able to find a similar one for HDF. If none is available, that would have to be implemented. We really need, I think, to be able to play with the other kids in terms of formats, as we would need to load our data from somewhere. I don't know if this project is someday aimed at being an open-source repository of recorded data, directly from the sensors and weather stations. That would be neat.
As for using JSON, that would be a neat way to request/push small datasets to and from clients. Also, this would be ideal for weather stations to send their data to the server for recording. For server-side storage, I think those formats have too much overhead?
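To give a feel for how small such a payload could be, here is a sketch of a single station observation encoded as a GeoJSON Feature using only the Python standard library. The property names (`station_id`, `hs_m`, and so on) and the example values are made up for illustration; only the Feature/Point structure and the longitude-first coordinate order come from the GeoJSON format itself.

```python
import json

def wave_observation_feature(station_id, lon, lat, time_utc,
                             hs_m, peak_period_s, mean_dir_deg):
    """Build a GeoJSON Feature for one wave observation (property names are hypothetical)."""
    return {
        "type": "Feature",
        # GeoJSON stores positions as [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {
            "station_id": station_id,
            "time": time_utc,
            "hs_m": hs_m,
            "peak_period_s": peak_period_s,
            "mean_dir_deg": mean_dir_deg,
        },
    }

feature = wave_observation_feature(
    "buoy-001", -9.5, 53.2, "2021-11-01T12:00:00Z", 2.4, 11.0, 270.0)
payload = json.dumps(feature)   # what a station would send to the server
decoded = json.loads(payload)   # what the server would get back
```

A full 2D spectrum per time step would inflate such a message considerably, which is consistent with keeping JSON for exchange and a binary format for storage.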
Great, I think at least we have Python as an intermediate platform to convert between netCDF and HDF.
Hey, so I'm also quite new to this hackathon thing, but definitely not new to visualizing all kinds of geographical data in Python, so let me pop in on this comment:
Sorry if I digress a bit into the guts of the subject but from previous experience, one can focus much better on the "How" to display data once you have researched the "What" data you will get.
From my point of view this very much depends on what "tools" you intend to use to visualize your data and where you want to go with the "tool" (e.g. what's the goal? just export a png or offer a gui to actually interact with the data?)
If you stick to Python (and the concern is visualization... not database management), I think the data type is really not that important, since there is a tremendous number of libraries that you can use to read almost any data format you like...
...and just a comment concerning NetCDF vs. HDF: HDF5 is the storage layer underneath netCDF-4, so in theory HDF can do anything that NetCDF does, but not vice versa.
I think most would agree that Python and HDF are the optimal backend solution, and I may have sidetracked the discussion here, as this has little to do with visualization. But I do like to think in terms of backward compatibility and future-proofing, as in my experience (not weather related, genomics) that was a big part of creating a successful solution. It has to do with gaining acceptance from existing projects, making them willing to adopt your solution and collaborate, as well as encouraging future projects. In a nutshell: don't code yourself into a corner.
Right now, I think that we have established a good basis for both visualization and backbone. I would love to hear other opinions, though.
I want to join the group. Shall we draft the requirements to be completed in 24 hours?
I'd love to put some of my time into this project too. But right now, and for the next couple of weeks, I am in full crunch mode, with very little time to spare. Nevertheless, I will try to produce some input if something is required.
Strictly speaking, the output of a wave model is the 2D wave spectrum, which describes the distribution of the wave energy in frequency and direction. There are different ways to distill that information into easier-to-plot parameters (see the IFS documentation part VII, chapter 10). The challenge would be to come up with even more innovative way(s) to plot the relevant part of this 2D distribution for a certain area of interest (a stretch of coastline, an oil field,...). This is not trivial.
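As an illustration of what "distilling" means here, the sketch below derives three commonly used integral parameters from a synthetic 2D spectrum with NumPy: significant wave height (4 times the square root of the zeroth moment), peak frequency, and a mean direction from the sine and cosine weighted integrals. It assumes evenly spaced directions covering the full circle, ignores direction conventions (coming from vs. going to), and uses a made-up spectrum; the function name `integral_parameters` is hypothetical, and the exact definitions used operationally should be checked against the documentation cited above.

```python
import numpy as np

def integral_parameters(E, freqs, dirs_deg):
    """E has shape (n_freq, n_dir), in m^2/Hz/rad. Returns (Hs in m, peak freq in Hz, mean dir in deg)."""
    theta = np.deg2rad(dirs_deg)
    dtheta = 2.0 * np.pi / theta.size  # assumes evenly spaced directions over 360 degrees

    def integrate_f(y):
        # Trapezoidal integration over frequency.
        return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(freqs))

    E_f = E.sum(axis=1) * dtheta       # 1D frequency spectrum
    m0 = integrate_f(E_f)              # zeroth moment = total variance
    hs = 4.0 * np.sqrt(m0)
    fp = freqs[np.argmax(E_f)]
    sin_part = integrate_f((E * np.sin(theta)).sum(axis=1) * dtheta)
    cos_part = integrate_f((E * np.cos(theta)).sum(axis=1) * dtheta)
    mean_dir = np.rad2deg(np.arctan2(sin_part, cos_part)) % 360.0
    return hs, fp, mean_dir

# Synthetic spectrum: Gaussian in frequency around 0.1 Hz, cosine-power spread around 90 degrees.
freqs = np.linspace(0.03, 0.5, 95)
dirs_deg = np.arange(0.0, 360.0, 15.0)
S = 2.0 * np.exp(-0.5 * ((freqs - 0.1) / 0.02) ** 2)
D = np.cos(0.5 * np.deg2rad(dirs_deg - 90.0)) ** 8
D /= D.sum() * (2.0 * np.pi / dirs_deg.size)  # normalize so the spreading integrates to 1
E = S[:, None] * D[None, :]

hs, fp, mean_dir = integral_parameters(E, freqs, dirs_deg)
```

For this toy spectrum the result is a significant wave height of about 1.27 m, a peak frequency of 0.1 Hz, and a mean direction of 90 degrees. Three numbers per grid point are easy to map, but they discard exactly the multi-modal structure (swell plus wind sea) that the discussion above is trying to show.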