stcorp / harp

Data harmonization toolset for scientific earth observation data
http://stcorp.github.io/harp/doc/html/index.html
BSD 3-Clause "New" or "Revised" License

Add support for a, b hybrid sigma-pressure coefficients to derive pressure #305

Open StevenCompernolle opened 2 weeks ago

StevenCompernolle commented 2 weeks ago

Hybrid sigma-pressure coefficients, from which a pressure grid can be derived in combination with surface pressure, are in very common use, e.g. by ECMWF (https://confluence.ecmwf.int/display/UDOC/L137+model+level+definitions).

For several HARP products, this derivation is already implemented at the level of the specific ingestion (e.g., https://stcorp.github.io/harp/doc/html/ingestions/S5P_L2_HCHO.html). The request here is to support the derivation in the operations. I.e., if a product has surface_pressure {time}, hybrid_sigma_pressure_a {vertical}, hybrid_sigma_pressure_b {vertical}, it is possible to derive pressure {time,vertical}.
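For illustration, the requested derivation can be sketched in a few lines of plain Python. This is not HARP code: the function name `derive_pressure` and the sample values are hypothetical. It assumes the usual hybrid sigma-pressure relation, pressure = a + b * surface_pressure, with `a` in the same unit as the surface pressure and `b` dimensionless.

```python
def derive_pressure(surface_pressure, hybrid_a, hybrid_b):
    """Derive pressure {time,vertical} from surface_pressure {time}
    and hybrid sigma-pressure coefficients a {vertical}, b {vertical}.

    Hypothetical helper, not part of the HARP API.
    """
    if len(hybrid_a) != len(hybrid_b):
        raise ValueError("a and b must have the same vertical length")
    # pressure[t][k] = a[k] + b[k] * surface_pressure[t]
    return [
        [a + b * ps for a, b in zip(hybrid_a, hybrid_b)]
        for ps in surface_pressure
    ]


# Made-up example values: two time steps, two vertical levels (Pa).
pressure = derive_pressure(
    surface_pressure=[100000.0, 90000.0],
    hybrid_a=[0.0, 2000.0],
    hybrid_b=[1.0, 0.5],
)
print(pressure)
```

The result has one row per time step and one column per vertical level, which matches the {time,vertical} dimensions named in the request.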

Motivation: to be able to create HARP-compatible datasets, without having to store pressure {time,vertical}. Several colleagues asked me about this.

StevenCompernolle commented 2 weeks ago

There would also need to be support for hybrid_sigma_pressure_a_bounds {vertical,2} and hybrid_sigma_pressure_b_bounds {vertical,2}, as hybrid sigma-pressure coefficients can also be formulated this way. From these, a pressure_bounds variable would be derived.
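The bounds variant follows the same relation, applied to each layer edge. A minimal sketch, again with a hypothetical function name (`derive_pressure_bounds`) and made-up values, assuming a_bounds and b_bounds hold the coefficients at the two edges of every layer:

```python
def derive_pressure_bounds(surface_pressure, a_bounds, b_bounds):
    """Derive pressure bounds per time step and layer from
    surface_pressure {time} and a/b bounds {vertical,2}.

    Hypothetical helper, not part of the HARP API.
    """
    if len(a_bounds) != len(b_bounds):
        raise ValueError("a_bounds and b_bounds must have the same vertical length")
    # bounds[t][k][i] = a_bounds[k][i] + b_bounds[k][i] * surface_pressure[t]
    return [
        [
            [a + b * ps for a, b in zip(a_pair, b_pair)]
            for a_pair, b_pair in zip(a_bounds, b_bounds)
        ]
        for ps in surface_pressure
    ]


# Made-up example values: one time step, two layers sharing an edge (Pa).
bounds = derive_pressure_bounds(
    surface_pressure=[100000.0],
    a_bounds=[[0.0, 2000.0], [2000.0, 4000.0]],
    b_bounds=[[1.0, 0.5], [0.5, 0.0]],
)
print(bounds)
```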

Thanks Bavo for making the remark.

svniemeijer commented 2 weeks ago

The HARP conventions are primarily meant for the in-memory representation of the data. And it was a very conscious choice to remove all forms of encoding/compression. The data should be immediately usable to allow performing operations, without first having to pre-process the data further.

Having these variables supported in the in-memory representation is also not something we want, since they will be very hard to propagate in any operations that involve the vertical axis (regridding, etc.). In the end, they would almost immediately be thrown away, and you would have to use the full pressure profile for it to be usable in operations anyway.

The storage definitions in hdf/netcdf for the HARP data are also primarily meant to provide a persistent storage solution and to allow the chaining of tools to operate on the data. In this form, too, we want to completely remove any encoding/decoding. Intermediate tools should be able to just read/write the data without having to use the HARP software itself, and thus should not have to rely on the HARP software for any decoding operations.

Nevertheless, I do realise that the HARP conventions are now increasingly being considered as an actual storage/archive format. And for archival/distribution purposes, I understand the need for this kind of storage optimisation. However, this would not be something for the actual interface conventions themselves. If we were to introduce some kind of format that combines HARP conventions with specific compression/encoding techniques, then this would have to become its own kind of sub-standard, with a special importer in the HARP software (similar to all the other foreign-format importers) that would decode and remove all these compressed elements automatically as soon as they are read. Defining such a standard would take some consideration though. We would then need to look into further cases that should be supported. Introducing such a standard purely for the pressure profile compression would be a bit of overkill.

StevenCompernolle commented 2 weeks ago

Ok thanks for the feedback.