CMIP-Data-Request / Harmonised-Consultation-Phase-1

All public discussion related to CMIP7 Harmonised Data Request Phase 1 Consultation
https://wcrp-cmip.org/cmip7/cmip7-data-request/
Creative Commons Attribution 4.0 International

Data Request Variables - Standard Ocean Layers for CMIP7 #24

Open CMIP-Data-Request-coleads opened 4 days ago

CMIP-Data-Request-coleads commented 4 days ago

Themes:

Description

OMDP would like to specify a set of levels which are consistent with the literature (to keep these low dimensional). 

CMIP6 had one ocean layer dimension (`olayer100m`) and a range of depth coordinates at 0, 100, 300, 700 and 2000m. Initial proposal from Baylor is:

- 0-200m (for consistency with Gruber et al. https://doi.org/10.1038/s41586-021-03981-7 and Li et al. 2016)
- 0-300m (for consistency with bathythermograph records)
- 0-700m (for consistency with first-generation Argo records)
- 0-2000m (for consistency with latest-generation Argo records)
- Total Depth (for energy & thermosteric analysis, spin-up, etc.)

durack1 commented 4 days ago

It would be useful to map these levels to available observational ocean heat content (OHC) analyses. Will drop a July 2024 email thread blurb into the below

ping @baylorfk @adcroft and would add Helene and John if I could find github handles

From: Durack, Paul J.
Date: Thursday, July 25, 2024 at 9:39 AM
To: Helene Hewitt, baylor, john.dunne
Cc: Eleanor O'Rourke
Subject: Re: Spin-up WG Meeting 2

If we wanted to target the anomaly fields, these are a subset of the below. In WOA18 the 102 levels (0-5500 m, or 137 levels for 0-9000 m) are dropped to 26 levels, with the familiar granularity which gives us vertical subselection of 0-50, 0-100, 0-300 [MBT data cuts out here], 300-700 [XBT data cuts out here] and 700-2000, after which it stops because Argo data coverage drops out, and the 3-monthly coverage becomes far more temporally sparse (goes from 3-monthly averages to annual, or pentadal with salinity).

P

It’s a while since I have updated my obs data, so notes to self

WOA18: 26 levels over 0-2000 m, of which 16 levels cover 0-700 m

level = 0, 10, 20, 30, 50, 75, 100, 125, 150, 200, 250, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1750, 2000 ;
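As a quick consistency check on the level list above, the following sketch counts the levels down to 700 m and 2000 m and tests whether the lower edges of the ranges proposed at the top of the thread (200, 300, 700, 2000 m) fall on listed WOA18 depths. The helper name `levels_down_to` is purely illustrative.

```python
# WOA18 anomaly-field levels in metres, copied from the list above.
woa18_levels = [0, 10, 20, 30, 50, 75, 100, 125, 150, 200, 250, 300, 400,
                500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400,
                1500, 1750, 2000]


def levels_down_to(levels, max_depth):
    """Return the levels at or shallower than max_depth (inclusive)."""
    return [z for z in levels if z <= max_depth]


n_700 = len(levels_down_to(woa18_levels, 700))
n_2000 = len(levels_down_to(woa18_levels, 2000))

# Lower edges of the ranges proposed in the issue description.
proposed_edges = [200, 300, 700, 2000]
all_on_woa = all(edge in woa18_levels for edge in proposed_edges)

print(n_700, n_2000, all_on_woa)
```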

Temperature anomaly availability (here):
- 1955-2004: 0-700 m (16 levels; 3-month, yearly)
- 2004-present: 0-2000 m (26 levels; 3-month, yearly, pentadal)

Salinity anomaly availability (here):
- 2005-2023: 0-2000 m (26 levels; 3-month, yearly)
- 1955-2023: 0-2000 m (pentadal)

From: Durack, Paul J.
Date: Thursday, July 25, 2024 at 9:11 AM
To: Helene Hewitt, baylor, john.dunne
Cc: Eleanor O'Rourke
Subject: Re: Spin-up WG Meeting 2

Just so we’re not talking 2009, rather 2023/today, it would be useful to drop the WOA23 levels into this discussion. This moves from the WOA09 40 levels (0 to 9000 m here) [screenshot] to the WOA23 137 levels (0 to 9000 m here) [screenshot].

For most models the 137 will give us too many levels to choose from, which somewhat avoids the interpolation discussion. If having the above 137 levels in copy-able text would be useful, let me know and I’ll pull them out of an ncdump.

Baylor, this may be fodder for the Fox-Kemper et al OMIP paper…

P

durack1 commented 4 days ago

Just taking a look at the coordinates in the CMIP6-CMOR-Tables, we do have some atmospheric examples to build on: layer composites for 16 specified layers, alt16 (here) (also alt40), in addition to the specified-level outputs, e.g. the level selection plev19 (here).

Interestingly, we have a couple of already defined depth levels, depth0m (here), and what appears to be a defined depth100m (here), plus depth2000m, depth300m, depth700m which I don't believe were ever used. I also note what appear to be some inconsistencies in their values, so these would need a tweak if we planned to use them

pinging @taylor13 as there may be more backstory here

adcroft commented 4 days ago

> - 0-200m (for consistency with Gruber et al. https://doi.org/10.1038/s41586-021-03981-7 and Li et al. 2016)
> - 0-300m (for consistency with bathythermograph records)
> - 0-700m (for consistency with first-generation Argo records)
> - 0-2000m (for consistency with latest-generation Argo records)
> - Total Depth (for energy & thermosteric analysis, spin-up, etc.)

In addition to 0-700 and 0-2000, Zanna et al. (https://www.pnas.org/content/early/2019/01/04/1808838115) has "2000-below". We could calculate that last as a residual (e.g. "2000-below" = "total-depth" - "0-2000") but the value of diagnosing all the other overlapping ranges directly is it simplifies the down-stream use (fewer mistakes and inconsistencies). On the other hand, there's a simple pattern for the proposed intervals which is that they all start at the surface.
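A minimal sketch of the residual idea above, assuming the diagnostics are stored as surface-anchored (cumulative) integrals. The numbers are made up and in arbitrary units, and `range_content` is a hypothetical helper, not an existing tool.

```python
# Hypothetical cumulative content from the surface down to each depth
# (arbitrary units; values are invented purely for illustration).
cumulative = {200: 4.0, 300: 5.5, 700: 9.0, 2000: 14.0, "total": 17.0}


def range_content(cumulative, top, bottom):
    """Content between two depths as a difference of surface-anchored integrals."""
    upper = 0.0 if top == 0 else cumulative[top]
    return cumulative[bottom] - upper


# "2000-below" = "total-depth" - "0-2000"
below_2000 = range_content(cumulative, 2000, "total")
# Any other range with edges on the stored depths works the same way.
layer_300_700 = range_content(cumulative, 300, 700)

print(below_2000, layer_300_700)
```

Every derived range costs the downstream user one subtraction, which is where the risk of mistakes and inconsistencies mentioned above comes in.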

adcroft commented 4 days ago

Can we clarify: are there two or three suggestions in this thread? One to use some defined diagnostic ranges, another to have defined sample depths (not a vertical grid), and perhaps another to standardize the vertical grid?

I think the suggested ranges for averages, or content, diagnostics represent well the typical ranges found in analyses and additionally line up with products; each of the ranges suggested by @baylorfk ends on a WOA depth. Depth ranges are handily very specific and force us to implement a vertical average or integral (i.e. cell_methods="depth:mean" or "depth:sum").

The above is in contrast to a different notion of using a standardized vertical grid for 3D data (xyz). I'm not sure if that is being suggested in the discussion but if it is then a challenge with using the WOA depths is they are really only consistent with the point-wise interpretation (i.e. cell_methods="depth:point") which readily allows for vertical interpolation but is not conservative. If we want a standard vertical grid, my recommendation would be to use the WOA depths (ideally not too fine, meaning WOA'09) as the bounds for a common diagnostic vertical grid so that the cell centers would be midway between the listed depths. This allows conservative remapping from model-grid to a common diagnostic grid (happy modelers) that can then be compared to a linearly (vertically) interpolated WOA, or equivalent, data (happy-ish users).
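To illustrate the recommendation above, here is a rough sketch that treats a few listed depths as cell bounds, places cell centers midway, and remaps piecewise-constant model cells onto that grid with thickness weighting. The model grid and values are invented, the target bounds (0, 100, 300 m) are simply depths that appear in the WOA18 list quoted earlier in the thread, and the function names are hypothetical. A real implementation would depend on each model's numerics.

```python
def cell_centers(bounds):
    """Cell centers midway between successive bounds."""
    return [0.5 * (top + bot) for top, bot in zip(bounds[:-1], bounds[1:])]


def overlap(a_top, a_bot, b_top, b_bot):
    """Thickness of the intersection of two depth intervals (0 if disjoint)."""
    return max(0.0, min(a_bot, b_bot) - max(a_top, b_top))


def conservative_remap(src_bounds, src_values, dst_bounds):
    """Thickness-weighted mean of piecewise-constant source cells per target cell."""
    src_cells = list(zip(src_bounds[:-1], src_bounds[1:]))
    out = []
    for d_top, d_bot in zip(dst_bounds[:-1], dst_bounds[1:]):
        total = 0.0
        covered = 0.0
        for (s_top, s_bot), value in zip(src_cells, src_values):
            w = overlap(s_top, s_bot, d_top, d_bot)
            total += w * value
            covered += w
        out.append(total / covered if covered > 0 else float("nan"))
    return out


# Invented model grid (bounds in metres) and cell-mean values.
model_bounds = [0.0, 50.0, 150.0, 400.0]
model_values = [20.0, 15.0, 10.0]

# Diagnostic grid whose bounds are listed depths.
diag_bounds = [0.0, 100.0, 300.0]
diag_centers = cell_centers(diag_bounds)
diag_values = conservative_remap(model_bounds, model_values, diag_bounds)

# The depth integral over 0-300 m is preserved by the remapping.
src_integral = 50.0 * 20.0 + 100.0 * 15.0 + 150.0 * 10.0
dst_integral = sum(v * (b - a) for v, a, b in
                   zip(diag_values, diag_bounds[:-1], diag_bounds[1:]))
print(diag_centers, diag_values, src_integral, dst_integral)
```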

Finally, if we want some diagnostics at some specific depths (e.g. depth100), then we would need to ask modelers to ensure using interpolation when generating the diagnostic, and the cell_methods="depth:point", so that we are comparing like quantities. I think that some "surface" fields submitted to CMIP6 were actually vertical averages over the top-layer, or some depth, and those choices vary between models.
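The distinction between a point value and a top-layer average can be sketched as follows. The model centers and values are invented and `interp_to_depth` is a hypothetical helper; note also that the CF conventions spell the attribute with a space after the colon ("depth: point").

```python
def interp_to_depth(centers, values, target):
    """Linearly interpolate cell-center values to a single target depth."""
    for i in range(len(centers) - 1):
        z0, z1 = centers[i], centers[i + 1]
        if z0 <= target <= z1:
            frac = (target - z0) / (z1 - z0)
            return values[i] + frac * (values[i + 1] - values[i])
    raise ValueError("target depth lies outside the range of cell centers")


# Invented model column: cell centers (m) and cell-mean values.
model_centers = [5.0, 50.0, 150.0]
model_values = [20.0, 18.0, 14.0]

# Point-wise diagnostic at 100 m, to be labelled as a point value.
value_at_100m = interp_to_depth(model_centers, model_values, 100.0)
attrs = {"cell_methods": "depth: point"}

# By contrast, reporting the top cell as a "surface" value is really a
# vertical mean over that cell, and its thickness differs between models.
top_layer_mean = model_values[0]

print(value_at_100m, top_layer_mean, attrs)
```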

(Sorry if I over interpreted the discussion)

taylor13 commented 4 days ago

Alistair, I'm coming to this late, but it's not clear to me how defined sample depths can't define a vertical grid. I would think the values at these depths would be consistent with cell_methods="depth:point", similar to the observed values reported on the WOA depths.

I agree that for models you want to conserve when you go to standard levels, and that the model output would then have cell_methods="depth:mean", which, given the size of model errors, probably doesn't hurt you when you compare to observed "point" values.

Finally, I'm not sure CF will allow you to store an integrated quantity extending from the surface to various depths in a single variable. (I have raised a question about this here.) Wouldn't storing the integrated quantity between successive layers (e.g., 0-200, then 200-300, then 300-700, 700-2000, 2000-10000) allow you to easily calculate whatever vertical integral you are interested in? If you're interested in vertical means rather than integrals, you would likely want to weight by mass and you'd need to know the mass of each layer (which could be easily approximated with only small error except for the deepest layer, where you would have to determine the depth of the ocean floor).

If all of this has already been considered, please ignore (but I would be curious about exactly what problem is being addressed). Karl
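A small sketch of the successive-layer idea, with made-up per-layer integrals in arbitrary units and the deepest layer left out because its thickness depends on the ocean floor. Layer thickness is used here as a stand-in for layer mass, which is an approximation.

```python
from itertools import accumulate

# Layer edges in metres (deepest, floor-dependent layer omitted).
layer_bounds = [0, 200, 300, 700, 2000]
# Invented vertical integrals for 0-200, 200-300, 300-700, 700-2000.
layer_integrals = [4.0, 1.5, 3.5, 5.0]

# Running sums give the surface-anchored integrals: 0-200, 0-300, 0-700, 0-2000.
cumulative = list(accumulate(layer_integrals))

# Layer thicknesses, used as an approximate mass weight.
thickness = [bot - top for top, bot in zip(layer_bounds[:-1], layer_bounds[1:])]

# Vertical mean over 0-700 m: summed integrals divided by summed thickness.
mean_0_700 = sum(layer_integrals[:3]) / sum(thickness[:3])

print(cumulative, thickness, mean_0_700)
```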

adcroft commented 4 days ago

@taylor13 Yes, excellent points. Now I realize I wasn't thinking straight and was implicitly assuming we would have a different variable name per range - which is crazy.

> I'm coming to this late, but it's not clear to me how defined sample depths can't define a vertical grid. I would think the values at these depths would be consistent with cell_methods="depth:point", similar to the observed values reported on the WOA depths.

Indeed, I was forgetting we could just have a "coarse" vertical grid for the point-wise sample depths.

> Wouldn't storing the integrated quantity between successive layers (e.g., 0-200, then 200-300, then 300-700, 700-2000, 2000-10000) allow you to easily calculate whatever vertical integral you are interested in?

I fully agree! Storing successive non-overlapping layers would meet all our needs. It allows us to implement via a very coarse finite volume grid and use well proven tools, while the overlapping ranges do not naturally fit CF. I came at this looking at the OP's specific suggestion, but you've found a very elegant solution!

> If you're interested in vertical means rather than integrals, you would likely want to weight by mass and you'd need to know the mass of each layer (which could be easily approximated with only small error except for the deepest layer where you would have to determine the depth of the ocean floor).

Yes, and I believe the existing variable definitions and methods will suffice here too. All we need do is agree on a list of depths.

So in short, I misinterpreted what the OP was suggesting. Karl's reply makes it clear that we need only define a vertical grid.

martinjuckes commented 4 days ago

@durack1 Adding to your comment above, I have to apologise on behalf of the CMIP6 Data Request. depth100m was a depth level, and olayer100m was the layer from 0 to 100m. So far so good. Unfortunately, depth300m and depth700m are also layers, measured from the surface, rather than simple levels. They have been used for the thetaot variable (Depth Average Potential Temperature), which was fairly well used in CMIP6.

The name of the variable avoids any ambiguity, but the naming of the layers was confusing.

baylorfk commented 4 days ago

Hi All, Good discussion thus far. I think the remaining point on defining a grid just has one entry: add 200m in addition to the CMIP6 levels.

As to the vertical integrals, it doesn't seem to matter much if they are non-overlapping or cumulative; the point is just to be able to construct the typical ranges, which can be done either way by simple sums. There is a slight difference in data request size, as an opportunity might request only 0-2000m, for example (1 field), which is 4x smaller than 0-200+200-300+300-700+700-2000. This same fact argues that whichever way they are defined we probably don't want a large number of layers, unlike Paul's earlier email on WOA (although the layer boundaries should be WOA compatible). I think the cumulative layers would win slightly in the size category.
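The size trade-off can be made concrete by counting how many stored 2D fields are needed to build a given range under each scheme. This is only a sketch: the function `fields_needed` and the scheme labels are hypothetical, and the edges are the four sub-surface depths discussed in this thread.

```python
# Layer edges in metres, as discussed in the thread.
EDGES = [0, 200, 300, 700, 2000]


def fields_needed(top, bottom, scheme):
    """Number of stored fields needed to build the range top-bottom."""
    if scheme == "cumulative":
        # One surface-anchored field, or a difference of two.
        return 1 if top == 0 else 2
    if scheme == "layered":
        # Every non-overlapping layer between the two edges.
        return EDGES.index(bottom) - EDGES.index(top)
    raise ValueError("scheme must be 'cumulative' or 'layered'")


for top, bottom in [(0, 2000), (0, 700), (300, 700)]:
    print(top, bottom,
          fields_needed(top, bottom, "cumulative"),
          fields_needed(top, bottom, "layered"))
```

Surface-anchored ranges favour the cumulative definition, while interior ranges such as 300-700 m need one layered field but two cumulative ones.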

Finally, it should be made clear in the definitions that these are pressure levels, not geodesic levels, which makes it easier in z* models and not too hard in z (just add ETA). I think pressure is also going to be closer to observational practice using CTD measurements for depth. Interpolation is obviously going to be necessary, as varying vertical grids are unlikely to have these specific layers and levels. I think we want to leave the method of interpolation to the modeling centers, as it will vary based on numerics (e.g., 2nd order finite volume schemes would be different from 4th order finite difference for conservation preservation).

With these in hand, I think some new physical parameters should be added, e.g., heat and salt content, maybe also carbon content? Other suggestions?

So, should we invite the data request team and the OMDP for comments here? I'm mindful that deciding by Oct 4 sounds better based on Martin's request.

baylorfk commented 4 days ago

PS Helene suggested that we ask Pierre Mathiot in addition to OMDP and data request folks.

baylorfk commented 4 days ago

Sounds like olayer and depth naming needs disambiguation as well.

martinjuckes commented 3 days ago

@baylorfk: yes, that CMIP6 confusion does need to be fixed. If salinity on a small number of layers can reduce the need for storing data on model levels it will be a great saving.

There were 79 3d ocean fields requested in Omon, including both organic and inorganic carbon. Perhaps all these could be requested on layers (if they are being requested again) and the usage of 3d fields could be restricted to opportunities where there is a specific need for more detailed analysis?

baylorfk commented 3 days ago

Yes, I think that's what we're hoping. Quite a few of the 3d fields could be chopped into horizontal grid * 5 layers instead, for an ~10-fold size savings. In addition, we don't need to have the thkcello variables at high frequency if these are already calculated, which would be a big savings for Ocean Extremes where we wanted daily 0m and 200m (pressure-level) values. Cheers, -Baylor

taylor13 commented 3 days ago

I said above "Finally, I'm not sure CF will allow you to store an integrated quantity extending from the surface to various depths in a single variable." That question has now been answered here. It's o.k. for coordinate "cells" to overlap. CF would allow us, if we like, to save in a single variable, a vertically integrated quantity (e.g., heat content) extending from the surface of the ocean to various depths. We might, for example, define depth as the vertical coordinate and give each coordinate value bounds [0, depth], so that each cell covers the region of interest.

That being said, I would still favor non-overlapping cells which could be summed to produce a total over several layers.
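The two coordinate layouts discussed above can be sketched as bounds arrays. This only illustrates the cell geometry and is not a CF-compliant file writer; the depths are the ones proposed in this thread and the helper `cells_overlap` is hypothetical.

```python
# Lower edges of the proposed ranges, in metres.
depths = [200.0, 300.0, 700.0, 2000.0]

# Overlapping cells: every cell has bounds [0, depth], so one variable
# can hold the integral from the surface down to each depth.
overlapping_bounds = [[0.0, z] for z in depths]

# Non-overlapping alternative: each cell starts where the previous one ends,
# so cells can be summed to give a total over several layers.
tops = [0.0] + depths[:-1]
layered_bounds = [[top, bot] for top, bot in zip(tops, depths)]


def cells_overlap(bounds):
    """True if any cell starts above the bottom of the preceding cell."""
    return any(nxt[0] < cur[1] for cur, nxt in zip(bounds, bounds[1:]))


print(overlapping_bounds, cells_overlap(overlapping_bounds))
print(layered_bounds, cells_overlap(layered_bounds))
```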