WCRP-CMIP / CMIP6_CVs

Controlled Vocabularies (CVs) for use in CMIP6
Creative Commons Attribution 4.0 International

Frequency missing: monPt #413

Closed martinjuckes closed 7 years ago

martinjuckes commented 7 years ago

There are 4 variables in the request which are specified as monthly values with time: point: sistresave, sistremax, sishevel, sidivvel.

Can you add monPt to the frequency CV?

durack1 commented 7 years ago

@martinjuckes thanks for the tipoff. We've also got some yrC (yearly climatology) data for another satellite project (input4MIPs) so will discuss with @taylor13 to see if the inclusion of these two new frequencies can be fast-tracked.

@martinjuckes do you foresee any other required changes in the pipeline?

taylor13 commented 7 years ago

Before adding this frequency, we need confirmation that this is really what they want (and what their justification is). The four variables are:

sistresave -- average normal stress in sea ice (first stress invariant) -- SN = average_normal_stress
sistremax -- Maximum shear stress in sea ice (second stress invariant) -- SN = maximum_shear_stress
sishevel -- Maximum shear of sea-ice velocity field (second shear strain invariant) -- SN = maximum_shear_of_sea_ice_velocity
sidivvel -- Divergence of sea-ice velocity field (first shear strain invariant) -- SN = divergence_of_sea_ice_velocity

Only the last variable has an accepted standard name. The other standard names have not been accepted (and I think they are probably unacceptable -- shouldn't include words like "mean" and "maximum"; that's indicated by cell_methods.)

The only variables I can think of that might be sampled as point variables on monthly and longer time-scales would be measures used to monitor reservoir amounts (e.g., total soil moisture).

If you tell me who requested these variables, I will contact them and ask them to justify "point". thanks, Karl

martinjuckes commented 7 years ago

@taylor13 These requests come from SIMIP, Dirk Notz [dirk.notz@mpimet.mpg.de]

@durack1 I've scanned all the variables, and monPt is the only thing missing. I'm intrigued by yrC though: how does an annual climatology differ from a multi-year mean?

durack1 commented 7 years ago

@martinjuckes the yrC will be as follows, with the other entries grabbed from the current WCRP-CMIP/CMIP6_CVs/CMIP6_frequency.json:

...
"dec":"decadal mean samples",
...
"mon":"monthly mean samples",
"monC":"monthly climatology computed from monthly mean samples",
...
"yr":"annual mean samples",
"yrC":"annual climatology computed from annual mean samples",

taylor13 commented 7 years ago

@martinjuckes: to answer your specific question of how a multi-year time-mean differs from yrC, it doesn't! I think the multi-year time mean is needed for input4MIPs. I can think of two options:

define frequency = "mean" and in the file names the time-label would extend from the first year to the last year (e.g., for a 10-year mean from 2000 through 2009 we would have "2000-2009").

define frequency = "yrC" and in the file names the time-label would be as above followed by "-clim" (e.g., "2000-2009-clim") for consistency with other climatological fields.

I lean toward the first option because it is more general (e.g., it could be used for means extending over fractions of years, with the time-label adjusted to provide more precision), but it seems a bit odd to characterize the "frequency" as "mean".
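
As a rough illustration of how the time-label would differ between the two options, here is a minimal sketch. The function name `time_label` and its arguments are hypothetical and are not part of any CMIP6 tool; only the label formats come from the examples above.

```python
def time_label(first_year, last_year, climatology=False):
    """Build the file-name time-label for a multi-year mean.

    Hypothetical helper: option 1 (frequency = "mean") uses the bare
    year range; option 2 (frequency = "yrC") appends "-clim".
    """
    label = f"{first_year:04d}-{last_year:04d}"
    if climatology:
        label += "-clim"
    return label


# Option 1: frequency = "mean"
print(time_label(2000, 2009))
# Option 2: frequency = "yrC"
print(time_label(2000, 2009, climatology=True))
```

For the 10-year mean from 2000 through 2009 this gives "2000-2009" under the first option and "2000-2009-clim" under the second.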

durack1 commented 7 years ago

@taylor13 @martinjuckes just to note we already have in the WCRP-CMIP/CMIP6_CVs/CMIP6_frequency.json:

"dec":"decadal mean samples",

registered, which is in effect a yrC for a 10-year period.

taylor13 commented 7 years ago

Denis tells me that PrePARE cannot yet check that the frequency recorded in the global attribute is consistent with the sample interval in a file. It only checks that the frequency is found in CMIP6_frequency.json. If we add "yrC" to the CMIP6 frequency CV, then someone could enter it and pass PrePARE's check. Thus, bad files would get published. Also, I think having it appear in the CMIP6 CV might confuse modeling centers, who wouldn't be able to find that frequency in the data request. I therefore think we should not add "yrC" at this time.
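
To make the gap concrete, here is a minimal sketch contrasting a membership-only check with a stricter consistency check. This is not PrePARE's code: the function names, the `NOMINAL_DAYS` table and its day ranges are assumptions made for illustration, and the CV excerpt contains only the entries quoted in this thread.

```python
# Entries quoted in this thread; the real CMIP6_frequency.json holds more.
FREQUENCY_CV = {
    "dec": "decadal mean samples",
    "mon": "monthly mean samples",
    "monC": "monthly climatology computed from monthly mean samples",
    "yr": "annual mean samples",
}

# Assumed range of sample spacing in days for each frequency (illustrative).
NOMINAL_DAYS = {"dec": (3650, 3660), "mon": (28, 31), "yr": (365, 366)}


def in_cv(frequency):
    """Membership-only check: is the frequency a key of the CV?"""
    return frequency in FREQUENCY_CV


def consistent_with_interval(frequency, time_days):
    """Hypothetical stricter check: every gap between consecutive time
    values (in days) must fall inside the assumed range for the frequency."""
    bounds = NOMINAL_DAYS.get(frequency)
    if bounds is None or len(time_days) < 2:
        return False
    low, high = bounds
    return all(low <= b - a <= high for a, b in zip(time_days, time_days[1:]))


# Daily data mislabelled as "mon" passes the membership check only.
daily_times = [0.5, 1.5, 2.5, 3.5]
print(in_cv("mon"), consistent_with_interval("mon", daily_times))
```

With only the first kind of check in place, any string added to the CV becomes an accepted value regardless of what the file actually contains.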

taylor13 commented 7 years ago

"dec" is something different from my definition of "mean" because several samples may be included (one for each decade), whereas "mean" is by definition a single sample. You're right that "dec" reported for a single decade would be the same as the "mean" over that decade.

martinjuckes commented 7 years ago

But in CMIP data a monthly mean is a sequence of monthly values, so I would expect a decadal mean to be a sequence of decadal values. Incidentally, I've just checked (because we are discussing climatologies) and realised there is a misunderstanding over the definition of the mean diurnal cycle in the CMIP6 data request, which is encoded as a climatology from a CF perspective -- I've raised an issue (#414). This is also a bit odd, but I think that it is clear that it has to be a climatology from a CF perspective, even though it is going to be a mean diurnal cycle to most users. I guess the same is true with the yrC case: the users are going to call it a climatology, in the CF file metadata it will be a simple time mean, and you have to make a choice for the CV.

taylor13 commented 7 years ago

So do all our current "climatology" files include the "climatology" attribute in the netCDF files (and no other files include it)? If so, perhaps we should continue that "convention" and not call the time-mean a climatology since it won't have that attribute. Also, we wouldn't include the "-clim" suffix in the file name on these files. Is "mean" the best alternative?

taylor13 commented 7 years ago

For the record, here is the justification for monPt (in reverse chronological order):

Dear Karl,

Yes these are the four variables that are needed in order to compare with obs and to check how sea ice models are being run in GCMs.

Thanks for doing that

Bruno

On Thu, Sep 28, 2017 at 5:01 PM, Karl Taylor wrote:

Dear Bruno,

Thanks, this is indeed helpful.

The 4 variables we are aware of that should be sampled once each month are:

sistresave -- average normal stress in sea ice (first stress invariant)
sistremax -- Maximum shear stress in sea ice (second stress invariant)
sishevel -- Maximum shear of sea-ice velocity field (second shear strain invariant)
sidivvel -- Divergence of sea-ice velocity field (first shear strain invariant)

Is that what you need to compare with obs?

I think the velocity and thickness are being requested as monthly-means and also as daily means.

Although the modeling groups may have this data available as snapshots, they will have to write special coding to collect it and archive it as monthly "point" samples.  That's what will be extra work for them (compared to monthly means, say).

We'll leave the 4 variables above in the data request, sampled as snapshots at the beginning of each month (which I think will be easier for them compared to any other time during the month).

best regards,
Karl

On 9/28/17 11:57 AM, Bruno Tremblay wrote:
Dear Karl and Dirk,

The reason for requesting the stress invariants (maximum and mean stress at a point) was to check first if the momentum equation is solved accurately (i.e. are the stress invariants lying on the yield curve or not); to determine the fraction of the grid points that are deforming plastically versus elastically; and to compare the PDF of strain rate invariants with those derived from radarsat at 12.5km resolution (RGPS - we were also asking for the strain rate invariants in addition to the stress invariants). All this information will allow people to get a lay of the land in terms of how sea ice models are used and how they compare with available high resolution satellite observations - in addition to comparing GCM sea ice drifts with those of buoys for instance.

If a group is not prepared to output snapshots, then I would suggest not outputting anything at all. I can't think of an analysis that I would do using monthly mean values except to compare the monthly mean values from one model with another, but we would have nothing to compare against in the obs.

Regarding the 4 required variables (I am assuming here that you are referring to the stress and strain rate invariants?): these 4 variables (together with ice velocity and thickness at different time steps) can be used to calculate most of the terms in a power (input and dissipated) budget of sea ice. I am not sure we would do this, simply because if we wanted to do such a budget we would request snapshots of the same data for an entire month (say, one in winter and one in summer at least).

Regarding the concern that outputting snapshots of model variables is tedious: I was under the impression that GCM modeling groups routinely save snapshots in case the model crashes and needs to be restarted during the run, or even for the scheduled termination of a long simulation that is broken into many segments. So I would not have thought that this was an issue (maybe I am mistaken here).

I hope this helps. If you have more questions, please do not hesitate to contact me.

Bruno 

On Thu, Sep 28, 2017 at 2:15 PM, Karl Taylor wrote:

    thanks again for your quick response.  I'll wait to hear from Bruno.

    I would like to eliminate this sampling strategy (of point values once per month), since no other variables are collected in that way and modeling groups will have to develop special coding just for the 4 variables.  If the point values are essential, of course, then modeling groups should go to the trouble, but if there are ways to estimate the budget sufficiently accurately relying on monthly means, that would be much preferred.

    best,
    Karl

    On 9/28/17 10:55 AM, Dirk Notz wrote:

        Bruno, can you briefly explain what sea-ice dynamics folks can do with
        these instantaneous values? My understanding of this particular topic is
        probably not sufficient to provide a meaningful answer.

        Thanks a lot,

          Dirk

        ----
        Dr. Dirk Notz

        Am 28.09.2017 um 19:47 schrieb Karl Taylor:

            Hi Dirk,

            This information is very helpful.  Thanks!  (I had thought mean and max
            referred to the processing of time samples.)

            I would like to probe a bit more about the "point" monthly samples
            needed for the 4 variables.  I know for exact budget studies you
            normally need instantaneous values, but I'm surprised you only need 4
            such values to do the momentum budgets for sea ice.  Do these 4 terms
            sum to the rate of change of the sea ice momentum?

            thanks again,
            Karl

            On 9/28/17 10:10 AM, Dirk Notz wrote:

                Dear Karl,

                note that I copy in Bruno Tremblay who can comment much more informed
                than I ever could on your first two questions. Bruno, please add or
                correct anything in my explanation as you feel necessary:

                    sistresave -- average normal stress in sea ice (first stress invariant)
                    sistremax -- Maximum shear stress in sea ice (second stress invariant)
                    sishevel -- Maximum shear of sea-ice velocity field (second shear strain
                    invariant)
                    sidivvel -- Divergence of sea-ice velocity field (first shear strain
                    invariant)

                    1)  Should sistresave and sidivvel be averaged over the month? If not,
                    how are they sampled and why wouldn't a time-mean be appropriate? 

                All four variables should be reported as instantaneous values at the
                same point in time, at any time of the month. Only such instantaneous
                values do allow us to understand the momentum budget described by these
                terms.

                    2)  Should sistremax and sishevel report the maximum values from data
                    sampled every time-step?  If not, how are they determined? 

                The term 'Maximum' only describes what this variable captures within the
                respective shear or stress tensor. It is not to be understood as a
                'maximum' value in any temporal or spatial meaning of the CF convention.

                    3)  I think only the last variable has an accepted standard name. The
                    other standard names have not been accepted (and I think they are
                    probably unacceptable -- the standard name shouldn't include words like
                    "mean" and "maximum" because that is normally indicated by
                    cell_methods.)   Have you checked which variables have accepted standard
                    names and which do not?  If so, please let us know which ones do not. 

                It must have slipped my attention that we did not request standard names
                for these three variables. As far as I can tell, they are the only ones
                from our request that we did not propose to the list yet. Should we do
                so now? Note that the maximum and average in this case is not captured
                by the cell methods, as these terms simply describe what this variable
                captures within the tensor.

                I hope this helps for now?

                best,

                   Dirk

taylor13 commented 7 years ago

@durack1 So, apparently we definitely need to add monPt to the frequency CV. We might need to revise the description of the 1hrCM frequency too and perhaps either "mean" or "yrC", so let's wait and see if we can work those out in the next couple of days.

durack1 commented 7 years ago

@taylor13 with the 01.00.16 release @martinjuckes has implemented the monPt frequency, so we'll need to get this updated tomorrow to make sure things are synchronized across the projects

taylor13 commented 7 years ago

Yes, please add monPt to the CV.

durack1 commented 7 years ago

@taylor13 already done (#415); take a peek at WCRP-CMIP/CMIP6_CVs/CMIP6_frequency.json. We still need to resolve the yrC question above.

martinjuckes commented 7 years ago

On the question of encoding, I believe that the climatology attribute is present for the monthly climatologies and the mean diurnal cycle. The file names and attributes document (https://drive.google.com/file/d/0B-X2uY_FGt7XTDhtWnRpNmhpUnc/view) explicitly states that the -clim suffix on a file name should be used when, and only when, there is a climatology attribute on the time axis: so, by that rule, it will be present for mean diurnal cycles and absent for the decadal means.
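
The "when, and only when" rule can be expressed as a small check. This is only a sketch under stated assumptions: the time-axis attributes are passed in as a plain dict, and the function name and the file-name stems in the example are hypothetical.

```python
def clim_suffix_consistent(filename_stem, time_attrs):
    """Return True when the "-clim" suffix is present if and only if the
    time axis carries a "climatology" attribute.

    filename_stem: file name without its extension (hypothetical input).
    time_attrs: dict of attributes on the time coordinate variable.
    """
    has_suffix = filename_stem.endswith("-clim")
    has_attribute = "climatology" in time_attrs
    return has_suffix == has_attribute


# Hypothetical stems, for illustration only.
print(clim_suffix_consistent("example_2000-2009-clim", {"climatology": "climatology_bnds"}))
print(clim_suffix_consistent("example_2000-2009", {"bounds": "time_bnds"}))
print(clim_suffix_consistent("example_2000-2009-clim", {"bounds": "time_bnds"}))
```

The first two cases satisfy the rule; the third (suffix without the attribute) does not.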

taylor13 commented 7 years ago

@martinjuckes Yes, I agree that the -clim suffix should only be used when the “climatology” attribute is needed.

We should note that a careful calculation of an annual mean climatology would differ from a straight mean if some monthly samples were missing in the record (e.g., at the beginning of the first year or the end of the last year, or even some months in the middle). To get a representative annual climatological value, you would want to weight each month of the year equally, so you would first calculate monthly climatologies, and then form the mean of the 12 months (weighting each month by the number of days in the month). This would in general not be the same as forming a mean over all the monthly values in the time-series (even if each month is weighted in the same way). For example, if most winter month temperature data were missing (but at least one sample were available for each calendar month), the straight time mean would yield a higher temperature than a climatological annual mean.

In the case of model output, we can insist that the mean be computed over complete years of data (say starting in January of the first year and ending in December of the last year, with no months missing), so the time mean and climatological annual mean will be identical.
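
A small worked example may help show the difference between the two calculations. It is a sketch only: the function names are hypothetical, a no-leap calendar is assumed, and the temperatures are synthetic (0 in December through February, 10 otherwise).

```python
from collections import defaultdict

# Days per calendar month, assuming a no-leap calendar.
DAYS = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def straight_mean(samples):
    """Day-weighted mean over all (month, value) samples in the record."""
    total = sum(DAYS[m - 1] * v for m, v in samples)
    weight = sum(DAYS[m - 1] for m, _ in samples)
    return total / weight


def climatological_annual_mean(samples):
    """First average each calendar month, then form the day-weighted
    mean of the 12 monthly climatologies."""
    by_month = defaultdict(list)
    for m, v in samples:
        by_month[m].append(v)
    if len(by_month) != 12:
        raise ValueError("need at least one sample for every calendar month")
    clim = {m: sum(vs) / len(vs) for m, vs in by_month.items()}
    return sum(DAYS[m - 1] * clim[m] for m in range(1, 13)) / sum(DAYS)


# Synthetic seasonal cycle: cold winter months, warm otherwise.
WINTER = (12, 1, 2)
one_year = [(m, 0.0 if m in WINTER else 10.0) for m in range(1, 13)]

complete = one_year * 2  # two complete years
gappy = one_year + [s for s in one_year if s[0] not in WINTER]  # winter missing in year 2

print(straight_mean(complete), climatological_annual_mean(complete))
print(straight_mean(gappy), climatological_annual_mean(gappy))
```

With two complete years the two results agree; with the second winter missing, the straight mean comes out warmer than the climatological annual mean, which is unchanged.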

For input4MIPs, I therefore recommend that we simply ask for a time-mean, and label the frequency “mean” (not “yrC”).

durack1 commented 7 years ago

The yrC is not required, so this issue can be closed.