Closed roy-lowry closed 4 years ago
@roy-lowry I'm happy that we remove the bit about 'typically in excess of 10 years'. My text was really just to help refocus the discussion rather than provide a formal definition but it's good to bring all these ideas together. I'm also happy with the other improvements you have proposed.
@larsbarring I agree entirely, this should be applicable to both observations and model data, so I'm in agreement with your latest version of the definition.
One very trivial thing (a consequence of cut-and-paste, I think) - in the first sentence 'have' should be 'has'.
One final thought - The current text doesn't require that bounds exist, yet the whole point of this standard name is to define a period of time. In an earlier draft @davidhassell suggested the following sentence:
"If the variable has no bounds, a plain-language description of the epoch (e.g. 1960-1978) may be included in the long_name attribute, if appropriate."
Should we reinstate this?
@DanHollis I would have no objection to that.
"If the variable has no bounds, a plain-language description of the epoch (e.g. 1960-1978) may be included in the long_name attribute, if appropriate."
Hi @DanHollis, I don't agree, I'm afraid. If you know the epoch well enough to write it down in plain text, then you should use bounds.
@DanHollis, @roy-lowry, @davidhassell If the reference epoch is a period of time, and not a point in time, it is better to have the start and end points of the period properly encoded as bounds. If for some reason this is not appropriate or possible, data writers are, as always, free to use the long name as they see fit to guide data users. I think there is no need to specifically invite such a "back-door solution" in the definition without a concrete use case, though there might well be one that I am not aware of. But does this mean that the definition should encourage, or even mandate, the use of bounds if it is a period and not a point in time?
@DanHollis Seeing the reactions I suggest we drop the idea.
That's fine - I'm happy to leave it out, and I agree that inviting alternative approaches (even within the context of a single standard name) is not a good idea. The only reason I mentioned it was because the current wording says "If [it] has bounds...", implying that it might not have bounds (which didn't quite fit with it being a "period of time"). Encouraging or mandating bounds is a possibility but I can't think of a simple way to tweak the text. I guess it should be obvious to users that if they want to define a period of time then they have to provide bounds (given this is how it's done for existing time-related coordinates). If they genuinely want to specify a single point in time (without bounds) because their baseline is a single data point then that remains a possibility. So, I'm happy we stick with the wording we have.
@davidhassell I've just realised that the sentence about including "a plain-language description" originated in the wording at the start of this thread. Apologies if I implied that it was you that proposed it (given you're clearly not keen on it!).
Hello all,
Wow, what a discussion! I have been following this the entire time and admittedly getting lost for most of it! But I have learnt things and that's the important bit. Thank you to everyone for your perseverance, suggestions and solid effort. It looks like, maybe, we have come to some agreement here. The latest version I have is this:
Term: reference_epoch
Definition: The period of time over which a dataset representing a parameter has been summarised (usually by averaging) in order to provide a baseline against which other data representing the same, or other parameters may be compared. If a coordinate, scalar coordinate, or auxiliary coordinate variable with this standard name has bounds, then the bounds specify the beginning and end of the time period over which the data aggregation was determined. It is not the time for which the actual measurements are valid; the standard name of time should be used for that.
Units: s
Do let me know if the above is not the most recent version! Are there any further comments or anyone who would like to continue the discussion? I purposely haven't commented until now as I do not want to halt ongoing discussions. If they do need to continue then please carry on. If not, please let me know that we have reached an agreement (or simply like this comment if you prefer). I do like this definition I have to say.
Thanks all,
Fran
@feggleton I like this, and I can certainly live with it! So, I do not want to ruin what seems like good convergence of thoughts or to prolong the discussion. At the same time, and given that there seems to be some support for keeping the option open to have a point in time as the reference epoch, I think that it is helpful to be explicit about what the lack of bounds implies. My additions in italics:
Term: reference_epoch
Definition: The period of time over which a dataset representing a parameter has been summarised (usually by averaging) in order to provide a baseline against which other data representing the same, or other parameters may be compared. If a coordinate, scalar coordinate, or auxiliary coordinate variable with this standard name has bounds, then the bounds specify the beginning and end of the time period over which the data aggregation was determined. If bounds are missing it effectively means that the reference epoch is a point in time. The reference epoch is not the time for which the actual measurements are valid; the standard name of time should be used for that.
Units: s
A justification for having a point in time as a reference epoch might be a situation where a dataset consists of multiple measurements, or model simulations, targeting a specific point in time, and this dataset is used to create the reference data (datum). But I readily admit that I do not know of a concrete use case for this.
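As a rough illustration of the distinction drawn above, the following Python sketch treats a reference epoch as a period when bounds are present and as an instant when they are absent. The function name `interpret_reference_epoch`, the tuple-based return value, and the "seconds since 1970-01-01" reference time are all assumptions made for this example; none of them comes from the CF conventions or from any library.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical reference time for this sketch ("seconds since 1970-01-01").
UNITS_ORIGIN = datetime(1970, 1, 1, tzinfo=timezone.utc)


def interpret_reference_epoch(value, bounds=None):
    """Return ("period", start, end) if bounds exist, else ("instant", t, t).

    value  -- coordinate value in seconds since UNITS_ORIGIN
    bounds -- optional (start, end) pair in the same units
    """
    def to_datetime(seconds):
        return UNITS_ORIGIN + timedelta(seconds=seconds)

    if bounds is None:
        # No bounds: the reference epoch is effectively a point in time.
        instant = to_datetime(value)
        return ("instant", instant, instant)

    start, end = bounds
    if end < start:
        raise ValueError("bounds must be ordered (start <= end)")
    # Bounds present: they mark the beginning and end of the period.
    return ("period", to_datetime(start), to_datetime(end))
```

A reader of a file could use a check of this kind to decide whether a baseline was averaged over an interval or taken at a single time.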
@DanHollis Not at all! I think I suggested that long_name when I was still confusing the reference itself with the reference epoch...
In a similar way, I think that the use of the terms "data" and "dataset" in the description is a bit confusing. It's also good to include the words from the name in the description. I might suggest:
Term: reference_epoch
Definition: The period of time over which a parameter has been summarised (usually by averaging) in order to provide a reference (baseline) against which data has been compared. When a coordinate, scalar coordinate, or auxiliary coordinate variable with this standard name has bounds, then the bounds specify the beginning and end of the time period over which the reference was determined. If the reference represents an instant in time, rather than a period, then bounds may be omitted. It is not the time for which the actual measurements are valid; the standard name of time should be used for that.
Units: s
I've lost track if the tidal-specific case is still on the table, I'm afraid, but it's worth noting that in the tidal case, the standard name of the data (tidal_sea_surface_height_above_mean_higher_high_water) tells us that there is a baseline, so we can interpret the values sensibly. In general, such standard names do not exist, but some do. For example, there are only 8 standard names for climate anomalies:
air_pressure_anomaly | Pa | | 26
air_temperature_anomaly | K | | 25
brightness_temperature_anomaly | K | |
geopotential_height_anomaly | m | | 27
ratio_of_sea_water_potential_temperature_anomaly_to_relaxation_timescale | K s-1 | |
ratio_of_sea_water_practical_salinity_anomaly_to_relaxation_timescale | s-1 | |
sea_water_temperature_anomaly | K | |
surface_temperature_anomaly | K | |
If we go for the general case, or in any event, should we also be saying a reference epoch coordinate should only be used in association with data that represents a comparison with a reference (of any type - not necessarily a climatological one)?
@ethanrd I'm interested in the grid mapping idea, but my first thought is that a grid mapping is in fact not the right place to store the reference ellipsoid (say) after all, because the grid mapping is only for datums of the horizontal and vertical coordinates, not the data, such as sea_surface_height_above_reference_ellipsoid. I would like to talk about this more (I'm not sure I understand it all yet!), but as @roy-lowry says, perhaps on another issue?
Thanks, David
@davidhassell This wording of the definition is better!
I'm still happy, if a little exhausted. I'm not supposed to be working, ;-)
@davidhassell Yes, I agree. The tidal datum and grid mapping discussion should continue in a new issue (or perhaps back in issue cf-convention/vocabularies#74?) and not hold up this reference epoch discussion.
should we also be saying a reference epoch coordinate should only be used in association with data that represents a comparison with a reference (of any type - not necessarily a climatological one)?
This seems well stated in the first sentence of the definition (“provide a reference against which data has been compared”). Does it need to be more explicit?
I was also wondering ... Does the reference_epoch definition need to explain how a variable that represents a comparison (from a reference value) would specify the reference epoch? Or does that need to be done in each of the anomaly standard names?
I'm happy with the latest version proposed by @davidhassell - I think it's a definite improvement on what we had yesterday.
Regarding how this standard name will be used, my expectation was that each anomaly standard name would include a sentence explaining this. I think it's clear from the description that it is intended to be used as a coordinate variable, and I also think it's clear that it relates to anomaly data, so I don't think we need to add anything more (but maybe we should ask someone who hasn't been party to these discussions...).
Although not raised so far in this discussion, I have assumed that a dataset which contained values of the reference itself (e.g. 30-year temperature normals or 19-year averages of lower low water) would have a coordinate with standard name time and climatological bounds. Is that correct?
Hi all,
@ethanrd I agree that no more needs to be said about only using with data that has been compared.
@DanHollis - I agree that each anomaly standard name should have some text describing how to define the epoch of the reference; and you're right about the reference data itself having the normal time coordinate.
@JonathanGregory is this latest OK with you (i.e. not being tidal-specific)?
Thanks, David
Dear @davidhassell
Yes, I'm happy with this proposal and wording. Thanks for bringing it together. Clearly the consensus is for a generic name, and I accept that, if it's going to be used for all cases. That is, the tidal datum reference epoch will be a reference_epoch - correct? I'm glad to avoid the word "datum", for sure!
Following on my earlier remark to @ethanrd but please read with caution because I am not an expert and am ready to be corrected! I believe that a tidal datum like "mean lower low water" would not be used in geodesy, because it's not a precisely defined surface like a reference ellipsoid. It's something you have to measure or calculate with a tidal model, in either case with some uncertainty. UK maps give altitude as "above mean sea level at Newlyn". I believe that this means the vertical distance above a reference ellipsoid which approximates a geopotential surface which passes through a level fixed to the land that was chosen to indicate mean sea level at Newlyn. Since the actual mean sea level does not follow a geopotential surface, and deviates even more from a reference ellipsoid, it is not "mean sea level" everywhere, while "mean sea level" is a "tidal datum" everywhere. This is why I think "tidal datum" and "chart datum" (for navigation) are not the same sort of geodetic datum which we describe with grid_mapping.
Best wishes
Jonathan
Hi all,
Thank you so much for getting this agreed and sorted! Much appreciated. So, it looks like we have agreed on the below:
Term: reference_epoch
Definition: The period of time over which a parameter has been summarised (usually by averaging) in order to provide a reference (baseline) against which data has been compared. When a coordinate, scalar coordinate, or auxiliary coordinate variable with this standard name has bounds, then the bounds specify the beginning and end of the time period over which the reference was determined. If the reference represents an instant in time, rather than a period, then bounds may be omitted. It is not the time for which the actual measurements are valid; the standard name of time should be used for that.
Units: s
Then it looks like a separate discussion is needed about tidal datum and grid mapping - this should probably be a new issue unless it affects cf-convention/vocabularies#74, in which case please continue there. Otherwise, I think reference_epoch is ready to be accepted. Please let me know if the above is not the most recent agreement or if there are any further comments.
Thanks all
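To make the agreed definition more concrete, here is a minimal Python sketch of the metadata an anomaly variable and its scalar reference_epoch coordinate might carry. It only builds plain dictionaries and does not write a NetCDF file. The variable names `tas_anom` and `reference_epoch_bnds`, the "seconds since 1970-01-01" reference time, the 1961-1990 baseline, and the choice of the midpoint as the coordinate value are all assumptions for illustration, not something decided in this thread.

```python
from datetime import datetime, timezone

# Hypothetical reference time for the time units used in this sketch.
ORIGIN = datetime(1970, 1, 1, tzinfo=timezone.utc)


def seconds_since_origin(year, month=1, day=1):
    """Seconds from ORIGIN to midnight UTC on the given date."""
    moment = datetime(year, month, day, tzinfo=timezone.utc)
    return int((moment - ORIGIN).total_seconds())


# Illustrative baseline period (an assumption): 1961-01-01 to 1991-01-01.
start = seconds_since_origin(1961)
end = seconds_since_origin(1991)

variables = {
    # Data variable holding anomalies relative to the baseline.
    "tas_anom": {
        "standard_name": "air_temperature_anomaly",
        "units": "K",
        "coordinates": "reference_epoch",  # points at the scalar coordinate
    },
    # Scalar coordinate: its bounds give the period of the reference.
    "reference_epoch": {
        "standard_name": "reference_epoch",
        "units": "seconds since 1970-01-01 00:00:00",
        "bounds": "reference_epoch_bnds",
        "value": (start + end) // 2,  # midpoint chosen for this sketch only
    },
    "reference_epoch_bnds": {"value": (start, end)},
}
```

If the reference were an instant rather than a period, the `bounds` attribute and the bounds variable would simply be left out.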
I think a new ticket is required as ticket cf-convention/vocabularies#74 only covers a subset of tidal datums.
Hi all - I started a new issue (#79) for further discussion of tidal datum and grid mapping. I tried to summarize the discussion so far. Hopefully it's not too far off.
@davidhassell - I didn't include your point about grid mapping being for datums of coordinates as opposed to for data because I don't understand it. I was a bit confused by your reference to sea_surface_height_above_reference_ellipsoid because this sentence in its definition
"To specify which reference ellipsoid is being used, a grid_mapping variable should be attached to the data variable as described in Chapter 5.6 of the CF Convention."
seems the perfect example of a vertical datum for data being specified in a grid mapping, which is what I thought you were arguing against. Could you add a comment in the new issue to raise and clarify this point?
Just to follow up on the previous list of current standard names for which the reference_epoch is relevant:
air_pressure_anomaly | Pa | | 26
air_temperature_anomaly | K | | 25
brightness_temperature_anomaly | K | |
geopotential_height_anomaly | m | | 27
ratio_of_sea_water_potential_temperature_anomaly_to_relaxation_timescale | K s-1 | |
ratio_of_sea_water_practical_salinity_anomaly_to_relaxation_timescale | s-1 | |
sea_water_temperature_anomaly | K | |
surface_temperature_anomaly | K | |
here are three more that specifically mention the word climatology in the definition
change_over_time_in_sea_water_absolute_salinity
isccp_cloud_area_fraction
sea_water_absolute_salinity
@davidhassell were you suggesting to add a sentence encouraging the use of `reference_epoch` in the definition of these standard names?
Hi @larsbarring,
were you suggesting to add a sentence encouraging the use of `reference_epoch` in the definition of these standard names?
I think so, in the same way that the tidal standard names have done. Following that lead, we could perhaps add to those standard names that you identify:
"To specify the reference (baseline) epoch to which the quantity applies, provide a scalar coordinate variable with standard name reference_epoch."
(caveat - I'd like to reiterate @ethanrd's observation that we should remember to pay attention to cf-convention/discuss#79 in case that changes things!)
@davidhassell I think that the sentence you suggest adding captures it all. Regarding possible links to cf-convention/discuss#79, I think that @japamment is spot on in her comment under cf-convention/vocabularies#74:
Thanks for drawing attention to issue cf-convention/discuss#79. If I understand correctly, that issue is regarding the vertical datum for the sea_surface_height whereas cf-convention/vocabularies#188 was discussing the time period over which the value of the vertical datum is established.
The reference_epoch term has been added into v76 of the standard name table update.
I have added issue cf-convention/vocabularies#27 to add the epoch sentence to the names listed above. I will close this issue now if that's ok.
Proposer's name: Roy Lowry
Date: 8 August 2020
During the discussion of tidal sea surface Standard Names (#57) the point was made that different averaging intervals were used for the determination of reference levels used in different data sets and that these averaging intervals (epochs) should be recorded within the CF NetCDF file.
Whilst the epoch could be included as scalar variables with bounds, labelled using the long_name attribute, it is felt that providing a Standard Name raises the visibility of the metadata element, thereby increasing usage.
The proposed Standard Name, description and units are:
- Term: tidal_datum_epoch
- Description: A specific time period over which tide observations are taken and reduced to obtain mean values (e.g., mean lower low water, mean sea level) to provide reference levels (tidal datums) for subsequent water level measurements. The standard name is used for an ancillary scalar time variable that may be linked to any sea level variable referenced to a tidal datum. It is suggested that the scalar variable be set to the epoch midpoint. Bounds to specify the start and end of the epoch are mandatory. It is recommended that a plain-language description of the epoch (e.g. 1960-1978) be included in the long_name attribute.
- Units: Seconds
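The suggestions in this description (scalar value at the epoch midpoint, mandatory bounds, a plain-language long_name) can be sketched in Python as follows. The helper name `tidal_datum_epoch`, the wording of the generated label, and the reading of "1960-1978" as 1960-01-01 up to but not including 1979-01-01 are assumptions made for this example only.

```python
from datetime import datetime, timedelta, timezone


def tidal_datum_epoch(start, end):
    """Return (midpoint, bounds, long_name) for an epoch [start, end).

    Bounds are mandatory in the proposed description, so missing or
    unordered bounds are rejected here.
    """
    if start is None or end is None:
        raise ValueError("tidal_datum_epoch requires both bounds")
    if end <= start:
        raise ValueError("epoch end must be later than epoch start")
    midpoint = start + (end - start) / 2
    # Plain-language label such as "1960-1978"; the end bound is treated as
    # exclusive, so the last covered year is taken one second before it.
    last_year = (end - timedelta(seconds=1)).year
    long_name = f"tidal datum epoch {start.year}-{last_year}"
    return midpoint, (start, end), long_name


# Assumption: "1960-1978" means 1960-01-01 up to (not including) 1979-01-01.
utc = timezone.utc
midpoint, bounds, long_name = tidal_datum_epoch(
    datetime(1960, 1, 1, tzinfo=utc), datetime(1979, 1, 1, tzinfo=utc)
)
```

Rejecting missing bounds in the helper mirrors the "mandatory" wording of this original proposal; the generic reference_epoch definition agreed later in the thread relaxes that requirement.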