cf-convention / vocabularies

Issues and source files for CF controlled vocabularies

Standard names: temperature differences (not anomalies) #90

Closed: larsbarring closed this issue 4 years ago

larsbarring commented 4 years ago

Proposer's name: Lars Bärring. Date: 2020-10-09.

There is currently no standard name that can be used to describe the well-established climate indices (also known as derived statistics or, sometimes, climate indicators) diurnal temperature range (DTR) and extremal temperature range (ETR). Diurnal temperature range is a well-established concept with wide applications. In the specific context of climate indices it is typically understood as the monthly mean of the daily ranges. ETR is the within-period temperature range, where the period is typically a month. While they could each be given their own specific standard name, I suggest a more generic approach:

Term: air_temperature_difference

Definition: Air temperature is the bulk temperature of the air, not the surface (skin) temperature. The cell methods specify how the difference is calculated. As this is a temperature difference the unit Kelvin equals the unit degree_C.

Unit: K

There are already three similar standard names: sea_water_sigma_t_difference (kg m-3), sea_water_sigma_theta_difference (kg m-3), and sea_water_temperature_difference (K), but how these differences are calculated is not clear from their definitions alone. The existing name change_over_time_in_air_temperature does not work either, because the temperature change is taken between the two times given by the time_bounds, and thus does not allow for the maximum difference in temperature between any two times within the time_bounds.
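To illustrate the distinction between a change over time and a range, here is a minimal Python sketch. The temperatures are made up, and the function names are hypothetical helpers, not part of CF or any library.

```python
def change_over_time(values):
    """End-minus-start difference: last value minus first value in the cell."""
    return values[-1] - values[0]


def value_range(values):
    """Range: largest difference between any two values (max minus min)."""
    return max(values) - min(values)


# Made-up air temperatures (K) sampled within one cell's time bounds.
temps_k = [280.0, 278.5, 284.0, 286.5, 281.0]

print(change_over_time(temps_k))  # 1.0
print(value_range(temps_k))       # 8.0
```

The two quantities agree only when the extremes happen to fall at the two ends of the interval.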

Using this standard name for the climate index DTR would imply the following cell methods: dtr:cell_methods = "range within days time: mean over days". Correspondingly, for ETR it would simply be etr:cell_methods = "range".
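As a rough sketch of what these two indices compute, the following Python example uses made-up daily maximum and minimum temperatures. It assumes that the daily range is the daily maximum minus the daily minimum, and that ETR is the highest daily maximum minus the lowest daily minimum in the period. The function names are hypothetical.

```python
from statistics import mean


def daily_ranges(tmax, tmin):
    """Per-day range: daily maximum minus daily minimum."""
    return [hi - lo for hi, lo in zip(tmax, tmin)]


def dtr(tmax, tmin):
    """DTR as a climate index: mean over days of the daily ranges."""
    return mean(daily_ranges(tmax, tmin))


def etr(tmax, tmin):
    """ETR: range over the whole period (assumed: max of tmax minus min of tmin)."""
    return max(tmax) - min(tmin)


# Made-up daily extremes (K) for a four-day period.
tmax = [285.0, 288.0, 290.0, 284.0]
tmin = [279.0, 280.0, 283.0, 278.0]

print(dtr(tmax, tmin))  # 6.75
print(etr(tmax, tmin))  # 12.0
```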

Note that in the definition I suggest alerting users and software developers to the fact that, in this particular context, Kelvin and degree Celsius are the same. I think this is prudent because CF relies on udunits for handling units, and on its own udunits does not have a mechanism to handle this special case. Whether a similar sentence should be added to all other relevant standard names could be the topic of another issue.

JonathanGregory commented 4 years ago

Dear @larsbarring

As you correctly say, the range of a quantity can be described by cell_methods. Therefore I don't think you need a new standard name in this case; you can use air_temperature with cell_methods of range to describe the air temperature diurnal range.

Best wishes

Jonathan

larsbarring commented 4 years ago

Dear Jonathan,

Yes, I see your point, and actually, in the beginning I thought along the same lines. But then I came to the conclusion that a temperature difference is something different from a temperature itself. That is, I think that they are not commensurate entities, and the fact that for a temperature difference 1 K = 1 C suggests to me that there is a case for distinguishing between them. I fear that conceptually extending the meaning of the existing standard name air_temperature to also cover air temperature differences may just dilute the precision of the term in a way that is not helpful in the long run.

Kind regards, Lars

sebvi commented 4 years ago

Hi Lars and Jonathan,

I don't think the term difference is appropriate for this use case, and using range seems to be the correct way. That said, a parameter with standard_name air_temperature and a cell_methods of range does not capture the very specific nature of DTR and ETR. Maybe the way to go is to propose two new standard names for these.

On a side note, I have always wondered why difference or anomaly are not valid cell_methods; I think it is extremely common to compute anomalies.

On a second side note, 1 °C is not equal to 1 K, but Δ°C = ΔK.
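A small Python sketch of why this holds: the 273.15 offset between the two scales cancels when two temperatures are subtracted. The values are made up and the helper name is hypothetical.

```python
import math

KELVIN_OFFSET = 273.15


def celsius_to_kelvin(t_c):
    """Convert an absolute temperature from degrees Celsius to kelvin."""
    return t_c + KELVIN_OFFSET


# Made-up daily minimum and maximum temperatures in degrees Celsius.
t_min_c, t_max_c = 5.0, 15.0

diff_c = t_max_c - t_min_c
diff_k = celsius_to_kelvin(t_max_c) - celsius_to_kelvin(t_min_c)

# The offset cancels, so the difference is the same in both units
# (compared with a tolerance because of floating-point rounding).
print(math.isclose(diff_c, diff_k))  # True

# Treating the difference as an absolute temperature applies the offset
# and gives a value that is no longer the difference.
print(math.isclose(celsius_to_kelvin(diff_c), diff_k))  # False
```

This is the special case that a converter working only from the unit string cannot detect on its own, because it has no way of knowing whether a value in degree_C is an absolute temperature or a difference.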

taylor13 commented 4 years ago

I think we've followed the principle that the standard_name is describing what is being measured (in this case temperature), whether or not what is reported is an absolute or relative measure (and regardless of the units used). With this understanding, the maximum temperature of the day measured relative to the minimum temperature of the day (i.e., the diurnal range) is still a temperature.

JonathanGregory commented 4 years ago

The standard name isn't the entire description of the values in the data variable; the cell_methods is another essential part of it. cell_methods indicates which statistic (mean, SD, variance, maximum, range, etc.) is being used to characterise the quantity named, which in general is variable within the cell. This doesn't change the standard name, which identifies the quantity whose variation is described.

Difference and anomaly aren't cell_methods because they involve comparison with another variable, in general, containing the standard for comparison. For a similar reason, correlation and covariance aren't cell methods, although they are statistical operations. However, the anomaly computed with respect to the mean of the same variable could be described as a cell method, I think; I don't recall that having been requested.
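The distinction can be sketched in Python as follows. The first function needs only the variable itself, while the second needs a separate reference variable. The data and function names are made up for illustration.

```python
from statistics import mean


def anomaly_from_own_mean(values):
    """Anomaly relative to the mean of the same variable: no other input needed."""
    m = mean(values)
    return [v - m for v in values]


def anomaly_from_reference(values, reference):
    """Anomaly relative to another variable holding the standard for comparison."""
    return [v - r for v, r in zip(values, reference)]


# Made-up temperatures (K) and a made-up reference climatology (K).
temps = [280.0, 282.0, 284.0, 286.0]
climatology = [281.0, 281.0, 283.0, 283.0]

print(anomaly_from_own_mean(temps))                # [-3.0, -1.0, 1.0, 3.0]
print(anomaly_from_reference(temps, climatology))  # [-1.0, 1.0, 1.0, 3.0]
```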

ngalbraith commented 4 years ago

It's true that cell methods allow you to use a standard_name, 'whether or not what is reported is an absolute or relative measure', but I think this is a big problem for data aggregation. In this case, the value is not an air temperature, and using that standard name may lead to misuse of the data. I realize that data users should look at cell methods, but the fact is that they sometimes do not, because air temp means, medians, mins and maxes are still air temperatures.

I'm having the same problem with winds that are relative to surface currents, and am hoping to propose a new name for those shortly. That's a different case, because there's another observed data variable involved, but still ... similar idea with implications for data misuse.

larsbarring commented 4 years ago

Dear all,

Thanks for all the comments. Let me just briefly respond to some of them:

@sebvi I think that in particular diurnal_temperature_range, as a general and widely used concept rather than the specific climate index, may be a good candidate for a separate standard name. But I suggest that this is a different issue.

@taylor13 I do not think that stating that a range is a relative measure quite captures its essence, because there is nothing like a [reasonably] fixed reference. I think that it is more of a measure of spread or variation, as opposed to an anomaly, which is indeed a relative measure.

@ngalbraith Thanks for pointing out the conceptual difference between cell methods like mean, median, minimum, maximum, mode, mean_of_upper_decile that are statistical measures of location, and cell methods like range, standard_deviation, root_mean_square that are statistical measures of variation or spread.

As these methods (and others that do not fit as measures of variation or spread) have long been collected under the CF umbrella of cell_methods, I plan to close this issue in a few days if there are no more comments.

Lars