Closed Acolohan closed 1 year ago
In my opinion, this is a data provider issue, not a DAC issue. It is also a great example of why the correct application of metadata standards is fundamentally important to the exchange of data across platforms. Here's why:
By design, the DAC NetCDF specification is extremely flexible. Other than a few CF metadata standards, the format was designed to be approachable in order to attract as many data providers as possible. Aside from the variables listed below, the DAC specification intentionally does not enforce any variable naming convention via a controlled vocabulary, such as the OceanGliders Parameter Usage Vocabulary. This decision was made to give our data providers as much flexibility as possible.
The DAC NetCDF specification only requires 7 variables to be included. The inclusion of these variables is required as they are fundamental to the proper serving of the data set by ERDDAP. The following variables are required to be included in submitted files:
Theoretically, a data provider could submit files containing only the variables listed above and the files would be considered valid with respect to serving via ERDDAP.
A fundamental service of the DAC is to provide T/S measurements that are ultimately released to the GTS, making them available for model assimilation. As such, we strongly recommend that the following variables also be included, although they are not required:
Descriptions of the variables are provided through the use of a variety of controlled vocabularies, primarily CF standard names and UDUNITS2. The use of community accepted controlled vocabularies allows computer programs to properly interpret variable descriptions and convert from one set of units to another.
CF conventions are a minimal set of NetCDF conventions that define a few attributes that must be included in order for a NetCDF file to be considered compliant. The creator of the NetCDF file is free to add any additional attributes to the file or to a variable as they see fit. Those additions are entirely up to the creator, however, and a computer does not use them to determine what quantity a variable actually holds. For the purposes of this ticket, the 2 attributes that we are concerned with are:
Assuming we agree on the above, let's take the density variable as an example:
The CF standard name database provides 3 possible standard names for a variable containing density measured in sea water. The following are the standard names and the definitions provided by the CF manual:
For the purposes of this discussion, we are only concerned with sea_water_density and sea_water_potential_density.
Regardless of which standard name is chosen, the canonical units of density are kg m-3; however, other units may be allowed as long as they are accepted values in the UDUNITS2 database.
The equation for calculating density can be found in the well-known and widely used Gibbs SeaWater (GSW) Oceanographic Toolbox, where it is called gsw_rho. The only difference between the two calculations is the pressure argument: sea_water_density is evaluated at the in-situ pressure, while sea_water_potential_density is evaluated at the reference pressure (p_ref). They are fundamentally 2 different parameters.
So, regardless of whether these 2 parameters are included as 2 different variables (i.e., density and potential_density) or only one of them is included as the density variable, it is incumbent upon the data provider to include the correct standard name, because a user of the data set has no other way to determine which parameter was actually calculated:
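A minimal sketch of that distinction follows. The helper `density_pair` is hypothetical (it is not part of GSW or of any DAC tooling) and takes the density function as an argument so that it runs without the toolbox installed. It assumes a function with the signature `rho(SA, CT, p)`, which is the form used by `gsw.rho` in the Python `gsw` package (Absolute Salinity in g/kg, Conservative Temperature in degrees C, sea pressure in dbar).

```python
from typing import Callable, Tuple

# Assumed signature: rho(SA, CT, p) -> density in kg m-3,
# matching gsw.rho from the Python "gsw" package.
RhoFn = Callable[[float, float, float], float]


def density_pair(rho: RhoFn, SA: float, CT: float, p: float,
                 p_ref: float = 0.0) -> Tuple[float, float]:
    """Return (in-situ density, potential density) in kg m-3.

    The same equation of state is evaluated twice: once at the measured
    pressure p and once at the reference pressure p_ref (both in dbar).
    """
    in_situ = rho(SA, CT, p)        # belongs under "sea_water_density"
    potential = rho(SA, CT, p_ref)  # belongs under "sea_water_potential_density"
    return in_situ, potential
```

With the toolbox installed, passing `gsw.rho` as the first argument would return both values. They differ whenever `p` differs from `p_ref`, which is why the standard name has to state which one was stored.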
density:standard_name = "sea_water_density"
density:units = "kg m-3"
or:
density:standard_name = "sea_water_potential_density"
density:reference_pressure = "REFERENCE PRESSURE"
density:units = "kg m-3"
or:
density:standard_name = "sea_water_density"
density:units = "kg m-3"
...
potential_density:standard_name = "sea_water_potential_density"
potential_density:reference_pressure = "REFERENCE PRESSURE"
potential_density:units = "kg m-3"
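The three patterns above can be checked mechanically. The sketch below is a hypothetical helper (`check_density_attrs` is not part of any DAC or ERDDAP tooling) and assumes a variable's attributes have already been read into a plain dict. It expects `reference_pressure` on potential density only because the examples above include it, and it does not validate the units string against UDUNITS2.

```python
from typing import Dict, List

# The two standard names discussed in this ticket.
IN_SITU = "sea_water_density"
POTENTIAL = "sea_water_potential_density"


def check_density_attrs(attrs: Dict[str, str]) -> List[str]:
    """Return a list of problems found in one density variable's attributes."""
    problems: List[str] = []
    name = attrs.get("standard_name")
    if name not in (IN_SITU, POTENTIAL):
        problems.append(f"standard_name {name!r} is not one of the expected density names")
    if not attrs.get("units"):
        problems.append("units attribute is missing")
    if name == POTENTIAL and "reference_pressure" not in attrs:
        problems.append("potential density has no reference_pressure attribute")
    return problems
```

An empty list means the attributes follow one of the patterns shown; it says nothing about whether the stored values were actually computed that way.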
The only way the DAC would be able to address this issue on a system level would be:
Since the seaglider data sets in question (example) have included:
density:comment = "Sea water potential density"
density:standard_name = "sea_water_density"
I am assuming the calculated parameter is actually potential density, not density, which would make the standard_name incorrect. We (@kerfoot) can contact the data providers and ask them to include the correct standard name of sea_water_potential_density.
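This kind of conflict between free text and the standard name can be flagged automatically, although not resolved. The function below is a hypothetical sketch (`looks_mislabeled` is not an existing DAC check) that assumes the attributes are available as a dict and that the free-text hint lives in `comment` or `long_name`.

```python
from typing import Dict


def looks_mislabeled(attrs: Dict[str, str]) -> bool:
    """True when free text mentions potential density but the
    standard_name claims in-situ density."""
    free_text = " ".join(
        attrs.get(key, "") for key in ("comment", "long_name")
    ).lower()
    return ("potential" in free_text
            and attrs.get("standard_name") == "sea_water_density")
```

A `True` result only says the metadata contradicts itself; deciding which attribute is right still requires asking the data provider.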
Definitely open to discussing this further, but my inclination is to have this discussion with Dr. Miles, contact the seaglider data providers, and then close as #wontfix.
@kerfoot Go for it!
Per personal conversation with Dr. Travis Miles:
Thanks, I think I agree with your post and strategy of notifying users. Do we have a sense of whether this is all Seaglider users,
or just some data providers? Not sure where/who has the most impact to communicate the error in names to or where/what
group would be the best to market the information to. Happy to discuss, maybe Dan Hayes is worth notifying or just putting it
out as a bulletin in the UG2 Slack? Happy to help however I can.
The take-home message is that this is a metadata issue that should be addressed by the data providers and/or glider operators.
I will follow up with Dan Hayes at CyprusSubsea to inquire about proper standard names.
closing.
From: 'Travis Miles'
Date: Thu, Jun 22, 2023 at 4:47 PM
Subject: density variable has two meanings the Glider DAC
To: _NOS Glider DAC Support@noaa.gov [glider.dac.support@noaa.gov](mailto:glider.dac.support@noaa.gov)
We came across a data variable issue in the glider DAC that seems ubiquitous at first glance. We noticed that all the SeaGliders have the variable “density” with “Sea water potential density” in the metadata, while Slocum and Spray gliders seem to have the “density” variable with “Sea Water Density” in the metadata, which actually represents “in-situ density.”
So in the DAC the variable “density” is being used for both “potential” and “in-situ” density. I think “potential” density should be a separate variable just like we have “potential_temperature” as a separate variable from “temperature.”
I don’t think it matters for model data assimilation, since temperature and salinity are assimilated rather than density, but for people using the data for validation/inter-comparison or science we probably need to be consistent across platforms. I believe the issue is that the default data out of the SeaGlider base station is potential density, while Slocum and Spray are doing in-situ density in post-processing.
Thanks, happy to discuss further.
Travis Miles, Ph.D.