nvs-vocabs / ArgoVocabs

A repository for the management of issues related to vocabularies managed by the Argo Data Management Team

R03: add min/max values to each parameter description #57

Open vpaba opened 1 year ago

vpaba commented 1 year ago

On 09-Feb-2023, the @nvs-vocabs/avtt team decided to add the min/max value of each parameter to the description of each R03 concept.

The min/max value information can be found in the Argo physical parameters list: core-Argo and BGC-Argo, February 3rd 2023 - linked through the ADMT website: http://www.argodatamgt.org/Documentation

A very similar issue had already been opened under R03: https://github.com/nvs-vocabs/R03/issues/5

gmaze commented 1 year ago

Awesome decision! As in #56, I think this contributes to the strategic objective of being able to create an Argo data model using web APIs only. It will be very useful for third-party APIs like argopy.

vpaba commented 1 year ago

Thanks @gmaze. Is there a specific structure you would recommend for adding this information to the descriptions (e.g. valid min: 0.5f, valid max: 8.5f)? Or is it irrelevant what structure we use, as long as it's consistent? Note that not all parameters have a valid min/max associated with them in the original table.
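
As a rough illustration of why a consistent structure matters for machine readers, the sketch below pulls the two values out of a description that follows the example pattern above. The function name `extract_valid_range`, the regular expression, and the sample description text are all hypothetical; nothing here reflects an agreed R03 convention.

```python
import re

# A number as written in the example: optional sign, digits, optional
# fractional part, optional trailing "f" (e.g. "0.5f", "12000.f", "-2.5f").
NUMBER = r"[-+]?\d+\.?\d*f?"

# Assumed structure, taken from the example "valid min: 0.5f, valid max: 8.5f".
RANGE_RE = re.compile(rf"valid min:\s*({NUMBER}),\s*valid max:\s*({NUMBER})")


def extract_valid_range(description):
    """Return (valid_min, valid_max) as raw strings, or None if absent."""
    match = RANGE_RE.search(description)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Hypothetical description text, not copied from R03.
sample = "Example parameter description. valid min: 0.5f, valid max: 8.5f"
print(extract_valid_range(sample))
# A parameter without a range in the source table simply yields None.
print(extract_valid_range("A parameter with no range in the source table."))
```

Returning the raw strings keeps the values exactly as written, and returning `None` covers the parameters that have no valid min/max in the original table.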

gmaze commented 1 year ago

Yes, an attribute with a standard string indicating how many digits should be displayed before and after the decimal point looks good. I would be curious to know which parameters have no valid min/max... And maybe the number of digits after the decimal point would further indicate the resolution of the parameter.

vpaba commented 1 year ago

Thanks @gmaze. The original list can be found on the ADMT website: http://www.argodatamgt.org/Documentation, under 'Argo data formats' in the linked Excel file named 'Argo physical parameters list: core-Argo and BGC-Argo, February 3rd 2023'. Columns H and I have valid_min and valid_max values against some of the parameters.

Good point about the digits after the decimal point as an indicator of the resolution - wondering if this is more sensor-dependent rather than an intrinsic attribute of the parameter? If so, we could move this information to the relevant sensor models in R27 maybe?

gmaze commented 1 year ago

I suspect the string format used in this table goes back to Fortran encoding or some legacy format, as pointed out by @apswong in #56. I don't know how to interpret something like "12000.f" for PRES precisely. Is it 12000 before the decimal point and nothing after? The PRES resolution is "1.f". For TEMP, the resolution is "0.001f" and the valid min is "-2.5f". The "f" is somewhat confusing to me. Maybe @apswong or @tcarval can recall what the "f" is for?

> wondering if this is more sensor-dependent rather than an intrinsic attribute of the parameter?

Good point. I guess it's determined by the sensor, but it can be modified in the encoding of the parameter.

For core, all parameters have a "resolution" attribute:

        JULD:resolution = 0. ;
        JULD_LOCATION:resolution = 0. ;
        PRES:resolution = 1.f ;
        PRES_ADJUSTED:resolution = 1.f ;
        PRES_ADJUSTED_ERROR:resolution = 1.f ;
        TEMP:resolution = 0.001f ;
        TEMP_ADJUSTED:resolution = 0.001f ;
        TEMP_ADJUSTED_ERROR:resolution = 0.001f ;
        PSAL:resolution = 0.001f ;
        PSAL_ADJUSTED:resolution = 0.001f ;
        PSAL_ADJUSTED_ERROR:resolution = 0.001f ;

If resolution is added in R03, then the valid min and max values could be interpreted correctly, without ambiguity.
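
To illustrate how a resolution value could remove that ambiguity, here is a minimal sketch that derives the number of decimal places implied by a resolution and uses it to print a valid min or max. The resolution and range values are the ones quoted in this thread (TEMP and PRES); the helper names are hypothetical and this is not how any Argo tool is known to work.

```python
from decimal import Decimal


def decimals_from_resolution(resolution):
    """Digits after the decimal point implied by a resolution such as '0.001f'."""
    # Drop the trailing "f" used in the parameter table, if present.
    cleaned = resolution.strip().rstrip("fF")
    exponent = Decimal(cleaned).normalize().as_tuple().exponent
    # A resolution of 1 (or 10, 100, ...) implies no decimal digits.
    return max(0, -exponent)


def format_with_resolution(value, resolution):
    """Render a value with as many decimals as the resolution implies."""
    digits = decimals_from_resolution(resolution)
    return f"{value:.{digits}f}"


print(format_with_resolution(-2.5, "0.001f"))   # TEMP valid min
print(format_with_resolution(12000.0, "1.f"))   # PRES value from this thread
```

With this reading, "-2.5f" at a resolution of "0.001f" is displayed with three decimals, while "12000.f" at a resolution of "1.f" is displayed with none.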

apswong commented 1 year ago

"%f" is a format specifier in C for floating point numbers. These control how the data will appear in standard output (e.g. ncdump, printf), e.g. "%7.1f" means the floating point number will appear in 7 characters, with 1 digit after the decimal point. In the Argo netCDF files, these are listed under a separate specification, e.g. PRES:C_format = %7.1f"

Having the "f" (not "%f") after the min/max specifications means they are floating point numbers (instead of integers). e.g. "12000.f" (not "%12000.f") means the floating point value of 12000; "8.5f" (not "%8.5f") means the floating point value of 8.5. I don't know the history behind using "f" in the min/max specifications. I hope @tcarval can tell us.
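
Under that reading, the values can be turned into ordinary floats by dropping the trailing "f". The sketch below is one possible way to do it; `parse_f_literal` is a hypothetical helper, not something taken from the GDAC FileChecker or any other Argo tool.

```python
def parse_f_literal(text):
    """Convert a value such as '12000.f' or '-2.5f' to a Python float."""
    cleaned = text.strip()
    # Drop a single trailing "f" marking the value as floating point.
    if cleaned.endswith(("f", "F")):
        cleaned = cleaned[:-1]
    return float(cleaned)


# Values quoted in this thread.
for raw in ("12000.f", "8.5f", "-2.5f", "0.001f"):
    print(raw, "->", parse_f_literal(raw))
```

A consumer can keep the original string for display and use the parsed float for comparisons.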

R03 should record the min/max values faithfully as they appear in the parameters xlsx, since they are used by the GDAC FileChecker.

The suggestion to add resolution to R03 is a good one, but it should be initiated by @catsch, who is in charge of the parameters xlsx. Once that's agreed by the ADMT and added to the parameters xlsx, then it can be added to R03.

Lastly, these min/max and resolution specifications are parameter-dependent, but not necessarily sensor-dependent. Hence I would advise against adding them to the sensors table R27.

gmaze commented 1 year ago

Thanks @apswong for the explanation!