euroargodev / sensor_metadata_json

Tools, schemas and example files for metadata for ocean platforms and sensors
European Union Public License 1.2

SENSORS and PARAMETERS with suffix 2 and higher for repeated sensors #19

Open BrianKingNOC opened 1 year ago

BrianKingNOC commented 1 year ago

Below is a statement of the issue from the October 2023 'covering notes' for the activity.

I think I have found a solution that means we don't have to carry the suffix 2 in the strings in the JSON files. It can be added only when writing meta.nc. This means the float JSON file will only have SENSOR and PARAMETER names that match the relevant NVS tables. And if ADMT decide to change how the meta.nc and profile.nc files handle this issue, then we won't need any changes in the JSON files.

I will present my solution soon when I've got some code working.

Here is the statement of the issue:

A FLOAT JSON file will be made by aggregating a single PLATFORM file and as many SENSOR files as there are sensors attached to a platform. BAK has prepared a MATLAB tool that will do this and write out a new float JSON file. There is a schema for float JSON files that references the platform, sensor, and vendor-specific JSON schema files.

In its simplest form, the SENSOR array for a whole float can be made by appending the SENSOR arrays for each sensor. However, if there is more than one CTD sensor, or more than one OPTODE, then the Argo convention is to append a suffix to the variable names for second and subsequent sensors. For an SBE41 followed by an RBR CTD, for example, the SENSORs might be: CTD_PRES, CTD_TEMP, CTD_CNDC, CTD_PRES2, CTD_TEMP2, CTD_CNDC2, TEMP_CTD_CNDC2.

See that the RBR sensor names all have a suffix 2, even though there is no plain ‘TEMP_CTD_CNDC’ because the SBE41 doesn’t report that quantity.
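The suffixing rule described above can be sketched in a few lines of Python. This is only an illustration, not the MATLAB aggregator mentioned earlier: the function name `suffix_repeated_sensors` and the `kind` label used to group instruments are hypothetical and are not fields from the actual schemas.

```python
from collections import defaultdict


def suffix_repeated_sensors(instruments):
    """Return a flat SENSOR list with suffixes for repeated instruments.

    `instruments` is a list of (kind, sensor_names) pairs in order.
    `kind` is a hypothetical grouping label (e.g. "CTD"), assumed to be
    supplied by the caller.
    """
    seen = defaultdict(int)  # number of instruments of each kind so far
    out = []
    for kind, names in instruments:
        seen[kind] += 1
        # The first instrument keeps plain names; the n-th gets suffix str(n).
        suffix = "" if seen[kind] == 1 else str(seen[kind])
        out.extend(name + suffix for name in names)
    return out


float_sensors = suffix_repeated_sensors([
    ("CTD", ["CTD_PRES", "CTD_TEMP", "CTD_CNDC"]),                   # SBE41
    ("CTD", ["CTD_PRES", "CTD_TEMP", "CTD_CNDC", "TEMP_CTD_CNDC"]),  # RBR
])
print(float_sensors)
```

Because the suffix is chosen per instrument rather than per name, every name from the second CTD gets a 2, including TEMP_CTD_CNDC, which has no unsuffixed counterpart.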

Note that at present, table R25 only has ‘CTD_PRES’ and not ‘CTD_PRES2’ so the checker could only validate ‘TEMP_CTD_CNDC2’ by being smart enough to remove the final character ‘2’ and searching for ‘TEMP_CTD_CNDC’.

Likewise R03 contains ‘DOXY’ but not ‘DOXY2’ so a formal checker has to be smart enough to remove trailing suffix ‘2’ in parameter names as well as sensor names.

Removing trailing numeric characters will not always work, because some names legitimately terminate in a number, e.g., DOWN_IRRADIANCE412.
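One possible way around this, sketched below, is to consult the vocabulary before stripping anything: accept the name as-is if it is listed, and otherwise peel off trailing digits one at a time until a listed base name is found. The function `resolve_base_name` and the `VOCAB` set are hypothetical; the set is a toy stand-in built from names mentioned in this thread, not the real content of R25 or R03.

```python
DIGITS = "0123456789"

# Toy stand-in for the vocabulary tables (assumption, not real table content).
VOCAB = {"CTD_PRES", "CTD_TEMP", "CTD_CNDC", "TEMP_CTD_CNDC",
         "DOXY", "DOWN_IRRADIANCE412"}


def resolve_base_name(name, vocab):
    """Split `name` into (base, suffix) where base is a vocabulary entry.

    Returns (name, "") if the name itself is listed, (base, suffix) if it is
    a listed name plus a repeat suffix of 2 or higher, and None otherwise.
    """
    if name in vocab:
        return name, ""
    cut = len(name) - 1
    # Try the shortest trailing digit run first, then longer ones.
    while cut > 0 and name[cut] in DIGITS:
        base, suffix = name[:cut], name[cut:]
        if base in vocab and not suffix.startswith("0") and int(suffix) >= 2:
            return base, suffix
        cut -= 1
    return None


for candidate in ["DOWN_IRRADIANCE412", "DOWN_IRRADIANCE4122",
                  "TEMP_CTD_CNDC2", "DOXY2", "DOXY1"]:
    print(candidate, "->", resolve_base_name(candidate, VOCAB))
```

This sketch would still be ambiguous if the vocabulary ever contained two entries where one is the other plus trailing digits, so it is a heuristic rather than a complete answer.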

This question of how to handle multiple sensors measuring the same parameter does not arise in single sensor files, only when they are aggregated into a whole float file for passing into an Argo meta.nc file. ADMT might choose to review how this is done going forwards. The total number of floats affected is probably quite small in the entire Argo fleet. A small number of floats have two CTDs, and a small number have two oxygen sensors.

Also we could choose to handle this differently in an aggregated float JSON file, and then map the repeat PARAMETER names into whatever ADMT decides for meta.nc and profile files.

SBS-EREHM commented 8 months ago

I believe your MATLAB FLOAT = PLATFORM + N x SENSOR aggregator handles adding suffixes where necessary.

I have left the current format checker alone, and support letting it complain about the suffixed SENSORS/SENSOR not being in the R25 vocab. That way the DAC can confirm: was the suffix intended or not? Otherwise, the checker has to have special-case code for R25 items that already end in a number (RADIOMETER_DOWN_IRR412, BACKSCATTERINGMETER_BBP532, etc.).
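For comparison, the "let it complain" approach amounts to a plain membership test with no suffix handling at all. The sketch below is not the project's actual format checker; `report_unknown_sensors` and `r25_subset` are hypothetical, and the set is a toy subset using names from this thread.

```python
def report_unknown_sensors(sensor_names, vocab_names):
    """Return the names missing from the vocabulary, with no suffix handling."""
    return [name for name in sensor_names if name not in vocab_names]


# Toy subset standing in for R25 (assumption, not the real table).
r25_subset = {"CTD_PRES", "CTD_TEMP", "CTD_CNDC", "TEMP_CTD_CNDC",
              "RADIOMETER_DOWN_IRR412"}

flagged = report_unknown_sensors(
    ["CTD_PRES", "CTD_PRES2", "TEMP_CTD_CNDC2", "RADIOMETER_DOWN_IRR412"],
    r25_subset,
)
for name in flagged:
    print(f"WARNING: {name} not in R25; confirm whether a repeat-sensor "
          "suffix was intended")
```

Names that end in a number but are listed pass untouched, and every suffixed name is surfaced for a human to confirm, so no special-case code is needed.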