nvs-vocabs / P01

Repository to manage issues related to the BODC P01 Vocabulary

Missing P01 codes for nutrient parameters like PO4-P, NO3-N, NO2-N, NH4-N, SiO3-Si and H2S-S? (BODCNVS-1518) #166

Open HjalteParner opened 2 years ago

HjalteParner commented 2 years ago

When reporting nutrients such as nitrate (NO3), at least to ICES, most people tend to report nitrate nitrogen (NO3-N) while at the same time using the P01 code https://vocab.nerc.ac.uk/collection/P01/current/NTRAZZXX!

However, looking at the label, definition and CAS number/link of the NTRAZZXX P01 code, it seems that this code denotes just nitrate and not nitrate nitrogen. It does have a relation to the P09 code https://vocab.nerc.ac.uk/collection/P09/current/NTRA/, which does specify that it is Nitrate Nitrogen. However, I cannot find an equivalent P01 code!

To my knowledge, it's widely used practice to both store and report nitrate as nitrate nitrogen. If reporting is done in a molar unit such as the common umol/l, the issue is not that big, as the value will be the same. However, if reporting is done in a weight unit, the issue is tremendous, as nitrogen makes up only 22.6 percent of the mass of the nitrate ion. It gets catastrophic when nitrate, nitrite and ammonium are further combined into DIN, because the same problem then applies to every component, not only to nitrate as exemplified above.
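The 22.6 percent figure can be checked with a few lines of Python. This is only a minimal sketch: the rounded standard atomic masses and the helper names are my own assumptions for illustration, not anything defined in P01.

```python
# Rounded standard atomic masses in g/mol (assumed values for illustration).
ATOMIC_MASS = {"N": 14.007, "O": 15.999, "H": 1.008}

# Elemental composition of the three ions that make up DIN.
IONS = {
    "NO3": {"N": 1, "O": 3},  # nitrate
    "NO2": {"N": 1, "O": 2},  # nitrite
    "NH4": {"N": 1, "H": 4},  # ammonium
}


def molar_mass(composition):
    """Sum the atomic masses of all atoms in the ion."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


def nitrogen_mass_fraction(ion):
    """Fraction of the ion's mass that is nitrogen."""
    return ATOMIC_MASS["N"] / molar_mass(IONS[ion])


for name in IONS:
    print(f"{name}: nitrogen is {nitrogen_mass_fraction(name):.1%} of the ion mass")
```

Because the fraction differs for each ion, a mass-based DIN sum mixes three different errors if the "as ion" and "as nitrogen" conventions are confused.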

https://www.ices.dk/data/tools/Pages/Unit-conversions.aspx

You could say that people should just convert their nitrate nitrogen values into nitrate and then use the existing P01 code (or one of them) when reporting. However, this might very well lead to misunderstandings. I have one recent example from one of our submitters, //SDN:LOCAL:NO3-NICES:P01::NTRAZZXXICES:P06::UGPL, where it's clearly a mistake to use the NTRAZZXX P01 code with the unit ug/l. However, a machine won't catch this reporting error, as 'NO3-N' is just a local label, which is the very reason for mapping it to a P01 code in the first place.

I think not having the highlighted nutrients as P01 codes is a serious issue that has already been shown to lead to misunderstandings with potentially serious impact.

What do you reckon is the solution to prevent such mistakes?
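As a stopgap, an ingestion step could at least flag the suspicious combination for a human to look at. The sketch below is hypothetical: the function name and the two code sets are assumptions made for illustration, with only NTRAZZXX and UGPL taken from the example above. It cannot tell whether the values are really "as nitrogen", only that an ion-based code has been paired with a mass unit.

```python
# P06 unit codes treated as mass concentrations. UGPL (ug/l) is from the
# submitter example; any further codes would have to be added by the user.
MASS_UNIT_CODES = {"UGPL"}

# P01 codes whose definition describes the whole ion rather than the element.
# Only NTRAZZXX is taken from this thread; the set is otherwise an assumption.
ION_BASIS_P01_CODES = {"NTRAZZXX"}


def needs_review(p01_code, p06_code):
    """Return True when an ion-based P01 code is paired with a mass unit.

    This is a heuristic flag for manual checking, not proof of an error.
    """
    return p01_code in ION_BASIS_P01_CODES and p06_code in MASS_UNIT_CODES


# The mapping from the submitter example gets flagged.
print(needs_review("NTRAZZXX", "UGPL"))

# A molar unit is not flagged (UPOX is assumed here to be the P06 code for umol/l).
print(needs_review("NTRAZZXX", "UPOX"))
```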

Br. Hjalte

roy-lowry commented 2 years ago

This is a problem that I was trying to address for 20 years. It is the result of a collision between the research community who deal with nutrients as molar quantities and the monitoring community who deal with nutrients as elemental masses. My view has always been that the only safe way to deal with the situation is to standardise with one convention and as BODC were primarily dealing with the research community P01 was set up for molar quantities. I still feel strongly that this is the correct solution. If we add extra P01 codes to cover nutrients as elemental masses we might possibly prevent errors by the monitoring community who don't standardise but in the process we create a massive potential for error in aggregation software unless it has been specifically programmed to recognise the two conventions and their semantic descriptions. I sincerely believe that education of data originators is the way to solve this problem. Cheers, Roy.

HjalteParner commented 2 years ago

Good to hear from you, Roy!

I fully appreciate and acknowledge your comment and agree that the safest, simplest and most reliable way to deal with this issue would be for everybody to use molar units.

I guess what you are suggesting indirectly is to force molar units when reporting these parameters?

If that's the case then it might be viable to get this into the description of the P01 codes to prevent misunderstandings?

To my knowledge it's not forced in the SeaDataNet / EMODnet community either?

...an alternative option would then be to relabel all these P01 codes as Nitrate Nitrogen, Phosphate Phosphorus etc., in which case it wouldn't matter whether submitters use molar or weight units, as it would be straightforward to convert weight into molar units. ICES DOME is actually doing this. However, I probably agree that this solution might be more counterintuitive than forcing molar units - even though I'm mainly for freedom most times.

roy-lowry commented 2 years ago

Hi Hjalte,

Yes I'm still lurking in the background.

Thinking overnight I recalled a cloudy memory of this issue arising whilst the EMODNET nutrient products were being generated using automated aggregation of SeaDataNet data using ODV. Data were coming in tagged with a code of NTRAZZXX and units of ug/l, but the data values were nitrate expressed as nitrogen. I vaguely remember a lot of discussion between Reiner and various people as to how these could be automatically translated into the units required for aggregation (umol/l). I think the conclusion was to put pressure on the originators to use molar quantities, but not sure if this got as strong as 'forced'.
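To illustrate why the translation could not be automated safely, here is a minimal sketch of the two possible conversions from ug/l to umol/l. The function names and the rounded molar masses are assumptions for illustration; this is not the procedure used for the EMODNET products.

```python
N_ATOMIC_MASS = 14.007    # g/mol, nitrogen (rounded, assumed value)
NO3_MOLAR_MASS = 62.004   # g/mol, nitrate ion: 14.007 + 3 * 15.999


def ug_per_l_as_n_to_umol_per_l(value):
    """Convert nitrate expressed as nitrogen (ug N/l) to umol/l."""
    return value / N_ATOMIC_MASS


def ug_per_l_as_no3_to_umol_per_l(value):
    """Convert nitrate expressed as the whole ion (ug NO3/l) to umol/l."""
    return value / NO3_MOLAR_MASS


reported = 140.07  # ug/l, tagged NTRAZZXX, convention unknown to the software

as_nitrogen = ug_per_l_as_n_to_umol_per_l(reported)   # about 10 umol/l
as_ion = ug_per_l_as_no3_to_umol_per_l(reported)      # about 2.26 umol/l
print(f"if 'as N': {as_nitrogen:.2f} umol/l, if 'as NO3': {as_ion:.2f} umol/l")
```

The same reported number differs by a factor of roughly 4.4 depending on the convention, and nothing in the code and unit pair tells the software which function to apply.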

I agree the current definitions of nutrient parameters could be improved to make it clearer - maybe replace 'Nitrate may be expressed in terms of mass or quantity of substance.' by 'Nitrate may be expressed in terms of mass (of nitrate, not nitrogen) or quantity of substance. Advice is to standardise mass concentration of nitrate expressed as nitrogen into the molar equivalent.'

Let's see what @gwemon thinks.

mlipizer commented 2 years ago

Dear all, I share Hjalte's view that "it's widely used to report nitrate as nitrate nitrogen", regardless of whether it is expressed as umol/l or as mass. If I am not wrong, P09 was used "before" P01 (but I started working on this when P01 was already in place), and P09 correctly indicates NTRA | NITRATE (NO3-N) CONTENT (likewise for phosphate and silicate: PHOS | PHOSPHATE (PO4-P) CONTENT; SLCA | SILICATE (SIO4-SI) CONTENT). Conversely, P01 seems to have lost this information, as all P01 codes related to, e.g., nitrate contain only: nitrate {NO3- CAS 14797-55-8} and not N-NO3.

In my experience, nitrate content represents nitrogen from NO3. However, the current P01 preflabel does not provide this information.

I confirm that EMODnet proposes "preferred units" in order to collect harmonized data. Kind regards Marina


roy-lowry commented 2 years ago

Dear Marina,

The view that 'nitrate is reported as nitrate nitrogen' very much depends upon the community in which one works. My data management experience was based in international projects like JGOFS and WOCE where this practice was regarded as abhorrent. I was involved in some very heated meetings on the subject in Plymouth in the late 80s between 'old guard monitoring' and 'new scientific management' (all involved shall remain anonymous!). Suffice to say all PML's nutrient data holdings were converted from mg/l to umol/l.

To be historically accurate, P01 has its origins in the UK during the late 1970s and so preceded P09, which was developed by Catherine Maillard for the MEDATLAS community a decade later. There was a meeting in the late 1980s in Dublin where I proposed harmonising both vocabularies, but this was rejected because BODC codes were 8 bytes long whereas MEDATLAS codes (the basis of P09) were only 4 bytes long, and the fixed data formats couldn't accept variable-length parameter codes. What a missed opportunity...

So, we have our current problem exacerbated by two things. First, there is a lack of rigour in P01 compared to CF, which separates mole concentration from mass concentration. Secondly, in my view quite rightly, the developers of metadata best practice condemned the incorporation of semantics into units of measure such as mg N/l.

Given the realities of our current situation I think the best we can do is clarify P01 descriptions (NOT preflabel as that would cause chaos from a tide of misunderstanding in the research community) and do everything we can to get nutrient "preferred units" set to molar quantities such as umol/l. Please can EMODNET help here. That way the difference between 'nitrate' and 'nitrogen in nitrate' becomes an issue of semantic convention rather than a real difference in data values that will inevitably cause future errors in data aggregation.

Cheers, Roy.

gwemon commented 2 years ago

Thank you @HjalteParner @roy-lowry @mlipizer for your comments and sorry for the long silence. It's been at the back of my mind for some time and this is not an easily resolved issue! I completely agree with @roy-lowry that there is a very high risk in introducing concepts such as "PO4-P, NO3-N, NO2-N, NH4-N, SiO3-Si and H2S-S" in S27 in order to build new P01 codes. We have a precedent in S27 for identifying the element within a molecule, or the molecule measured to quantify an element, for some chemical substances, e.g. for the organotins (expressed as cations and expressed as tin) or for elements related to the mineral composition of geological samples (http://vocab.nerc.ac.uk/collection/S27/current/CS004360/). These use the expression "expressed as", which is also used in CF standard names, but the decision was driven by the fact that there would be ambiguity even for the human brain about what the result value actually represented if we just said "dibutyltin". We also took that route recently when creating codes for model output CO2 expressed as carbon. However, it is not a practice I am very comfortable with, and I would welcome alternative ways to manage these kinds of distinction. That being said, I agree that the P01 codes are currently ambiguous and that we need to find a way to address this ambiguity going forward, particularly for machine-actionable operations. I will take time to review the issues and your suggestions, and continue consulting broadly before doing this. Please continue commenting if you have questions, suggestions, views or preferences related to this. Many thanks.