euroargodev / argopy

A python library for Argo data beginners and experts
https://argopy.readthedocs.io
European Union Public License 1.2

Which data mode and QC flags to use in user modes for BGC variables? #280

Closed gmaze closed 5 months ago

gmaze commented 1 year ago

To move forward with providing BGC variables in argopy, we need to determine the data mode and QC flag filter values for each of the 3 user modes.

For core parameters, this is documented here: https://argopy.readthedocs.io/en/latest/user_mode.html#id2

After some talks at IMEV, @Sauzede came up with the following table, which we need to agree on. Pinging the @euroargodev/bgc_argo_qc team and @catsch for comments, please. We added a ? next to QC flags where we were not sure...

🏄 expert mode

Returns all the Argo data without any postprocessing: no data mode or QC filtering is applied, as for core parameters.

🏊 standard mode

This mode simplifies the dataset, removes most of its jargon and returns a priori good data. In standard mode, only good or probably good data are returned, including real-time data that have been validated automatically but not by a human expert.

| BGC variable | Data mode | QC flags |
|---|---|---|
| BBP700 | R | [1?, 2?, 5, 8] |
| | A | [1, 2, 8] |
| | D | [1, 2, 8] |
| CHLA | R (not allowed) | - |
| | A | [1, 2, 5, 8] |
| | D | [1, 2, 5, 8] |
| DOXY | R (not allowed) | - |
| | A | [1, 2, 8] |
| | D | [1, 2, 8] |
| NITRATE | R (not allowed) | - |
| | A | [1, 2, 8] |
| | D | [1, 2, 8] |
| Ph | R (not allowed) | - |
| | A (never found) | - |
| | D | [1, 2, 8] |
| RADIOMETRY | R | [1, 2, 8?] |
| | A (never found) | - |
| | D | [1, 2, 8] |
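
One way to keep track of what still needs agreement is to hold this draft as a small lookup structure. The sketch below is only an illustration: `DRAFT_STANDARD` and `open_questions` are hypothetical names, not part of argopy, and the flags marked with ? are stored separately instead of being resolved one way or the other.

```python
# Hypothetical encoding of the draft standard-mode table (not argopy code).
# For each variable and data mode: (agreed flags, flags still marked "?").
# None stands for "not allowed" or "never found".
DRAFT_STANDARD = {
    "BBP700": {"R": ({5, 8}, {1, 2}), "A": ({1, 2, 8}, set()), "D": ({1, 2, 8}, set())},
    "CHLA": {"R": None, "A": ({1, 2, 5, 8}, set()), "D": ({1, 2, 5, 8}, set())},
    "DOXY": {"R": None, "A": ({1, 2, 8}, set()), "D": ({1, 2, 8}, set())},
    "NITRATE": {"R": None, "A": ({1, 2, 8}, set()), "D": ({1, 2, 8}, set())},
    "Ph": {"R": None, "A": None, "D": ({1, 2, 8}, set())},
    "RADIOMETRY": {"R": ({1, 2}, {8}), "A": None, "D": ({1, 2, 8}, set())},
}


def open_questions(rules):
    """Return sorted (variable, data mode, flag) triples still marked with '?'."""
    pending = []
    for variable, modes in rules.items():
        for data_mode, entry in modes.items():
            if entry is None:
                continue  # data mode not used for this variable
            _, undecided = entry
            pending.extend((variable, data_mode, flag) for flag in undecided)
    return sorted(pending)


print(open_questions(DRAFT_STANDARD))
# [('BBP700', 'R', 1), ('BBP700', 'R', 2), ('RADIOMETRY', 'R', 8)]
```

Listing the open entries this way makes it explicit that only the real-time rows for BBP700 and radiometry are still under discussion.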

🚣 research

This mode simplifies the dataset to its heart, preserving only data of the highest quality for research studies, including studies sensitive to small pressure and salinity bias (e.g. calculations of global ocean heat content or mixed layer depth).

| BGC variable | Data mode | QC flags |
|---|---|---|
| for all variables | R (not allowed) | - |
| | A (not allowed) | - |
| | D | [1, 8] |

grgdll commented 1 year ago

Hi Guillaume,

It's very exciting to have BGC Argo in ArgoPy!

For BBP700, data mode "R" should not be allowed, because we agreed that after the RTQC tests have been performed, BBP700 should be adjusted using a slope of 1 and an offset of 0 (zero). So all BBP data that went through RTQC should be adjusted.

However, until the (new) RTQC tests have been implemented at all DACs, there will be a significant number of profiles that remain non-adjusted.

So the decision for ArgoPy would be to:

  1. ignore the "transient" and act as if all DACs have implemented the new tests, with the risk that not all BBP700 data will appear in "standard mode"; or
  2. accept QC flags of 1 and 2 in data mode "R".

I'm inclined to support decision 1.

Best, grg

gmaze commented 1 year ago

Thanks @grgdll for your feedback! I'm poking @Sauzede and @catsch for comments as well.

In this first version of the table, @Sauzede indeed added a ? to QC=[1,2] for BBP in real time for the 🏊 standard user mode because there is a decision to be made for argopy.

As you point out, it's either the "theoretical" (but safer) choice (1) or the pragmatic choice (2).

I think that the argopy spirit of the 🏊 standard user mode would go with (1)

The current data mode census for BBP700 is:

[Figure: data mode census for BBP700; image not preserved]

We should get an idea of the fraction of real-time BBP700 data that should have been adjusted but were not, and that would therefore be excluded.
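
To put a number on that, one could count how many samples the two options treat differently. This is a minimal sketch, assuming each BBP700 sample is available as a (data mode, QC flag) pair; the function name and the toy input are made up for illustration and are not real census values.

```python
def share_dropped_by_option_1(samples):
    """Share of samples that option 2 would keep but option 1 would drop.

    These are the real-time ("R") samples flagged 1 or 2.
    `samples` is an iterable of (data_mode, qc_flag) pairs.
    """
    samples = list(samples)
    if not samples:
        return 0.0
    at_stake = sum(1 for mode, qc in samples if mode == "R" and qc in (1, 2))
    return at_stake / len(samples)


# Toy input only, not real BBP700 statistics
toy = [("R", 1), ("R", 2), ("R", 4), ("A", 1), ("D", 1), ("A", 2), ("D", 8), ("R", 1)]
print(share_dropped_by_option_1(toy))  # 0.375
```

Run against the real index or profile files, the same count would show how much BBP700 data the stricter choice (1) leaves out during the transition.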

grgdll commented 1 year ago

Something else to consider is that I am not 100% sure that the A-mode BBP data are obtained using the latest RTQC tests that the community has agreed on.

gmaze commented 1 year ago

> Something else to consider is that I am not 100% sure that the A-mode BBP data are obtained using the latest RTQC tests that the community has agreed on.

Well, that is beyond the scope of argopy and we should not take it into consideration.

HCBScienceProducts commented 1 year ago

Hi Guillaume,

great to have BGC data handling within argopy on the horizon.

Two comments: (1) on the selection for the 🏊 standard mode:

Would a generalized version like

| BGC variable | Data mode | QC flags |
|---|---|---|
| for all variables | R (not allowed) | - |
| | A | [1, 2, 5, 8] |
| | D | [1, 2, 5, 8] |

with an exception for radiometry, where

| BGC variable | Data mode | QC flags |
|---|---|---|
| Radiometry | R | [1, 2, 5, 8] |

cover all your bases? (For some transition period, BBP700 R mode could be added to the exception, until a larger fraction of DACs have implemented the RTQC procedures.)

This should also be easier to document and to convey to users than a parameter-specific listing, shouldn't it?

(2) on listings of parameter-specific rules:

Try to include the b-parameter CHLA_FLUORESCENCE in your thinking about the BGC extensions right away, as well as some more exotic ones like TURBIDITY, BBP532, BISULFIDE, or CDOM. The same goes for parameter duplicates (= replicate sensors), like DOXY and DOXY2, or BBP700 and BBP700_2.

(Side note: a relatively generic selection table, like the one suggested for expert mode, my suggestion above for standard mode, and the one suggested for research mode, would make these easier to handle, I'd presume.)
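
To illustrate the replicate-sensor point, here is a rough sketch of how duplicate names could be folded back onto a base parameter before any rule lookup. It assumes replicates are named as the base name followed by an optional underscore and digits, which is inferred only from the DOXY2 and BBP700_2 examples above; `BASE_PARAMETERS` and `base_parameter` are hypothetical names and the list is not exhaustive.

```python
import re

# Base names mentioned in this thread; illustrative, not an official list.
BASE_PARAMETERS = [
    "BBP700", "BBP532", "BISULFIDE", "CDOM", "CHLA", "CHLA_FLUORESCENCE",
    "DOXY", "NITRATE", "TURBIDITY",
]


def base_parameter(name):
    """Map a replicate-sensor name (e.g. DOXY2, BBP700_2) to its base parameter.

    Returns None when the name is not recognised.
    """
    if name in BASE_PARAMETERS:
        return name
    for base in BASE_PARAMETERS:
        # Assumed naming: base name + optional underscore + digits
        if re.fullmatch(re.escape(base) + r"_?\d+", name):
            return base
    return None


print(base_parameter("DOXY2"))     # DOXY
print(base_parameter("BBP700_2"))  # BBP700
```

With a generic selection table, a single rule set keyed on the base parameter would then cover the replicates as well.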

HCBScienceProducts commented 1 year ago

@gmaze another question:

Is there a filter for the profile DIRECTION in the different modes? E.g., some floats record DOXY on both ascending and descending profiles to verify (upcoming) time response corrections. Most casual users will be familiar with ascending profiles only?

| User mode | DIRECTION |
|---|---|
| 🏄 expert mode | all ('A' and 'D')? |
| 🏊 standard mode | only 'A'?? |
| 🚣 research mode | all ('A' and 'D')? |

catsch commented 1 year ago

I like @HCBScienceProducts' suggestion

| BGC variable | Data mode | QC flags |
|---|---|---|
| for all variables | R (not allowed) | - |
| | A | [1, 2, 5, 8] |
| | D | [1, 2, 5, 8] |

with an exception for radiometry, where

| BGC variable | Data mode | QC flags |
|---|---|---|
| Radiometry | R | [1, 2, 5, 8] |

but is a transition phase possible for BBP? At present, R, A and D data with flags 1, 2, 5 and 8 have been screened in different ways, but they are usable:

1. Thanks to @Sauzede, I think that the very bad data have been flagged QC 3 or 4 (for the majority of BBP700 in R mode, at Coriolis).
2. I think @grgdll is right: I don't think all BBP700 profiles in A mode follow the newly agreed RTQC, but the data have been screened.
3. DM data should also be revisited in some way, but they have been screened.

gmaze commented 1 year ago

Thank you all for your feedback! We can surely implement a procedure in argopy and revisit the technical choices once a year, following progress by the ADMT/DACs in adopting the recommended QC workflows.

If I understand correctly, the recommendations for argopy are to (temporarily) adopt the following choices:

🏄 expert mode

Returns all the Argo data without any postprocessing: no data mode or QC filtering is applied, as for core parameters.

🏊 standard mode

This mode simplifies the dataset, removes most of its jargon and returns a priori good data. In standard mode, only good or probably good data are returned, including real-time data that have been validated automatically but not by a human expert. The following table will be updated regularly to reflect how far the DACs have adopted and implemented the BGC variable procedures.

| BGC variable | Data mode | QC flags |
|---|---|---|
| for all variables but Radiometry | R (not allowed) | - |
| | A | [1, 2, 5, 8] |
| | D | [1, 2, 5, 8] |
| Radiometry | R | [1, 2, 5, 8] |

🚣 research

This mode simplifies the dataset to its heart, preserving only data of the highest quality for research studies, including studies sensitive to small pressure and salinity bias (e.g. calculations of global ocean heat content or mixed layer depth).

| BGC variable | Data mode | QC flags |
|---|---|---|
| for all variables | R (not allowed) | - |
| | A (not allowed) | - |
| | D | [1, 8] |

@catsch @Sauzede @grgdll @HCBScienceProducts please 👍🏻 or 👎🏻

gmaze commented 1 year ago

@HCBScienceProducts

> Try to include the b-parameter CHLA_FLUORESCENCE in your thought process for the BGC extensions already now...

yes, at some point we will need to consider the difference between synthetic and bio profile variables

HCBScienceProducts commented 1 year ago

Sorry, I probably wasn't as clear as I could have been:

github-actions[bot] commented 7 months ago

This issue was automatically marked as stale because it has not seen any activity in 90 days

gmaze commented 6 months ago

poke: @catsch @Sauzede @grgdll @HCBScienceProducts

Is the table above still OK as of June 2024?

Can you please 👍🏻 or 👎🏻? I'm about to work on this.

catsch commented 6 months ago

| BGC variable | Data mode | QC flags |
|---|---|---|
| for all variables | R (not allowed) | - |
| | A (not allowed) | - |
| | D | [1, 8] |

I think a QC_FLAG = 5 is missing for CHLA_ADJUSTED_QC in D

gmaze commented 6 months ago

> I think a QC_FLAG = 5 is missing for CHLA_ADJUSTED_QC in D

even for the "expert" mode ?

catsch commented 6 months ago

I thought that in expert mode all data are returned, whatever the QC. My comment is about research mode: CHLA_ADJUSTED in D mode with CHLA_ADJUSTED_QC = 5 is considered good data.

gmaze commented 6 months ago

My mistake ! I meant "research" mode

HCBScienceProducts commented 6 months ago

> I think a QC_FLAG = 5 is missing for CHLA_ADJUSTED_QC in D

Allowing QC_FLAG = 5 in data mode D could be done for all parameters in argopy's "research" mode, I would think.
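
The effect of that change can be sketched on a handful of samples. This is only an illustration of the rule as discussed here, with hypothetical names and toy (data mode, QC flag) pairs, not argopy's implementation.

```python
# Research mode keeps delayed-mode ("D") data only.
RESEARCH_DATA_MODES = {"D"}
RESEARCH_QC_BEFORE = {1, 8}     # flags in the earlier table
RESEARCH_QC_AFTER = {1, 5, 8}   # with QC flag 5 added, as suggested


def keep_for_research(data_mode, qc_flag, accepted_qc):
    """True when a sample passes the research-mode filter."""
    return data_mode in RESEARCH_DATA_MODES and qc_flag in accepted_qc


# Toy samples, not real data
toy = [("D", 1), ("D", 5), ("D", 8), ("D", 2), ("A", 1), ("R", 5)]
before = [s for s in toy if keep_for_research(*s, RESEARCH_QC_BEFORE)]
after = [s for s in toy if keep_for_research(*s, RESEARCH_QC_AFTER)]
print(before)  # [('D', 1), ('D', 8)]
print(after)   # [('D', 1), ('D', 5), ('D', 8)]
```

Only delayed-mode samples flagged 5 are gained; adjusted and real-time data stay excluded either way.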

gmaze commented 6 months ago

Also, following @catsch's comment, we can drop CDOM variables from standard & research modes:

| BGC variable | Data mode | QC flags |
|---|---|---|
| for all variables but Radiometry and CDOM | R (not allowed) | - |
| | A | [1, 2, 5, 8] |
| | D | [1, 2, 5, 8] |
| Radiometry | R | [1, 2, 5, 8] |
| CDOM | - | |

gmaze commented 6 months ago

So, the latest version of the table is:

🏄 expert mode

Returns all the Argo data without any postprocessing: no data mode or QC filtering is applied, as for core parameters.

🏊 standard mode

This mode simplifies the dataset, removes most of its jargon and returns a priori good data. In standard mode, only good or probably good data are returned, including real-time data that have been validated automatically but not by a human expert. The following table will be updated regularly to reflect how far the DACs have adopted and implemented the BGC variable procedures.

| BGC variable | Data mode | QC flags |
|---|---|---|
| for all variables but Radiometry and CDOM | R (not allowed) | - |
| | A | [1, 2, 5, 8] |
| | D | [1, 2, 5, 8] |
| Radiometry | R only | [1, 2, 5, 8] |
| CDOM | none | |

CDOM not included because not ready yet.

🚣 research

This mode simplifies the dataset to its heart, preserving only data of the highest quality for research studies, including studies sensitive to small pressure and salinity bias (e.g. calculations of global ocean heat content or mixed layer depth).

| BGC variable | Data mode | QC flags |
|---|---|---|
| for all variables | R (not allowed) | - |
| | A (not allowed) | - |
| | D | [1, 5, 8] |

CDOM and Radiometry not included because not ready yet.

catsch commented 6 months ago

🏄 expert mode

Returns all the Argo data without any postprocessing: no data mode or QC filtering is applied, as for core parameters.

🏊 standard mode

This mode simplifies the dataset, removes most of its jargon and returns a priori good data. In standard mode, only good or probably good data are returned, including real-time data that have been validated automatically but not by a human expert. The following table will be updated regularly to reflect how far the DACs have adopted and implemented the BGC variable procedures.

| BGC variable | Data mode | QC flags |
|---|---|---|
| Radiometry, BBP700 | R | [1, 2, 5, 8] |
| CDOM | none | |
| for all other variables | R (not allowed) | - |
| | A | [1, 2, 5, 8] |
| | D | [1, 2, 5, 8] |

CDOM not included because not ready yet.

🚣 research

This mode simplifies the dataset to its heart, preserving only data of the highest quality for research studies, including studies sensitive to small pressure and salinity bias (e.g. calculations of global ocean heat content or mixed layer depth).

| BGC variable | Data mode | QC flags |
|---|---|---|
| for all variables | R (not allowed) | - |
| | A (not allowed) | - |
| | D | [1, 5, 8] |

CDOM not included because not ready yet.
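
For reference, the tables in this last version could be transcribed into a default-plus-exceptions lookup along these lines. This is a hedged sketch, not argopy's implementation: all names are hypothetical, and it copies the tables literally, so Radiometry and BBP700 carry only the R row that is listed for them (whether the A and D rows also apply to them is not stated in the table).

```python
GOOD = frozenset({1, 2, 5, 8})

# Literal transcription of the tables above; names are hypothetical.
STANDARD_DEFAULT = {"A": GOOD, "D": GOOD}  # R not allowed
STANDARD_EXCEPTIONS = {
    # Only the R row is listed for these two; A and D are left out
    # here because the table does not state them (assumption).
    "RADIOMETRY": {"R": GOOD},
    "BBP700": {"R": GOOD},
    "CDOM": {},  # none: not ready yet
}
RESEARCH_DEFAULT = {"D": frozenset({1, 5, 8})}  # R and A not allowed
RESEARCH_EXCEPTIONS = {"CDOM": {}}  # not ready yet


def allowed_flags(user_mode, variable, data_mode):
    """Return the accepted QC flags, or None when nothing is filtered."""
    if user_mode == "expert":
        return None  # expert mode returns everything
    if user_mode == "standard":
        default, exceptions = STANDARD_DEFAULT, STANDARD_EXCEPTIONS
    elif user_mode == "research":
        default, exceptions = RESEARCH_DEFAULT, RESEARCH_EXCEPTIONS
    else:
        raise ValueError(f"unknown user mode: {user_mode!r}")
    rules = exceptions.get(variable, default)
    return rules.get(data_mode, frozenset())


def keep(user_mode, variable, data_mode, qc_flag):
    """True when a sample survives the filter of the given user mode."""
    flags = allowed_flags(user_mode, variable, data_mode)
    return True if flags is None else qc_flag in flags


print(keep("standard", "DOXY", "R", 1))    # False
print(keep("standard", "BBP700", "R", 2))  # True
print(keep("research", "CHLA", "D", 5))    # True
```

Keeping the exceptions in their own mapping means the yearly revision mentioned earlier would only touch a few entries as the DACs adopt the recommended procedures.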