User question: Why are detailed QC flags missing from certain datasets?

kbailey-noaa commented 1 year ago

~8/17/23 - Mike Crowley was looking for detailed flags in the SBU02 live data. "I’m wondering if the process is running, but not embedding the flags into the data as I can’t find them." Mike didn't find spike or flat line tests running for RU40, RU39, UD_476.

Don: likely more of a timing issue of the running processes, as the data that you reported missing on that first glider below did eventually show up in erddap.

Leila: Per Mike’s selection of gliders, here is what I see on ERDDAP on any of these gliders’ datasets:

ru40-20230629T1430 --- QARTOD >> test results available.
ru40-20230817T1522 --- QARTOD >> test results missing.

ru39-20230420T1636 --- QARTOD >> test results available.
ru39-20230817T1520 --- QARTOD >> test results missing.

ud_476-20150917T1400 --- QARTOD >> test results missing.
ud_476-20180116T1500 --- QARTOD >> test results available.
ud_476-20180501T1640 --- QARTOD >> test results available.
ud_476-20181112T1258 --- QARTOD >> test results available.
ud_476-20220305T1558 --- QARTOD >> test results available.
ud_476-20220412T1700 --- QARTOD >> test results available.
ud_476-20230113T1454 --- QARTOD >> test results available.
ud_476-20230822T2018 --- QARTOD >> test results missing.

SBU02-20230222T2013-delayed --- QARTOD >> test results missing.
SBU02-20230420T1743-delayed --- QARTOD >> test results missing.

sbu02-20230222T2011 --- QARTOD >> test results available only for pressure, although the other main variables are listed as QCed.
sbu02-20230420T1741 --- QARTOD >> test results available only for pressure, although the other main variables are listed as QCed.
sbu02-20230803T1544 --- QARTOD >> test results available. [This wasn’t available last week]

Mike: Seems like real time QC is not working, but after recovery, QC is applied?

(Please investigate and provide a response.)

benjwadams commented 1 year ago

I haven't yet looked at individual datasets in question, but the legacy behavior is not to run QARTOD if any user supplied QC is found. This behavior is going to go away, but can possibly explain some of these.
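A minimal sketch of the legacy rule as described in this comment, for readers following the thread. The `_qc` suffix used to detect provider-supplied QC and the function name are assumptions made for illustration; this is not the DAC's actual code.

```python
def should_run_qartod_legacy(variable_names, qc_suffix="_qc"):
    """Legacy rule as described above: skip QARTOD if any user QC is found.

    Detecting user QC by a "_qc" name suffix is an assumption for this sketch.
    """
    has_user_qc = any(name.endswith(qc_suffix) for name in variable_names)
    return not has_user_qc
```

Under this rule a single provider QC variable is enough to suppress every QARTOD test for the dataset, which would be consistent with some of the "missing" cases listed above.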

kbailey-noaa commented 1 year ago

@benjwadams This behavior shouldn't go away...but can you pls explain more what you mean by that? User supplied QC is sometimes more robust, and we (IOOS) have been careful to avoid clobbering or overwriting user QC with DAC QC, as a policy. I believe this point came up as we were developing IOOS Certification guidance. We don't want to end up discouraging providers from submitting their data because of concerns their QC flags may get overwritten. QARTOD is considered the 'minimum level' of QC. If no QC, then it must be applied.

kerfoot commented 1 year ago

There's clearly some misunderstanding on this issue. I am happy to discuss the details of the behavior that Ben is referencing in order to provide a clear picture of how things operate now so that we all have a good understanding of the process before making any decisions going forward.

User supplied qc should never be overwritten. Local knowledge always trumps DAC knowledge.

However, in order to achieve the big picture goal of applying the results of qc to downstream release (i.e., GTS), we have to implement a standard set of tests and reserve a standard list of variable names that these results will be stored in, so that those test results are available for consideration in releasing. We can't, practically speaking, inspect every file from every submitted data set to determine whether local-knowledge QC was included and treat those files on a case by case basis. It's not feasible or scalable. We also cannot expect NDBC to examine each data set for the inclusion of different sets of flags and apply those flags before releasing to GTS. This is also not feasible or scalable. I would hope that we are all in agreement on this as well.

The behavior that Ben is referencing should go away and be replaced with an across the board application of QARTOD algorithms and storage of the results in the variables that are reserved for use by the DAC. QARTOD should not be the minimum standard. It should be the standard by which other QC algorithms are judged. If implemented correctly, with the appropriate values, QARTOD can greatly increase the confidence in our data sets. I don't see any reason why we should consider this a minimum level of QC.

The variables referenced in point 4 of this issue should only be served in an ERDDAP data set if they are included by the data provider. The issue that we have is that these 15 variables are included in all data sets, regardless of whether they are actually included in the files. This is due to an out-of-date process by which the datasets.xml file is generated and has been noted many times before. Inclusion of variables not actually included in the data set is misleading and confusing for end users, especially as it relates to QC. I would hope everyone is in agreement on this.

I recommend that we make it clear to our data providers that we are reserving a set of variable names for our QC tests that will be overwritten if included by the data providers. I have looked at all 1694 data sets that are currently in the DAC and, to this date, no data providers are currently using the variables on our restricted list. I can also state with confidence that the vast majority of our data providers do not perform their own QC. In fact, I can think of only 1 provider that does so. Most of them depend on the DAC for this. So I think this is a very reasonable request by the DAC that does not "discourage" data providers from providing their own QC and including it in the files they submit. All variables in files submitted by our data providers will be available via the ERDDAP end point. I was going to provide that list from my temporary updated wiki that I was working on at ioosngdac, but I can no longer find the markdown documents after the repo merge. Hopefully, they are still there somewhere and have not been permanently deleted, as a lot of work went into the creation of those documents.

Bottom line is that we do not clobber, and have never clobbered, any data submitted by our data providers. This was a core recommendation that was adopted in the original design of the DAC over 13 years ago. We do, however, need to apply a minimum (for now) set of QC tests to all data sets (real-time or delayed) that we can use to address the bigger issue of ensuring the highest quality data is available for assimilation into the operational and hindcast forecasts. And we need to document it and be able to clearly explain and stand behind our process for doing so.

kbailey-noaa commented 1 year ago

@kerfoot Thanks very much for this explanation, and context. It sounds good to me! Is RPS aware of this way forward too?

Ah, semantics problem re: my use of minimum. Our Certification guidance indicates we require QARTOD at a minimum - meaning _at least_ - not as in lowest standard, and if users want to add to those must-have QARTOD tests then fine. This was less about glider DAC and more about other in situ datasets. Regardless, I agree.

NDBC won't read detailed flags anyway, and we established the aggregate QC flag as a solution to get them to at least read summary flags. Issue #277 addresses this.

There is a lot of good content here for a QC plan. Where is the proposed QC plan being developed? I know there is a push for RPS to supply info on current QC processes, and additional details like these functional requirements and the high-level approach you outlined here are needed in our documentation as well. Is this the plan, in development? https://docs.google.com/document/d/1rh5niGdqbPB0eIkJn9CO5fZmeknuHLrxVL0qBpoOEa8/edit

kerfoot commented 1 year ago

@kbailey-noaa We have added a weekly standing technical GDAC call, Thursdays at 10am. @benjwadams, @leilabbb and @kerfoot attended today and had a very thorough discussion on developing an overall qc plan. We are in general agreement, made significant headway and worked on making sure that the overall plan will be broken down into manageable tasks/issues that are clearly defined with a tangible result. I just filed issue #278 and am in the process of filing 2 additional issues regarding the scheduling of rt and delayed mode data sets.

Although NDBC will only use the agg flags (#277), we certainly want to make sure that all of the individual tests that resulted in the aggregate are also available to all downstream users in order to provide an audit trail.
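To illustrate how an aggregate flag relates to the individual tests behind it, here is a minimal sketch. It assumes the usual QARTOD flag values (1 good, 2 not evaluated, 3 suspect, 4 fail, 9 missing) and assumes the aggregate takes the most severe result per observation; the precedence order and all names are illustrative, not necessarily what the DAC implements.

```python
GOOD, UNKNOWN, SUSPECT, FAIL, MISSING = 1, 2, 3, 4, 9

# Assumed precedence, lowest to highest: a FAIL from any test wins.
PRECEDENCE = [MISSING, UNKNOWN, GOOD, SUSPECT, FAIL]


def aggregate_flags(test_results):
    """Combine per-test flag lists into one aggregate flag per observation.

    test_results maps a test name to a list of flags; all lists must have
    the same length. The per-test lists are left untouched so they can be
    stored next to the aggregate as the audit trail.
    """
    rank = {flag: i for i, flag in enumerate(PRECEDENCE)}
    columns = zip(*test_results.values())
    return [max(column, key=rank.__getitem__) for column in columns]
```

Storing both the input lists and the returned aggregate lets a downstream user see which test drove each summary flag.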

The plan is to create the precisely defined & manageable issues, tag them as QC, organize them into a coherent overall plan/approach and then let @Acolohan and @DonaldMoretti decide how/if/who they get tasked.

benjwadams commented 1 year ago

Likely related to https://github.com/ioos/glider-dac/issues/279 for timely application of QC to datasets.

However, metadata for ERDDAP aggregations is taken from the latest .nc file in a directory, which could lead to missing variables for some datasets. We need an efficient way to take the union of all variables across the profile files in each deployment folder without having to open every file each time. This doesn't just apply to QC variables, as @kerfoot has mentioned that certain sensor packages, and hence variables, may be enabled or disabled during the course of a deployment. This may merit a separate issue.
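One possible approach, sketched under assumptions: cache each file's variable names keyed by modification time and size, so a file is reopened only when it changes. The function name, the cache layout, and the caller-supplied `read_variable_names` reader are all hypothetical; nothing here reflects existing DAC code.

```python
import os


def union_of_variables(paths, read_variable_names, cache):
    """Return the union of variable names across the given files.

    cache maps path -> ((mtime_ns, size), names) and is updated in place.
    read_variable_names is a caller-supplied function that opens one file
    and returns its variable names; it is only called for files that are
    new to the cache or whose mtime/size has changed.
    """
    union = set()
    for path in paths:
        stat = os.stat(path)
        key = (stat.st_mtime_ns, stat.st_size)
        cached = cache.get(path)
        if cached is None or cached[0] != key:
            cached = (key, frozenset(read_variable_names(path)))
            cache[path] = cached
        union |= cached[1]
    return union
```

In practice the reader could open the file with netCDF4 and list the keys of `Dataset.variables`, and the cache could be persisted per deployment so that only newly arrived profiles are opened on each run.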

kerfoot commented 1 year ago

Following up on today's discussion....

This issue will be closed once we are able to address/close issues:

#277
#278
#279

ioos / glider-dac

User question: Why are detailed QC flags missing from certain datasets? #274
