Closed kerfoot closed 1 year ago
1) Yes
2) The Docker logs for glider_qartod show this info.
3) https://github.com/ioos/glider-dac/blob/master/data/qc_config.yml
4) Currently gross range, flat line, rate of change, spike. Aggregate flag too if that is counted.
5) Do any geophysical variables have linked ancillary variables with standard names ending in quality_flag or status_flag?
6) Incomplete docs
7) Currently, there is a user.qc_run Linux xattr. However, it tells whether QC has been run by us or detected in the file.
8) TBD, but I don't think aggregate flags are run on a variable if QC vars are detected. The reason is that some institutions may have their own criteria for determining the rollup/aggregate flag, aside from taking the highest level of failure for each flag position within the array.
9) It looks like it has since been QC'd. Jobs are run on a queue, so QC is not always run on time.
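The rollup rule mentioned in answer 8 (taking the highest level of failure at each flag position) can be sketched as follows. This is only an illustration, not the DAC's or ioos_qc's implementation: the function name rollup_flags and the severity ordering are assumptions, using the QARTOD flag values 1 (good), 2 (not evaluated), 3 (suspect), 4 (fail) and 9 (missing).

```python
# Assumed severity ranking: the flag with the higher rank wins at each position.
# QARTOD flag values: 1=good, 2=not evaluated, 3=suspect, 4=fail, 9=missing.
SEVERITY = {9: 0, 2: 1, 1: 2, 3: 3, 4: 4}


def rollup_flags(*flag_arrays):
    """Return one aggregate flag per position across several test results."""
    if not flag_arrays:
        return []
    length = len(flag_arrays[0])
    if any(len(flags) != length for flags in flag_arrays):
        raise ValueError("all flag arrays must have the same length")
    # zip(*...) walks the arrays position by position.
    return [max(column, key=SEVERITY.__getitem__) for column in zip(*flag_arrays)]


# Hypothetical per-test results for five observations.
gross_range = [1, 1, 4, 1, 9]
spike = [1, 3, 1, 2, 9]
flat_line = [1, 1, 1, 1, 9]
print(rollup_flags(gross_range, spike, flat_line))  # [1, 3, 4, 1, 9]
```

A provider with different rollup criteria would effectively be using a different SEVERITY table, which is one reason not to recompute the aggregate when provider QC variables are already present.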
Regarding answers to ioos/ioosngdac#1, ioos/ioosngdac#4, ioos/ioosngdac#5, ioos/ioosngdac#8: There are no QC flags on this real-time dataset (https://gliders.ioos.us/erddap/tabledap/electa-20230523T1947.html). There are also no user-supplied QC variables in the submitted NetCDF files in /data/submission/rutgers/electa-20230523T1947, and no geophysical variables have an ancillary_variables attribute containing quality_flag or status_flag. The dataset XML appears to have added some _qc variables (e.g., density_qc, temperature_qc), but the arrays are all _FillValue and the variables have no standard_names.
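The check described above (does any variable point, via ancillary_variables, at a variable whose standard_name ends in quality_flag or status_flag?) could look roughly like this. It is a sketch under assumptions: the function name find_provider_qc is hypothetical, and the variables are represented as a plain dict of attribute dicts rather than read from a real file with a NetCDF library.

```python
QC_SUFFIXES = ("quality_flag", "status_flag")


def find_provider_qc(variables):
    """Map each variable to its ancillary variables that look like QC flags.

    `variables` is a plain dict of {variable name: attribute dict}; this is a
    stand-in for attributes that would normally be read from a NetCDF file.
    """
    found = {}
    for name, attrs in variables.items():
        # ancillary_variables is a blank-separated list of variable names.
        for anc in attrs.get("ancillary_variables", "").split():
            std_name = variables.get(anc, {}).get("standard_name", "")
            if std_name.endswith(QC_SUFFIXES):
                found.setdefault(name, []).append(anc)
    return found


# Hypothetical attributes: temperature has a usable QC variable, while
# density_qc has no standard_name and is therefore not counted.
example = {
    "temperature": {"ancillary_variables": "temperature_qc"},
    "temperature_qc": {"standard_name": "sea_water_temperature status_flag"},
    "density": {"ancillary_variables": "density_qc"},
    "density_qc": {},
}
print(find_provider_qc(example))  # {'temperature': ['temperature_qc']}
```

An empty result corresponds to the situation reported for electa-20230523T1947: no provider-supplied QC was detected.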
Added documentation and a proposed process for finding files that need to be QC'd:
https://github.com/ioos/ioosngdac/wiki/Internal-DAC-Administration-Space#proposed-qc-process
Wrote a shell script to create the list of data provider submitted NetCDF files that need to be QC'd. The script can be found in:
/home/glider/qc/bin/build_deployment_qc_queue.sh
The script searches all active real-time data sets in:
/data/data/priv_erddap
and creates lists of the files that need DAC-supplied QC applied to them. These list files are located in:
/home/glider/qc/queue
The script is run as user glider.
In testing, the script is currently monitoring 34 active real-time data sets. These data sets are processed (the file queue lists created) in under 60 seconds.
These files can be used as inputs to the ioos_qc processing pipeline. Depending on the performance of ioos_qc, this should allow us to significantly increase how often QC is applied.
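For readers without access to the shell script, the general idea of building a per-deployment queue file can be sketched in Python. This is not a port of build_deployment_qc_queue.sh, whose logic is not shown here: the function names, the one-list-per-deployment file naming, and the use of the user.qc_run xattr as the "already QC'd" test are all assumptions made for illustration.

```python
import os
from pathlib import Path

# xattr name mentioned earlier in the thread; whether the real script
# consults it is an assumption.
QC_XATTR = "user.qc_run"


def has_qc_marker(path):
    """Assumed check: True if the file carries the user.qc_run xattr."""
    try:
        os.getxattr(path, QC_XATTR)  # Linux-only call
        return True
    except (OSError, AttributeError):
        # Attribute not set, xattrs unsupported, or not running on Linux:
        # treat the file as not yet QC'd.
        return False


def build_qc_queue(deployment_dir, queue_dir, already_qcd=has_qc_marker):
    """Write a list of NetCDF files under deployment_dir that still need QC.

    Returns the path of the queue file, named after the deployment directory.
    """
    deployment_dir = Path(deployment_dir)
    queue_dir = Path(queue_dir)
    queue_dir.mkdir(parents=True, exist_ok=True)
    pending = sorted(
        str(p) for p in deployment_dir.rglob("*.nc") if not already_qcd(p)
    )
    queue_file = queue_dir / f"{deployment_dir.name}.txt"
    queue_file.write_text("".join(f"{p}\n" for p in pending))
    return queue_file
```

The already_qcd parameter is there so the marker check can be swapped out, for example for a test that inspects the file contents instead of an xattr.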
Additional documentation is available here
Closed as OBE. Will be refiled as a new, more focused issue.
Questions: