Open jcermauwedu opened 7 months ago
I'd be interested in contributing to this alongside #34
I'm a maintainer on IOOS Compliance Checker and would be happy to help with this.
Perhaps one focused sub-topic could be the NOS OFSs, which, as a rule, are not CF compliant.
UGRID has recently been added to CF, as of 1.11.
I don't think the compliance checker(s) have kept up. There are a couple out there for UGRID, but I'm not sure of the status:
Thank you for all the feedback. I will continue to iterate on this proposal as additional feedback rolls in.
@jcermauwedu I'd like to experiment with OG standards as a compliance-checker plugin to check its feasibility during the code sprint. If we can leverage the existing cc-glider-plugin, that may be helpful; if not, maybe we can cross this idea out and move to the next one.
I think this should be fairly straightforward to clone the glider plugin to create a cc-og-plugin.
@callumrollo @ocefpaf As was mentioned today, there are three areas of interest: (1) OG plugin, (2) CF, (3) improving community engagement. We can bootstrap the OG plugin with one or more tests and prepare it for the eventual release of information later in June 2024. There is plenty to do.
Thank you for taking the time to propose this topic! From the Code Sprint topic survey, this has garnered a lot of interest.
Following the contributing guidelines on selecting a code sprint topic I have assigned this topic to @jcermauwedu . Unless indicated otherwise, the assignee will be responsible for identifying a plan for the code sprint topic, establishing a team, and taking the lead on executing said plan. The first action for the lead is to:
@MathewBiddle The code of conduct link on Contributing: Ground Rules gives a 404.
Webpage https://ioos.github.io/ioos-code-sprint/2024/topics/02-compliance-checker-topics.html
Thanks for the heads up on the Code of Conduct. We are discussing what an organization wide one should be in this issue https://github.com/ioos/.github/issues/10
I think this should be fairly straightforward to clone the glider plugin to create a cc-og-plugin.
I used some boilerplate from the glider and ugrid plugins to create the OG plugin. @ocefpaf : Would you create a new IOOS repo with the Apache 2 license? cc-plugin-og? I will copy the boilerplate code over to it. REF: https://github.com/uw-farlab/cc-plugin-og
The basic operation seems functional. It just needs to be populated with proper content.
$ compliance-checker -l
IOOS compliance checker available checker suites:
- OG:1.0
- UGRID:2.0
- acdd:1.1
- acdd:1.3
- cf:1.6
...
$ compliance-checker -t OG -D
====
OG
====
- check_basic_requirements
Check basic OG stated conventions.
* Format follows the CF 1.8 convention.
* Format follows the ACDD 1.3 convention.
* Variables are identified in capital letters.
* Attributes are identified in lower case.
$ pytest
=========================================================== test session starts ============================================================
platform linux -- Python 3.11.9, pytest-8.2.0, pluggy-1.5.0
rootdir: /home/portal/src/cc-plugin-og
configfile: pyproject.toml
plugins: flake8-1.1.1, requests-mock-1.12.1, time-machine-2.14.1
collected 1 item
tests/test_basicchecks.py . [100%]
============================================================ 1 passed in 0.16s =============================================================
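For anyone who wants to see what the two naming rules in the checker description amount to, here is a minimal sketch in plain Python. It is not the code in cc-plugin-og. It works on plain lists and dicts rather than a netCDF dataset, and the function name, the `exempt` parameter, and the decision to skip underscore-prefixed attributes such as `_FillValue` are my assumptions, not something the OG document states.

```python
def check_names(global_attrs, variables, exempt=("Conventions",)):
    """Return messages for names that break the two OG naming rules.

    global_attrs: iterable of global attribute names
    variables:    dict mapping variable name -> iterable of attribute names
    exempt:       attribute names allowed to keep capitals (assumption: CF
                  itself spells the global attribute "Conventions" this way)
    """
    messages = []
    for attr in global_attrs:
        if attr in exempt or attr.startswith("_"):
            continue
        if attr != attr.lower():
            messages.append(
                f"Global attribute {attr} should be lowercase: {attr.lower()}"
            )
    for var_name, attrs in variables.items():
        if var_name != var_name.upper():
            messages.append(
                f"Variable {var_name} should be uppercase: {var_name.upper()}"
            )
        for attr in attrs:
            # Assumption: netCDF-reserved names such as _FillValue are skipped.
            if attr in exempt or attr.startswith("_"):
                continue
            if attr != attr.lower():
                messages.append(
                    f"Variable {var_name} attribute {attr} "
                    f"should be lowercase: {attr.lower()}"
                )
    return messages


# Hypothetical metadata, loosely modelled on a glider file.
EXAMPLE_GLOBALS = ["title", "Metadata_Conventions", "Conventions"]
EXAMPLE_VARIABLES = {
    "TEMP": ["units", "URI", "_FillValue"],
    "time_gps": ["units"],
}

for message in check_names(EXAMPLE_GLOBALS, EXAMPLE_VARIABLES):
    print(message)
```

In a real plugin the same logic would presumably sit inside a `check_*` method and report through the checker's result objects; the sketch only shows the rule itself.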
@ocefpaf : Would you create a new IOOS repo with the Apache 2 license? cc-plugin-og? I will copy the boilerplate code over to it. REF: https://github.com/uw-farlab/cc-plugin-og
I don't have admin privileges to create repos. While I do believe that we should move it to IOOS at some point, it is nice to keep it under your own account, where you (we?) have more control. When the project is reasonably mature we can move it to IOOS. What do you think @MathewBiddle?
I agree with @ocefpaf. Once ready, feel free to submit a "New IOOS Repository Request" using the issue form linked at https://github.com/ioos/governance/issues/new/choose
Perhaps one focused sub-topic could be the NOS OFSs, which, as a rule, are not CF compliant.
- Run them through the compliance checker(s)
- And by hand
- Document what ways they are not compliant, and what needs to be done to bring them into compliance.
@ChrisBarker-NOAA @dpsnowden Is there a URL to source some of these datasets for checking? Where should the feedback go?
@jcermauwedu: Yes please!
The OFSs are all served up via TDS servers and also on AWS.
The AWS ones are here:
https://noaa-nos-ofs-pds.s3.amazonaws.com/index.html
It would be nice to have a complete list, maybe a utility to download a set, or ...
Where might that go?
In the compliance checker repo?
Maybe a new repo specifically for OFS compliance?
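As a starting point for the "utility to download a set" idea, here is a rough sketch that only builds URLs and does not download anything. The path layout is inferred from the single SFBOFS file URL quoted in this thread, so treat it as an assumption: other OFSs may name their files differently, and I am reading `n006` as a nowcast hour and `t03z` as the model cycle. The function names are hypothetical.

```python
from datetime import date

# Bucket root as given in this thread.
BUCKET = "https://noaa-nos-ofs-pds.s3.amazonaws.com"


def nowcast_url(ofs, day, cycle, hour):
    """Build the URL of one 'fields' nowcast file.

    ofs:   lowercase OFS name, e.g. "sfbofs"
    day:   datetime.date of the model run
    cycle: model cycle hour (assumed meaning of the tNNz part)
    hour:  nowcast hour (assumed meaning of the nNNN part)
    """
    name = f"nos.{ofs}.fields.n{hour:03d}.{day:%Y%m%d}.t{cycle:02d}z.nc"
    return f"{BUCKET}/{ofs}/netcdf/{day:%Y%m}/{name}"


def nowcast_urls(ofs, day, cycle, hours):
    """Build URLs for several nowcast hours of the same cycle."""
    return [nowcast_url(ofs, day, cycle, hour) for hour in hours]


for url in nowcast_urls("sfbofs", date(2024, 5, 22), 3, [1, 6]):
    print(url)
```

Whether a generated URL actually exists still has to be checked against the bucket listing, which is why a curated list of known-good files per OFS might be the more useful deliverable.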
Quite a hunting expedition to find an unstructured grid example. There is a ton of data out there. Finally located an example.
$ wget https://noaa-nos-ofs-pds.s3.amazonaws.com/sfbofs/netcdf/202405/nos.sfbofs.fields.n006.20240522.t03z.nc
$ ugrid-checker nos.sfbofs.fields.n006.20240522.t03z.nc
UGRID conformance checks complete.
List of checker messages :
*** FAIL R502 : Mesh data variable "u" has mesh="fvcom_mesh", which is not a variable in the dataset.
*** FAIL R509 : Mesh data variable "u" has dimensions ('time', 'siglay', 'nele'), of which 0 are mesh dimensions, instead of 1.
*** FAIL R502 : Mesh data variable "v" has mesh="fvcom_mesh", which is not a variable in the dataset.
*** FAIL R509 : Mesh data variable "v" has dimensions ('time', 'siglay', 'nele'), of which 0 are mesh dimensions, instead of 1.
*** FAIL R502 : Mesh data variable "tauc" has mesh="fvcom_mesh", which is not a variable in the dataset.
*** FAIL R509 : Mesh data variable "tauc" has dimensions ('time', 'nele'), of which 0 are mesh dimensions, instead of 1.
*** FAIL R502 : Mesh data variable "temp" has mesh="fvcom_mesh", which is not a variable in the dataset.
*** FAIL R509 : Mesh data variable "temp" has dimensions ('time', 'siglay', 'node'), of which 0 are mesh dimensions, instead of 1.
*** FAIL R502 : Mesh data variable "salinity" has mesh="fvcom_mesh", which is not a variable in the dataset.
*** FAIL R509 : Mesh data variable "salinity" has dimensions ('time', 'siglay', 'node'), of which 0 are mesh dimensions, instead of 1.
*** FAIL R502 : Mesh data variable "short_wave" has mesh="fvcom_mesh", which is not a variable in the dataset.
*** FAIL R509 : Mesh data variable "short_wave" has dimensions ('time', 'node'), of which 0 are mesh dimensions, instead of 1.
*** FAIL R502 : Mesh data variable "net_heat_flux" has mesh="fvcom_mesh", which is not a variable in the dataset.
*** FAIL R509 : Mesh data variable "net_heat_flux" has dimensions ('time', 'node'), of which 0 are mesh dimensions, instead of 1.
*** FAIL R502 : Mesh data variable "uwind_speed" has mesh="fvcom_mesh", which is not a variable in the dataset.
*** FAIL R509 : Mesh data variable "uwind_speed" has dimensions ('time', 'nele'), of which 0 are mesh dimensions, instead of 1.
*** FAIL R502 : Mesh data variable "vwind_speed" has mesh="fvcom_mesh", which is not a variable in the dataset.
*** FAIL R509 : Mesh data variable "vwind_speed" has dimensions ('time', 'nele'), of which 0 are mesh dimensions, instead of 1.
*** FAIL R502 : Mesh data variable "wet_nodes" has mesh="fvcom_mesh", which is not a variable in the dataset.
*** FAIL R509 : Mesh data variable "wet_nodes" has dimensions ('time', 'node'), of which 0 are mesh dimensions, instead of 1.
*** FAIL R502 : Mesh data variable "wet_cells" has mesh="fvcom_mesh", which is not a variable in the dataset.
*** FAIL R509 : Mesh data variable "wet_cells" has dimensions ('time', 'nele'), of which 0 are mesh dimensions, instead of 1.
*** FAIL R502 : Mesh data variable "wet_nodes_prev_int" has mesh="fvcom_mesh", which is not a variable in the dataset.
*** FAIL R509 : Mesh data variable "wet_nodes_prev_int" has dimensions ('time', 'node'), of which 0 are mesh dimensions, instead of 1.
*** FAIL R502 : Mesh data variable "wet_cells_prev_int" has mesh="fvcom_mesh", which is not a variable in the dataset.
*** FAIL R509 : Mesh data variable "wet_cells_prev_int" has dimensions ('time', 'nele'), of which 0 are mesh dimensions, instead of 1.
... WARN A903 : dataset has Conventions="CF-1.0", which does not contain a UGRID convention statement of the form "UGRID-<major>.<minor>".
Total of 27 problems logged :
26 Rxxx requirement failures
1 Axxx advisory recommendation warnings
Done.
Error codes and conformance documentation for the ugrid-checks code: https://ugrid-conventions.github.io/ugrid-conventions/conformance/
REF: https://github.com/pp-mo/ugrid-checks
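The 26 requirement failures above are really two codes (R502 and R509) repeated for 13 variables. If we end up documenting non-compliance across many OFS files, a small summary step may help. The sketch below parses the text report as printed above with a regular expression; it does not use any ugrid-checks API, and the function name and the `SAMPLE` excerpt are mine.

```python
import re
from collections import Counter

# Matches report lines such as: *** FAIL R502 : ...   or   ... WARN A903 : ...
REPORT_LINE = re.compile(
    r"^(?:\*\*\*|\.\.\.)\s+(FAIL|WARN)\s+([RA]\d{3})\s*:\s*(.*)$"
)
VARIABLE = re.compile(r'Mesh data variable "([^"]+)"')


def summarize(report_text):
    """Return (count per code, variable names per code) for a text report."""
    counts = Counter()
    variables = {}
    for raw in report_text.splitlines():
        match = REPORT_LINE.match(raw.strip())
        if not match:
            continue  # headers, totals, and "Done." are ignored
        _level, code, message = match.groups()
        counts[code] += 1
        named = VARIABLE.search(message)
        if named:
            variables.setdefault(code, []).append(named.group(1))
    return counts, variables


# Short excerpt of the report above, used only to exercise the parser.
SAMPLE = """UGRID conformance checks complete.
*** FAIL R502 : Mesh data variable "u" has mesh="fvcom_mesh", which is not a variable in the dataset.
*** FAIL R509 : Mesh data variable "u" has dimensions ('time', 'siglay', 'nele'), of which 0 are mesh dimensions, instead of 1.
*** FAIL R502 : Mesh data variable "temp" has mesh="fvcom_mesh", which is not a variable in the dataset.
... WARN A903 : dataset has Conventions="CF-1.0", which does not contain a UGRID convention statement.
Total of 27 problems logged :
"""

counts, variables = summarize(SAMPLE)
for code in sorted(counts):
    print(code, counts[code], variables.get(code, []))
```

Grouping by code makes it clearer that fixing one root cause (the missing `fvcom_mesh` variable) would likely clear most of the list.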
Compliance Checker UGRID:2.0 response:
$ compliance-checker -t UGRID:2.0 nos.sfbofs.fields.n006.20240522.t03z.nc
Running Compliance Checker on the datasets from: ['nos.sfbofs.fields.n006.20240522.t03z.nc']
--------------------------------------------------------------------------------
IOOS Compliance Checker Report
Version 5.1.2.dev30+gf543e4f
Report generated 2024-05-22T21:25:29Z
UGRID:2.0
https://github.com/ioos/cc-plugin-ugrid
--------------------------------------------------------------------------------
Corrective Actions
nos.sfbofs.fields.n006.20240522.t03z.nc has 1 potential issue
Highly Recommended
--------------------------------------------------------------------------------
Run UGRID checks if mesh variables are present in the data
* No mesh variables are detected in the data; all checks fail.
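Both tools seem to be tripping over the same thing: data variables point at `fvcom_mesh`, but no such variable exists in the file. Here is a minimal sketch of that lookup on plain dicts. It relies on the UGRID rule that a mesh topology variable carries `cf_role = "mesh_topology"`; I have not confirmed how either checker implements its detection, and the function names and example metadata are hypothetical.

```python
def find_mesh_topologies(variables):
    """Names of variables declared as UGRID mesh topologies."""
    return [
        name
        for name, attrs in variables.items()
        if attrs.get("cf_role") == "mesh_topology"
    ]


def dangling_mesh_refs(variables):
    """Map data variable -> referenced mesh name that is not in the dataset."""
    return {
        name: attrs["mesh"]
        for name, attrs in variables.items()
        if "mesh" in attrs and attrs["mesh"] not in variables
    }


# Hypothetical metadata shaped like the SFBOFS file: data variables name a
# mesh, but the mesh topology variable itself is absent.
BROKEN = {
    "u": {"mesh": "fvcom_mesh", "location": "face"},
    "temp": {"mesh": "fvcom_mesh", "location": "node"},
    "time": {"units": "days since 1858-11-17 00:00:00"},
}

# The same metadata with a topology variable added.
FIXED = dict(BROKEN, fvcom_mesh={"cf_role": "mesh_topology"})

print(find_mesh_topologies(BROKEN), dangling_mesh_refs(BROKEN))
print(find_mesh_topologies(FIXED), dangling_mesh_refs(FIXED))
```

For a real file, the `variables` dict could be built with netCDF4 from each variable's `ncattrs()` and `getncattr()`.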
and lots of problems with it ;-)
If you want some smaller examples, you can use the OFS Subsetter:
SFBOFS and NGOFS2 are ugrids (and the Great Lakes ones, I think)
Not a good UI, but it works.
It looks like for those, the mesh/grid is not contained within the netCDF file. Is it stored externally in the "OFS_Grid_Datum" directory? That would explain why the packages are not detecting a mesh variable.
$ head -5 sfbofs.2dm
MESH2D
MESHNAME SFBOFS
E3T 1 93 1 2 1
E3T 2 93 92 1 1
E3T 3 3 93 2 1
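If the connectivity really does live only in the .2dm file, it could be read back and written into the netCDF file as a UGRID face-node connectivity. A minimal sketch of the reading half follows, assuming the usual SMS 2DM layout for triangle cards (`E3T`, element id, three node ids, material id). The function name is hypothetical, and other card types (nodes, quads) are ignored.

```python
def read_2dm_triangles(lines):
    """Return triangle connectivity as (n1, n2, n3) tuples, 1-based as in the file."""
    faces = []
    for line in lines:
        parts = line.split()
        # Assumed card layout: E3T <element id> <n1> <n2> <n3> <material id>
        if len(parts) >= 5 and parts[0] == "E3T":
            faces.append(tuple(int(p) for p in parts[2:5]))
    return faces


# The first five lines of sfbofs.2dm as shown above.
HEAD = [
    "MESH2D",
    "MESHNAME SFBOFS",
    "E3T 1 93 1 2 1",
    "E3T 2 93 92 1 1",
    "E3T 3 3 93 2 1",
]

print(read_2dm_triangles(HEAD))
```

The node ids stay 1-based here; a UGRID connectivity variable can declare that with its `start_index` attribute rather than renumbering.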
There was an initial plan to call the CF tests automatically whenever the OG checker is selected. Why not just enable both on the command line:
$ compliance-checker -v -t og:1.0 -t cf:1.8 ~/src/upstream/data/sea076_20230906T0852_R.nc
Running Compliance Checker on the datasets from: ['/home/portal/src/upstream/data/sea076_20230906T0852_R.nc']
Using cached standard name table v49 from /home/portal/.local/share/compliance-checker/cf-standard-name-table-test-49.xml
--------------------------------------------------------------------------------
IOOS Compliance Checker Report
Version 5.1.2.dev30+gf543e4f
Report generated 2024-05-23T01:16:17Z
cf:1.8
http://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html
--------------------------------------------------------------------------------
All tests passed!
--------------------------------------------------------------------------------
IOOS Compliance Checker Report
Version 5.1.2.dev30+gf543e4f
Report generated 2024-05-23T01:16:17Z
og:1.0
https://oceangliderscommunity.github.io/OG-format-user-manual/OG_Format.html
--------------------------------------------------------------------------------
Corrective Actions
sea076_20230906T0852_R.nc has 2 potential issues
Mandatory
--------------------------------------------------------------------------------
Check that all attribute names are lowercase.
* Global attribute Metadata_Conventions should be lowercase: metadata_conventions
* Variable CNDC attribute URI should be lowercase: uri
* Variable DOXY attribute URI should be lowercase: uri
* Variable PRES attribute URI should be lowercase: uri
* Variable PSAL attribute URI should be lowercase: uri
* Variable TEMP attribute URI should be lowercase: uri
* Variable DENSITY attribute URI should be lowercase: uri
* Variable TIME attribute URI should be lowercase: uri
* Variable TIME_GPS attribute URI should be lowercase: uri
Missing mandatory variables.
* Variable PLATFORM_SERIAL_NUMBER is missing
The work on CF-1.9 is not fully complete, but it is complete enough for use in testing the OG 1.0 requirements at the CF-1.9 and CF-1.10 level. The class just needs to be enabled in the compliance checker. It was mentioned that once the CF-1.9 work is completed, we can also enable it for CF-1.10 as there is not much difference between the two versions. UGRID will be included for CF-1.11.
The above test is now testing four distinct rulesets for Ocean Gliders 1.0:
“It looks like for those, the mesh/grid is not contained within the netCDF file.”
Something is off - by “those” do you mean from the OFS subsetter? They have always been complete for me.
I’ll try to get you one.
Something else to trudge through Thursday. Instrument representation appears to differ between OG 1.0 and CF. The IOOS checker checks each variable for an instrument attribute that attaches it to an instrument, or to a package recording multiple variables (CTD). The OG 1.0 format does the reverse: a list of instruments is mapped to the list of variables, as shown below. This has caused an issue for the CF checker. The use of the instrument is first defined in the IOOS Glider DAC netCDF 2.0 format specification under the dimensionless container variable types.
global attribute:
string :instrument = "WET Labs {Sea-Bird WETLabs} ECO Puck Triplet BBFL2-IRB scattering fluorescence sensor",
"Oxygen optode 4831", "Unpumped CT sail CTD", "Seaglider M1 Glider data logger" ;
variables:
PARAMETER = "TEMP_CPU_CHLA", "FLUORESCENCE_CHLA", "DPHASE_DOXY",
"MOLAR_DOXY", "TPHASE_DOXY", "TEMP_DOXY", "OXYSAT_DOXY", "PRES",
"SIGMA_T", "CNDC", "TEMP", "PSAL", "LATITUDE_GPS", "LONGITUDE_GPS",
"GLIDER_ROLL", "LATITUDE", "GLIDER_PITCH", "LONGITUDE", "GLIDER_DEPTH" ;
PARAMETER_SENSOR =
"WET Labs {Sea-Bird WETLabs} ECO Puck Triplet BBFL2-IRB scattering fluorescence sensor",
"WET Labs {Sea-Bird WETLabs} ECO Puck Triplet BBFL2-IRB scattering fluorescence sensor",
"Oxygen optode 4831", "Oxygen optode 4831", "Oxygen optode 4831",
"Oxygen optode 4831", "Oxygen optode 4831", "Unpumped CT sail CTD",
"Unpumped CT sail CTD", "Unpumped CT sail CTD", "Unpumped CT sail CTD",
"Unpumped CT sail CTD", "Seaglider M1 Glider data logger",
"Seaglider M1 Glider data logger", "Seaglider M1 Glider data logger",
"Seaglider M1 Glider data logger", "Seaglider M1 Glider data logger",
"Seaglider M1 Glider data logger", "Seaglider M1 Glider data logger" ;
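To make the difference concrete, here is a minimal sketch that turns the two parallel OG lists into a per-variable lookup, which is closer to what a per-variable instrument check expects. It is plain Python on lists, not checker code. The function names are hypothetical, and the example uses a subset of the PARAMETER / PARAMETER_SENSOR pairs above to keep it short.

```python
from collections import defaultdict


def sensor_by_parameter(parameters, sensors):
    """Map each parameter name to its sensor; the lists must align one to one."""
    if len(parameters) != len(sensors):
        raise ValueError(
            f"{len(parameters)} parameters but {len(sensors)} sensors"
        )
    return dict(zip(parameters, sensors))


def parameters_by_sensor(parameters, sensors):
    """Group parameter names under the sensor that records them."""
    grouped = defaultdict(list)
    for parameter, sensor in zip(parameters, sensors):
        grouped[sensor].append(parameter)
    return dict(grouped)


ECO_PUCK = (
    "WET Labs {Sea-Bird WETLabs} ECO Puck Triplet BBFL2-IRB "
    "scattering fluorescence sensor"
)

# Subset of the lists above, kept in the same relative order.
PARAMETER = ["FLUORESCENCE_CHLA", "MOLAR_DOXY", "PRES", "TEMP", "PSAL", "GLIDER_DEPTH"]
PARAMETER_SENSOR = [
    ECO_PUCK,
    "Oxygen optode 4831",
    "Unpumped CT sail CTD",
    "Unpumped CT sail CTD",
    "Unpumped CT sail CTD",
    "Seaglider M1 Glider data logger",
]

lookup = sensor_by_parameter(PARAMETER, PARAMETER_SENSOR)
print(lookup["TEMP"])
print(parameters_by_sensor(PARAMETER, PARAMETER_SENSOR)["Unpumped CT sail CTD"])
```

The length check matters because the pairing is purely positional: if one list is edited without the other, every later variable silently shifts to the wrong sensor.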
Project Description
At the IOOS DMAC, it was generally agreed that there could be work put into the IOOS Compliance Checker. Additional IOOS toolsets may also receive beneficial updates with related work.
General topics:
Standards
Test Suite
Solicit participation in creating example datasets that apply the OG-1.0 data format, using the published document as it stands.
The example datasets will also need to be assessed for interoperability issues with CF, ACDD and NCEI.
A personal goal for this project is to continue work on acoustic type datasets with focus on the OG-1 data format and resolve or create additional issues for the IOOS Compliance Checker.
As time permits, examine impacts on glider processing packages with utilization of the OG-1.0 data format.
Community Engagement
GOAL: Increase community involvement in this and other IOOS toolsets.
NOTE: These topics could also serve as templates to other IOOS toolsets.
Features
Expected Outcomes
Community Engagement
Code and Documentation
pytest for IOOS Compliance Checker and pocean-core.
Standards
OG-1.0 Examples
Skills required
It would be useful to have working knowledge of python and knowledge of the netCDF4, xarray and pytest packages.
Expertise
Novice
Topic Lead(s)
@jcermauwedu
It would be great to have co-leaders to share experiences with related issues.
Relevant links
Discussion and Issues
https://github.com/OceanGlidersCommunity/OG-format-user-manual/discussions/165
https://github.com/OceanGlidersCommunity/OG-format-user-manual/discussions/92
https://github.com/OceanGlidersCommunity/OG-format-user-manual/pull/172
Software
https://docs.python.org/3/
https://docs.xarray.dev/en/stable/
https://unidata.github.io/netcdf4-python/
https://docs.pytest.org/en/8.0.x/
https://github.com/ioos/compliance-checker
https://github.com/pyoceans/pocean-core
https://github.com/ERDDAP/erddap
Example datasets and templates
https://www.ncei.noaa.gov/netcdf-templates
https://github.com/ERDDAP/erddapTest
http://test.opendap.org/