OSOceanAcoustics / echopype

Enabling interoperability and scalability in ocean sonar data analysis
https://echopype.readthedocs.io/
Apache License 2.0

Create SONAR-netCDF4 compliance checker based on a CDL file #1043

Closed emiliom closed 1 year ago

emiliom commented 1 year ago

This will be a relatively simple compliance checker, tuned to the echopype implementation of SONAR-netCDF4 v1. We'll use it to manually assess compliance for each instrument, or for any given converted file or echodata object. It will use a CDL file to test against the expected variables, dimensions and attributes, including names, variable data types, attribute values and variable dimensions.
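
As a rough illustration of the kind of comparison described above, here is a minimal sketch in plain Python. It assumes the expected (CDL-derived) and actual (converted file) metadata have already been reduced to dictionaries. The `VarSpec` structure, the `compare_group` function and the sample variables are hypothetical and are not part of echopype or of the eventual checker.

```python
from dataclasses import dataclass, field


@dataclass
class VarSpec:
    """Metadata for one variable (hypothetical structure for this sketch)."""
    dtype: str
    dims: tuple
    attrs: dict = field(default_factory=dict)


def compare_group(expected, actual):
    """Return a list of differences between two {name: VarSpec} mappings."""
    problems = []
    for name, exp in expected.items():
        act = actual.get(name)
        if act is None:
            problems.append(f"{name}: missing variable")
            continue
        if act.dtype != exp.dtype:
            problems.append(f"{name}: dtype {act.dtype}, expected {exp.dtype}")
        if act.dims != exp.dims:
            problems.append(f"{name}: dims {act.dims}, expected {exp.dims}")
        for key, value in exp.attrs.items():
            if key not in act.attrs:
                problems.append(f"{name}: missing attribute {key}")
            elif act.attrs[key] != value:
                problems.append(
                    f"{name}: attribute {key}={act.attrs[key]!r}, expected {value!r}"
                )
    # Variables present in the file but absent from the reference CDL
    for name in sorted(actual.keys() - expected.keys()):
        problems.append(f"{name}: unexpected variable")
    return problems


# Illustrative (made-up) metadata for a single group
expected = {
    "latitude": VarSpec("float64", ("time1",), {"units": "degrees_north"}),
    "longitude": VarSpec("float64", ("time1",), {"units": "degrees_east"}),
}
actual = {
    "latitude": VarSpec("float32", ("time1",), {"units": "degrees_north"}),
    "extra_var": VarSpec("int32", (), {}),
}

problems = compare_group(expected, actual)
for line in problems:
    print(line)
```

In this made-up case the report flags a data type mismatch, a missing variable and an unexpected variable. Returning a list of messages rather than raising on the first mismatch suits a manual assessment, since all issues in a group are visible at once.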

I'll create the CDL as follows:

  1. Identify an EK80 raw file with real and imaginary parts. Convert it to a netCDF file.
  2. Use ncdump -h to generate a CDL of the metadata ("header"). Actually, I can skip running ncdump by hand and use nc4.Dataset.tocdl(), like this: nc4_ds.tocdl(coordvars=False, data=False, outfile='exported.cdl')
  3. Manually edit the CDL to include anything from the v1 convention that is missing or is incorrectly implemented.
  4. Add bare-minimum artificial data so that the CDL can be converted back into a netCDF file (this can be tested using ncgen).

The compliance checker will read the CDL into an xarray dataset as follows (see here for the xarray part):

import netCDF4 as nc4
import xarray as xr

# Build a netCDF4 dataset from the CDL file
nc4_ds = nc4.Dataset.fromcdl('exported.cdl', ncfilename=None, mode='a', format='NETCDF4')
# Open one group as an xarray dataset.
# If the group parameter is not used, the top-level (root) group is returned.
xr_ds = xr.open_dataset(xr.backends.NetCDF4DataStore(nc4_ds, group='Platform'))

emiliom commented 1 year ago

Here are some specific tests I will build in.

For each netcdf group:

There are important, additional twists, though:

The tests to be implemented will then need to take into account the obligation and echopype_mods attributes.
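
One way the obligation attribute could feed into the tests is sketched below. This assumes the obligation values follow the SONAR-netCDF4 codes M (mandatory), MA (mandatory if applicable) and O (optional), and that a missing variable counts as an error only when it is mandatory. The function names, severity labels and variable names are hypothetical, and the handling of echopype_mods is deliberately left out.

```python
def missing_variable_severity(obligation):
    """Map an assumed obligation code to a severity for a missing variable."""
    severities = {
        "M": "error",     # mandatory: absence is a compliance failure
        "MA": "warning",  # mandatory if applicable: needs manual review
        "O": "ok",        # optional: absence is acceptable
    }
    if obligation not in severities:
        raise ValueError(f"unknown obligation code: {obligation!r}")
    return severities[obligation]


def check_missing(expected_obligations, present_names):
    """Return {name: severity} for each expected variable that is absent."""
    report = {}
    for name, obligation in expected_obligations.items():
        if name not in present_names:
            report[name] = missing_variable_severity(obligation)
    return report


# Placeholder variable names, not taken from the convention
expected_obligations = {"var_a": "M", "var_b": "MA", "var_c": "O", "var_d": "M"}
present_names = {"var_d"}

report = check_missing(expected_obligations, present_names)
print(report)
```

Treating MA as a warning rather than an error is a design choice for this sketch: whether such a variable is applicable usually depends on the instrument, so it is left for manual review.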

emiliom commented 1 year ago

The echopype_checker package (https://github.com/OSOceanAcoustics/echopype-checker) still needs polish, but its GitHub repo now includes decent instructions and notebooks illustrating its usage.

So, I'm closing this issue. Further package development will be tracked in its repo.