hypertidy / ncmeta

Tidy NetCDF metadata
https://hypertidy.github.io/ncmeta/

Handle vectors of different lengths from RNetCDF::var.inq.nc #44

Status: Closed (mjwoods closed this 2 years ago)

mjwoods commented 2 years ago

Fixes https://github.com/hypertidy/ncmeta/issues/42

RNetCDF::var.inq.nc returns additional elements when the file format is netcdf4. Some of these elements are vectors, with lengths that may differ from the number of dimensions in a variable. For example, chunksizes either has one element per dimension, or it is NULL for contiguous storage. Also, recent NetCDF library versions provide a filter API that can apply multiple filters to each variable, and descriptions of the filters are returned in vectors filter_id and filter_params, whose lengths are unrelated to the variable's dimensions.

To handle vectors of different lengths in the output from nc_var, vector elements are stored as list columns within the tibble produced by nc_var.
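The list-column idea can be sketched in base R as follows. This is only an illustration, not the code in this PR: the list `inq` is a hand-written stand-in for the output of var.inq.nc with invented values, and `vector_fields` and `var_row` are hypothetical names.

```r
# Mock of what RNetCDF::var.inq.nc might return for one netcdf4 variable.
# All values are invented; only the element names come from the text above.
inq <- list(
  id = 0L, name = "temp", ndims = 2L, dimids = c(0L, 1L),
  chunksizes = NULL,                 # NULL means contiguous storage
  filter_id = c(1L, 2L, 3L),         # three filters on a 2-D variable
  filter_params = list(5L, 4L, 3L)
)

# Elements that may be vectors of arbitrary length (an assumed set).
vector_fields <- c("dimids", "chunksizes", "filter_id", "filter_params")

var_row <- function(inq, vector_fields) {
  # Scalar elements become ordinary one-row columns.
  scalars <- inq[setdiff(names(inq), vector_fields)]
  row <- as.data.frame(scalars, stringsAsFactors = FALSE)
  for (f in vector_fields) {
    # Wrap each vector (or NULL) in a list so it occupies exactly one cell.
    row[[f]] <- list(inq[[f]])
  }
  row
}

row <- var_row(inq, vector_fields)
str(row)
```

Because every vector is wrapped in a list of length one, the row count stays at one per variable no matter how many dimensions or filters the variable has.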

Also, the tibble produced within nc_axes does not require all elements from RNetCDF::var.inq.nc, so selecting only those required avoids the problem of differing vector lengths.

mjwoods commented 2 years ago

Hi @mdsumner, the filter_id and filter_params items were added to the output of var.inq.nc back in August 2021. I didn't realise this would break ncmeta - sorry for that. The issue did not occur in my tests of ncmeta, which may not have included variables with filters. Would it be possible to add a test to ncmeta that uses a netcdf4 dataset and a variable written with the deflate and/or shuffle filters?
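A fixture for such a test might look roughly like the sketch below. It assumes RNetCDF version 2.0 or later, built against a NetCDF library with netcdf4 support. The helper name `make_filtered_fixture`, the dimension and variable names, and the chunk sizes are all hypothetical.

```r
# Write a small netcdf4 file containing one chunked variable that uses
# the deflate and shuffle filters. Names and sizes are illustrative only.
make_filtered_fixture <- function(path) {
  nc <- RNetCDF::create.nc(path, format = "netcdf4")
  on.exit(RNetCDF::close.nc(nc))
  RNetCDF::dim.def.nc(nc, "x", 4)
  RNetCDF::dim.def.nc(nc, "y", 3)
  RNetCDF::var.def.nc(nc, "temp", "NC_DOUBLE", c("x", "y"),
                      chunking = TRUE, chunksizes = c(4, 3),
                      deflate = 5, shuffle = TRUE)
  RNetCDF::var.put.nc(nc, "temp", matrix(as.numeric(1:12), nrow = 4))
  invisible(path)
}

# Only run the example when RNetCDF is installed.
if (requireNamespace("RNetCDF", quietly = TRUE)) {
  path <- make_filtered_fixture(tempfile(fileext = ".nc"))
  nc <- RNetCDF::open.nc(path)
  info <- RNetCDF::var.inq.nc(nc, "temp")
  RNetCDF::close.nc(nc)
  # Inspect the elements that are relevant to this issue.
  str(info[c("ndims", "chunksizes", "deflate", "shuffle")])
}
```

A test in ncmeta could then run nc_var against the file written here and check that it returns without error.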

mjwoods commented 2 years ago

Although my solution for nc_var appears to work, it relies on a hard-coded set of elements from var.inq.nc that can be vectors. In the future, I may need to add other vector elements as new features are added to the NetCDF library. Perhaps a more future-proof solution for ncmeta would explicitly select the items you want from var.inq.nc, so that any unrecognised items would be ignored by ncmeta.
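As a rough sketch of that explicit-selection idea (not code from ncmeta): the whitelist `wanted`, the helper `select_known`, and the mock list `inq` with its invented element `some_future_element` are all assumptions made for illustration.

```r
# Element names the caller chooses to keep (an assumed whitelist).
wanted <- c("id", "name", "type", "ndims", "natts")

select_known <- function(inq, wanted) {
  # intersect() keeps only names that are both wanted and present,
  # so new or missing elements never cause an error.
  inq[intersect(wanted, names(inq))]
}

# Mock var.inq.nc output, including an element that does not exist today.
inq <- list(id = 0L, name = "temp", type = "NC_DOUBLE", ndims = 2L,
            dimids = c(0L, 1L), natts = 1L,
            filter_id = c(1L, 2L), some_future_element = 1:5)

kept <- select_known(inq, wanted)
as.data.frame(kept, stringsAsFactors = FALSE)
```

Since only scalar elements are whitelisted here, the result is always a single row, and anything added to var.inq.nc later is silently dropped until it is added to the whitelist on purpose.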

mdsumner commented 2 years ago

hey thanks very much! no problem, I was just trying to be complete - keeping everything but didn't do it right, and just not able to pursue in detail atm.

I do want the extra information - but realize now it's better to exclude it until I can focus on it properly.

PR very much appreciated 👌