JuliaIO / JLD2.jl

HDF5-compatible file format in pure Julia

JLD2 for opening netCDF files #406

Closed: sethaxen closed this 8 months ago

sethaxen commented 2 years ago

In https://github.com/arviz-devs/InferenceObjects.jl/issues/3, @visr wrote:

Though there is https://github.com/JuliaIO/JLD2.jl which has quite a good subset of HDF5. It could be interesting to see how much is needed to use JLD2 to do (a subset of) netCDF 4 I/O.

Note that reading netCDF-4 files written by Python ArviZ/xarray already requires a much smaller subset than "any valid HDF5 file found out in the wild". And I would expect netCDF files written through JLD2 to be fine for xarray to read. But still, it would need testing. I spoke to one of the JLD2 devs on Slack a while back and I think they thought it was worth a shot for sure.

netCDF-4 files are HDF5 files but use only a subset of the HDF5 spec. Currently, opening netCDF files with jldopen fails with the error "... is not a JLD2 file". Would it be possible to make JLD2 open netCDF files?
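A minimal reproduction (a sketch; the exact error type and text may vary by version):

julia> using JLD2

julia> jldopen("simple.nc")  # any netCDF-4 file, e.g. the one generated below
ERROR: "simple.nc" is not a JLD2 file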

JonasIsensee commented 2 years ago

Hi @sethaxen,

This is definitely on the horizon. I've already done most of the hard work; check out #388. What's missing (for reading) is largely the interpretation of netCDF-specific metadata. Editing is not really possible, but writing new files could be done with proper metadata handling.

visr commented 2 years ago

That's great! I tried out that branch but haven't been able to load netCDF-4 data so far.

Here is a simple example with one 2D array and two dimensions, created with xarray.

import xarray as xr
import numpy as np

data = np.arange(1,13).reshape(3,4)
dim1 = [2,4,6]
dim2 = ["a", "b", "c", "d"]
da = xr.DataArray(data, name="mydata", coords={"dim1":dim1,"dim2":dim2})
da.to_netcdf("simple.nc")

Using ncdump shows us:

netcdf simple {
dimensions:
    dim1 = 3 ;
    dim2 = 4 ;
variables:
    int dim1(dim1) ;
    string dim2(dim2) ;
    int mydata(dim1, dim2) ;
data:

 dim1 = 2, 4, 6 ;

 dim2 = "a", "b", "c", "d" ;

 mydata =
  1, 2, 3, 4,
  5, 6, 7, 8,
  9, 10, 11, 12 ;
}

Using h5dump shows all the netCDF metadata:

HDF5 "simple.nc" {
GROUP "/" {
   ATTRIBUTE "_NCProperties" {
      DATATYPE  H5T_STRING {
         STRSIZE 34;
         STRPAD H5T_STR_NULLTERM;
         CSET H5T_CSET_ASCII;
         CTYPE H5T_C_S1;
      }
      DATASPACE  SCALAR
      DATA {
      (0): "version=2,netcdf=4.8.1,hdf5=1.12.1"
      }
   }
   DATASET "dim1" {
      DATATYPE  H5T_STD_I32LE
      DATASPACE  SIMPLE { ( 3 ) / ( 3 ) }
      DATA {
      (0): 2, 4, 6
      }
      ATTRIBUTE "CLASS" {
         DATATYPE  H5T_STRING {
            STRSIZE 16;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }
         DATASPACE  SCALAR
         DATA {
         (0): "DIMENSION_SCALE"
         }
      }
      ATTRIBUTE "NAME" {
         DATATYPE  H5T_STRING {
            STRSIZE 5;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }
         DATASPACE  SCALAR
         DATA {
         (0): "dim1"
         }
      }
      ATTRIBUTE "REFERENCE_LIST" {
         DATATYPE  H5T_COMPOUND {
            H5T_REFERENCE { H5T_STD_REF_OBJECT } "dataset";
            H5T_STD_I32LE "dimension";
         }
         DATASPACE  SIMPLE { ( 1 ) / ( 1 ) }
         DATA {
         (0): {
               ,
               0
            }
         }
      }
      ATTRIBUTE "_Netcdf4Coordinates" {
         DATATYPE  H5T_STD_I32LE
         DATASPACE  SIMPLE { ( 1 ) / ( 1 ) }
         DATA {
         (0): 0
         }
      }
      ATTRIBUTE "_Netcdf4Dimid" {
         DATATYPE  H5T_STD_I32LE
         DATASPACE  SCALAR
         DATA {
         (0): 0
         }
      }
   }
   DATASET "dim2" {
      DATATYPE  H5T_STRING {
         STRSIZE H5T_VARIABLE;
         STRPAD H5T_STR_NULLTERM;
         CSET H5T_CSET_UTF8;
         CTYPE H5T_C_S1;
      }
      DATASPACE  SIMPLE { ( 4 ) / ( 4 ) }
      DATA {
      (0): "a", "b", "c", "d"
      }
      ATTRIBUTE "CLASS" {
         DATATYPE  H5T_STRING {
            STRSIZE 16;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }
         DATASPACE  SCALAR
         DATA {
         (0): "DIMENSION_SCALE"
         }
      }
      ATTRIBUTE "NAME" {
         DATATYPE  H5T_STRING {
            STRSIZE 5;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }
         DATASPACE  SCALAR
         DATA {
         (0): "dim2"
         }
      }
      ATTRIBUTE "REFERENCE_LIST" {
         DATATYPE  H5T_COMPOUND {
            H5T_REFERENCE { H5T_STD_REF_OBJECT } "dataset";
            H5T_STD_I32LE "dimension";
         }
         DATASPACE  SIMPLE { ( 1 ) / ( 1 ) }
         DATA {
         (0): {
               ,
               1
            }
         }
      }
      ATTRIBUTE "_Netcdf4Coordinates" {
         DATATYPE  H5T_STD_I32LE
         DATASPACE  SIMPLE { ( 1 ) / ( 1 ) }
         DATA {
         (0): 1
         }
      }
      ATTRIBUTE "_Netcdf4Dimid" {
         DATATYPE  H5T_STD_I32LE
         DATASPACE  SCALAR
         DATA {
         (0): 1
         }
      }
   }
   DATASET "mydata" {
      DATATYPE  H5T_STD_I32LE
      DATASPACE  SIMPLE { ( 3, 4 ) / ( 3, 4 ) }
      DATA {
      (0,0): 1, 2, 3, 4,
      (1,0): 5, 6, 7, 8,
      (2,0): 9, 10, 11, 12
      }
      ATTRIBUTE "DIMENSION_LIST" {
         DATATYPE  H5T_VLEN { H5T_REFERENCE { H5T_STD_REF_OBJECT }}
         DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
         DATA {
         (0): (), ()
         }
      }
      ATTRIBUTE "_Netcdf4Coordinates" {
         DATATYPE  H5T_STD_I32LE
         DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
         DATA {
         (0): 0, 1
         }
      }
   }
}
}

For more files see also Example netCDF-4 files.

Some files don't give an error but show no datasets; "test_echam_spectral-deflated.nc" says the data is invalid; and the sample file I created above gave an UndefRefError on show, with seemingly no data loaded.

JonasIsensee commented 2 years ago

I'm willing to help with this (in particular the HDF5 side), but I'd need someone to take the lead here. What's needed is to look at the netCDF format spec and figure out what we need to do to get this working.

visr commented 2 years ago

Though I think this is cool and interesting, I can't really commit to leading such an effort at this point. Before this issue I'd never even looked at the h5dump of a netCDF-4 file. I understand that someone would need to read the format spec and try to implement the metadata rules for netCDF-4. If someone were to do this, I guess it should probably happen in a separate repo that depends on JLD2.jl.

I'm not sure what causes the read failures right now, but I guess the h5dump above uses some things that are not yet in the supported subset of #388, or that expose some issues in #388. Once those files can be read as HDF5, perhaps it's not too much effort to start with a simple subset of netCDF-4 files and begin implementing a netCDF API on top of them.

JonasIsensee commented 2 years ago

I'm not sure what causes the read failures right now, but I guess the h5dump above uses some things that are not yet in the supported subset of https://github.com/JuliaIO/JLD2.jl/pull/388, or that expose some issues in https://github.com/JuliaIO/JLD2.jl/pull/388. Once those files can be read as HDF5, perhaps it's not too much effort to start with a simple subset of netCDF-4 files and begin implementing a netCDF API on top of them.

I'll have a look at that bit.

JonasIsensee commented 2 years ago

Hi @visr,

With the latest commit to #388 it is possible to fully decode your "simple.nc" file. The datasets work, and the metadata can be retrieved using the function JLD2.load_attributes(f, "dim1").
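For example, against the simple.nc from above (a sketch; REPL output omitted):

julia> using JLD2

julia> f = jldopen("simple.nc");  # read-only; warns that the file was not written by JLD2

julia> A = f["mydata"];  # the dataset itself

julia> JLD2.load_attributes(f, "dim1")  # netCDF metadata as Symbol => value pairs

julia> close(f)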

The example files you referenced ("For more files see also Example netCDF-4 files.") don't appear to be valid HDF5, so I can't test them. (At least HDF5 debugging tools can't open them?)

Alexander-Barth commented 2 years ago

Thank you very much for your progress on this!

don't appear to be valid hdf5, so I can't test them. (At least hdf5 debugging tools can't open them ?)

Indeed, I just checked. They are "classic" netCDF-3 files (not based on HDF5):

$ ncdump -k ECMWF_ERA-40_subset.nc 
classic

I tried this PR on all files generated by the test suite of netCDF 4.8.1. This PR can already open 30 out of 44 files, which is quite outstanding! Here is a link to all 44 files in case it is useful:

https://dox.ulg.ac.be/index.php/s/YY58VbcmOOpctW0

These are the 14 files which cannot be opened:

 "./ncdump/ref_tst_compounds2.nc"
 "./ncdump/ref_nc_test_netcdf4_4_0.nc"
 "./ncdump/ref_tst_compounds3.nc"
 "./ncdump/ref_tst_compounds4.nc"
 "./h5_test/ref_tst_compounds.nc"
 "./nc_test4/ref_tst_interops4.nc"
 "./nc_test4/ref_tst_xplatform2_2.nc"
 "./nc_test4/ref_tst_xplatform2_1.nc"
 "./nc_test4/ref_tst_compounds.nc"
 "./nc_test4/ref_tst_dims.nc"
 "./dap4_test/nctestfiles/test_enum_array.nc"
 "./dap4_test/nctestfiles/test_opaque_array.nc"
 "./dap4_test/nctestfiles/test_one_var.nc"
 "./dap4_test/nctestfiles/test_vlen8.nc"

As far as I can tell, compound, opaque, and enum types are very rarely used in netCDF. But the NetCDF-4 API sometimes creates "internal" compound objects in the HDF5 file which are not visible from the netCDF API. This is an example of such a file (also a problematic one):

https://dox.ulg.ac.be/index.php/s/QDDBZnhz7G09ku5

This document explains a bit which part of the HDF5 API is used in NetCDF4:

https://docs.unidata.ucar.edu/netcdf-c/current/file_format_specifications.html

In any case, thank you so much for this PR and your work on HDF5!

visr commented 2 years ago

Awesome work indeed! For me too, most of the files I use in practice now load. For one of the files that didn't, I tried to get to the bottom of it by stripping it down more and more. In the end I had a tiny array, similar to simple.nc, that did not work when written through libnetcdf and libhdf5, but did work when written using only libhdf5. I did this using the engine option in xarray (engine="netcdf4" vs. engine="h5netcdf").

The ncdump of the file that didn't work looked like this:

netcdf d {
dimensions:
    x = 3 ;
    y = 4 ;
variables:
    double x(x) ;
        x:_FillValue = NaN ;
        x:standard_name = "longitude" ;
        x:long_name = "longitude" ;
        x:units = "degrees_east" ;
        x:axis = "X" ;
    double y(y) ;
        y:_FillValue = NaN ;
        y:standard_name = "latitude" ;
        y:long_name = "latitude" ;
        y:units = "degrees_north" ;
        y:axis = "Y" ;
    float TEMP(y, x) ;
        TEMP:_FillValue = NaNf ;
data:

 x = 3.56916666666667, 3.5775, 3.58583333333334 ;

 y = 51.775, 51.7666666666667, 51.7583333333333, 51.75 ;

 TEMP =
  7.248062, 7.248062, 7.248062,
  7.248062, 7.248062, 7.248062,
  7.248062, 7.248062, 7.248062,
  7.248062, 7.248062, 7.248062 ;
}

And the other one was identical except that TEMP was listed first. The h5dump shows a few more differences in the attributes; here are the two h5dumps in case you want to have a look:

d.txt d-h5netcdf.txt

JonasIsensee commented 2 years ago

About half of these:

 "./ncdump/ref_tst_compounds2.nc"
 "./ncdump/ref_nc_test_netcdf4_4_0.nc"
 "./ncdump/ref_tst_compounds3.nc"
 "./ncdump/ref_tst_compounds4.nc"
 "./h5_test/ref_tst_compounds.nc"
 "./nc_test4/ref_tst_interops4.nc"
 "./nc_test4/ref_tst_xplatform2_2.nc"
 "./nc_test4/ref_tst_xplatform2_1.nc"
 "./nc_test4/ref_tst_compounds.nc"
 "./nc_test4/ref_tst_dims.nc"
 "./dap4_test/nctestfiles/test_enum_array.nc"
 "./dap4_test/nctestfiles/test_opaque_array.nc"
 "./dap4_test/nctestfiles/test_one_var.nc"
 "./dap4_test/nctestfiles/test_vlen8.nc"

can now also be opened with the latest commits.

@visr: I don't intend to learn to use netCDF myself. Please test whether your file already happens to work now. If not, you are welcome to share the file or the code to generate it.

One of the main features still missing is not specific to netCDF: chunking. With chunking, an array is not stored contiguously but in multiple smaller chunks sliced along one or more dimensions, and not all chunks need to be present in the file.
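A minimal sketch of the bookkeeping this entails (just the index arithmetic for a fixed chunk edge length c, not JLD2's implementation):

# Element (i, j) of an array stored in c×c chunks lives in chunk
# (cld(i, c), cld(j, c)), at local offset (mod1(i, c), mod1(j, c)).
chunkindex(i, j; c=10) = (cld(i, c), cld(j, c))
localindex(i, j; c=10) = (mod1(i, c), mod1(j, c))

chunkindex(25, 3)  # (3, 1): third chunk along dim 1, first along dim 2
localindex(25, 3)  # (5, 3): position within that 10×10 chunk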

visr commented 2 years ago

Ah sorry, I assumed it would be easy to generate the files from the text files I attached to the issue. For netCDF there is ncgen, which does the inverse of ncdump. But I just tried with your latest commits, and they work now! The two other example files now also pass. This is really great work.
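For reference, the ncgen round trip, shelled out from Julia (assuming ncgen from a recent libnetcdf; -k nc4 requests a netCDF-4/HDF5 file):

# Rebuild a .nc file from its CDL text dump, then try it with JLD2
run(`ncgen -k nc4 -o d.nc d.cdl`)
using JLD2
f = jldopen("d.nc")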

visr commented 2 years ago

In case you're interested in chunking, it might be good to look at https://github.com/meggart/DiskArrays.jl. Zarr.jl and NetCDF.jl both use it for chunking.

There is also some older discussion in https://discourse.julialang.org/t/common-interface-for-chunked-arrays/26009, and a proposal for an interface, https://github.com/meggart/ChunkedArrayBase.jl. Though that's not in use right now.
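For flavor, the core of that interface (a rough sketch against DiskArrays.jl's documented API; MyChunkedArray is a made-up type, and an in-memory Array stands in for on-disk storage):

using DiskArrays

struct MyChunkedArray{T,N} <: DiskArrays.AbstractDiskArray{T,N}
    data::Array{T,N}        # stand-in for data that would live on disk
    chunks::NTuple{N,Int}   # chunk edge lengths per dimension
end
Base.size(a::MyChunkedArray) = size(a.data)
DiskArrays.haschunks(::MyChunkedArray) = DiskArrays.Chunked()
DiskArrays.eachchunk(a::MyChunkedArray) = DiskArrays.GridChunks(a, a.chunks)
# Read one rectangular block; DiskArrays derives getindex, iteration, etc.
function DiskArrays.readblock!(a::MyChunkedArray, aout, r::AbstractUnitRange...)
    aout .= view(a.data, r...)
end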

sethaxen commented 2 years ago

Here's another example multi-dataset netCDF4 file: tmp.zip. The latest commits in that PR seem to be able to load the names of the groups and the names of the arrays in each group, but not the arrays themselves:

julia> using JLD2

julia> f = JLD2.jldopen("tmp.nc")
┌ Warning: File likely not written by JLD2. Skipping header verification.
└ @ JLD2 ~/projects/JLD2.jl/src/file_header.jl:23
JLDFile /home/sethaxen/projects/JLD2.jl/tmp.nc (read-only)
 ├─📂 prior
 │  ├─🔢 chain
 │  ├─🔢 draw
 │  ├─🔢 zdim
 │  ├─🔢 xdim
 │  ├─🔢 z
 │  └─🔢 x
 ├─📂 posterior
 │  └─ ⋯ (6 more entries)
 └─📂 observed_data (2 entries)

julia> posterior = f["posterior"]
JLD2.Group
 ├─🔢 chain
 ├─🔢 draw
 ├─🔢 zdim
 ├─🔢 xdim
 ├─🔢 z
 └─🔢 x

julia> posterior["chain"]
ERROR: JLD2.UnsupportedFeatureException("")
Stacktrace:
 [1] load_dataset(f::JLD2.JLDFile{JLD2.MmapIO}, offset::JLD2.RelOffset)
   @ JLD2 ~/projects/JLD2.jl/src/datasets.jl:132
 [2] getindex(g::JLD2.Group{JLD2.JLDFile{JLD2.MmapIO}}, name::String)
   @ JLD2 ~/projects/JLD2.jl/src/groups.jl:109
 [3] top-level scope
   @ REPL[9]:1

Compare with the result when we use NCDatasets:

```julia
julia> using NCDatasets

julia> ds = NCDatasets.NCDataset("tmp.nc")
NCDataset: tmp.nc
Group: /

Groups
  NCDataset: tmp.nc
  Group: posterior
  Dimensions
     chain = 4
     draw = 100
     zdim = 3
     xdim = 5
  Variables
    chain   (4)
      Datatype:    Int64
      Dimensions:  chain
    draw   (100)
      Datatype:    Int64
      Dimensions:  draw
    zdim   (3)
      Datatype:    String
      Dimensions:  zdim
    z   (3 × 100 × 4)
      Datatype:    Float64
      Dimensions:  zdim × draw × chain
      Attributes:
       _FillValue = NaN
    xdim   (5)
      Datatype:    Int64
      Dimensions:  xdim
    x   (5 × 100 × 4)
      Datatype:    Float64
      Dimensions:  xdim × draw × chain
      Attributes:
       _FillValue = NaN
  Global attributes
    created_at     = 2022-08-08T17:02:44.89
    arviz_language = julia
    arviz_version  = 0.6.0

  NCDataset: tmp.nc
  Group: prior
  Dimensions
     chain = 2
     draw = 100
     zdim = 3
     xdim = 5
  Variables
    chain   (2)
      Datatype:    Int64
      Dimensions:  chain
    draw   (100)
      Datatype:    Int64
      Dimensions:  draw
    zdim   (3)
      Datatype:    String
      Dimensions:  zdim
    z   (3 × 100 × 2)
      Datatype:    Float64
      Dimensions:  zdim × draw × chain
      Attributes:
       _FillValue = NaN
    xdim   (5)
      Datatype:    Int64
      Dimensions:  xdim
    x   (5 × 100 × 2)
      Datatype:    Float64
      Dimensions:  xdim × draw × chain
      Attributes:
       _FillValue = NaN
  Global attributes
    created_at     = 2022-08-08T17:02:44.89
    arviz_language = julia
    arviz_version  = 0.6.0

  NCDataset: tmp.nc
  Group: observed_data
  Dimensions
     ydim = 10
  Variables
    ydim   (10)
      Datatype:    Int64
      Dimensions:  ydim
    y   (10)
      Datatype:    Float64
      Dimensions:  ydim
      Attributes:
       _FillValue = NaN
  Global attributes
    created_at     = 2022-08-08T17:02:44.967
    arviz_language = julia
    arviz_version  = 0.6.0
```

JonasIsensee commented 2 years ago

Here's another example multi-dataset netCDF4: tmp.zip.

In this case the issue is that this file uses two consecutive filter/compression steps (gzip deflate and data-element shuffling). This is not yet implemented, and it could be a bit harder to support than a few of the other missing features.

Message 3...
   Message ID (sequence number):                   0x000b `filter pipeline' (0)
   Dirty:                                          FALSE
   Message flags:                                  <C>
   Chunk number:                                   0
   Raw message data (offset, size) in chunk:       (78, 22) bytes
   Message Information:                           
      Number of filters:                           2/2
      Filter at position 0                        
         Filter identification:                    0x0002
         Filter name:                              NONE
         Flags:                                    0x0001
         Num CD values:                            1
            CD value 0                             8
      Filter at position 1                        
         Filter identification:                    0x0001
         Filter name:                              NONE
         Flags:                                    0x0001
         Num CD values:                            1
            CD value 0                             4
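
For illustration, the decode direction of such a pipeline (a rough sketch of the idea, not JLD2's code): reading reverses the write order, so a chunk is first inflated and then un-shuffled, where the shuffle filter had grouped together byte k of every element.

using CodecZlib  # HDF5's deflate filter produces zlib-wrapped data

# Undo the shuffle filter: input is byte-plane-major (all first bytes,
# then all second bytes, ...); output is element-major.
function unshuffle(bytes::Vector{UInt8}, elsize::Int)
    n = length(bytes) ÷ elsize
    out = similar(bytes)
    for e in 1:elsize, i in 1:n
        out[(i - 1) * elsize + e] = bytes[(e - 1) * n + i]
    end
    return out
end

decode_chunk(raw, elsize) = unshuffle(transcode(ZlibDecompressor, raw), elsize)
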
JonasIsensee commented 2 years ago

With the latest commit all of these

 "./ncdump/ref_tst_compounds2.nc"
 "./ncdump/ref_nc_test_netcdf4_4_0.nc"
 "./ncdump/ref_tst_compounds3.nc"
 "./ncdump/ref_tst_compounds4.nc"
 "./h5_test/ref_tst_compounds.nc"
 "./nc_test4/ref_tst_interops4.nc"
 "./nc_test4/ref_tst_xplatform2_2.nc"
 "./nc_test4/ref_tst_xplatform2_1.nc"
 "./nc_test4/ref_tst_compounds.nc"
 "./nc_test4/ref_tst_dims.nc"
 "./dap4_test/nctestfiles/test_enum_array.nc"
 "./dap4_test/nctestfiles/test_opaque_array.nc"
 "./dap4_test/nctestfiles/test_one_var.nc"
 "./dap4_test/nctestfiles/test_vlen8.nc"

can be decoded. There are some warnings here and there, but nothing major, as far as I can tell.

Filter concatenation (as described above) is not yet implemented but should be doable in the next few days.

At this point, I believe, it would be a good time for someone to start working on a JLD2-based netCDF implementation. I'll happily answer questions about the JLD2 backend/interface part, but I won't design the API.
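For concreteness, the thinnest read-side wrapper could start like this (entirely hypothetical; NCVariable and ncvariable are made-up names):

using JLD2

# Bundle a dataset with its netCDF attributes, both read via plain JLD2 calls.
struct NCVariable{T,N}
    data::Array{T,N}
    attributes::Dict{Symbol,Any}
end

function ncvariable(f, path)
    data = f[path]
    attrs = Dict{Symbol,Any}(JLD2.load_attributes(f, path))
    return NCVariable(data, attrs)
end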

JonasIsensee commented 2 years ago

@sethaxen : The latest commit now also makes it possible to read your tmp.nc file.

sethaxen commented 2 years ago

Indeed, it seems to load perfectly!

I don't know much about how netCDF 4 relates to HDF5, but I was a bit surprised when comparing the attributes using JLD2 vs. NCDatasets:

julia> using JLD2, NCDatasets

julia> f = jldopen("tmp.nc");

julia> JLD2.load_attributes(f["posterior"], "z")
4-element Vector{Pair{Symbol}}:
 :_Netcdf4Coordinates => Int32[0, 1, 2]
          :_FillValue => [NaN]
      :DIMENSION_LIST => Vector{Any}[[[1, 2, 3, 4]], [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10  …  91, 92, 93, 94, 95, 96, 97, 98, 99, 100]], [["a", "b", "c"]]]
       :_Netcdf4Dimid => 0

julia> ds = NCDataset("tmp.nc");

julia> ds.group["posterior"]["z"]
z (3 × 100 × 4)
  Datatype:    Float64
  Dimensions:  zdim × draw × chain
  Attributes:
   _FillValue           = NaN

NCDatasets shows the names of the array's dimensions (the values of the dimensions are stored as their own variables) in the order in which they occur, while JLD2 shows the values of the dimensions, but in the reverse of that order. I'm not certain how one would then map these values to a specific dimension name.

JonasIsensee commented 2 years ago

Hm, I'm also not sure how best to map backwards (from ID to name), but at least the dimension-name datasets give their respective dimension ID.

Also: HDF5 usually lists array sizes from slowest to fastest dimension (sometimes adding the element size as a last/additional dimension). When loading "normal" data, JLD2 always just reverses that ordering, as is done here when loading f["posterior/z"].
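Concretely, with the simple.nc from earlier in the thread: h5dump reports mydata with DATASPACE ( 3, 4 ) (slowest dimension first), so column-major Julia sees the reversed shape:

julia> f["mydata"]  # 4×3 Matrix; element [j, i] corresponds to numpy's data[i, j]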

sethaxen commented 2 years ago

at least the name datasets give their respective dimension id.

Are you referring to _Netcdf4Dimid? What can that be used for?

I found another netCDF file that errors on loading with JLD2: https://github.com/arviz-devs/arviz_example_data/blob/594a6ad/data/centered_eight.nc?raw=true:

julia> f = jldopen("centered_eight.nc")
JLDFile /home/sethaxen/projects/ArviZ.jl/deps/data/example_data/data/centered_eight.nc (read-only)
 ├─📂 posterior
 │  ├─🔢 chain
 │  ├─🔢 school
 │  ├─🔢 draw
 │  ├─🔢 theta
 │  ├─🔢 mu
 │  └─🔢 tau
 ├─📂 posterior_predictive
 │  └─ ⋯ (4 more entries)
 └─ ⋯ (3 more entries)

julia> f["posterior"]["theta"]
ERROR: UndefVarError: cio not defined
Stacktrace:
 [1] (::JLD2.var"#111#113"{JLD2.MmapIO})(#unused#::Int64)
   @ JLD2 ~/projects/JLD2.jl/src/compression.jl:15
 [2] iterate
   @ ./generator.jl:47 [inlined]
 [3] _collect(c::UnitRange{Int64}, itr::Base.Generator{UnitRange{Int64}, JLD2.var"#111#113"{JLD2.MmapIO}}, #unused#::Base.EltypeUnknown, isz::Base.HasShape{1})
   @ Base ./array.jl:744
 [4] collect_similar
   @ ./array.jl:653 [inlined]
 [5] map
   @ ./abstractarray.jl:2867 [inlined]
 [6] jlread(io::JLD2.MmapIO, #unused#::Type{JLD2.FilterPipeline})
   @ JLD2 ~/projects/JLD2.jl/src/compression.jl:6
 [7] load_dataset(f::JLD2.JLDFile{JLD2.MmapIO}, offset::JLD2.RelOffset)
   @ JLD2 ~/projects/JLD2.jl/src/datasets.jl:97
 [8] getindex(g::JLD2.Group{JLD2.JLDFile{JLD2.MmapIO}}, name::String)
   @ JLD2 ~/projects/JLD2.jl/src/groups.jl:109
 [9] top-level scope
   @ REPL[20]:1
JonasIsensee commented 2 years ago

Are you referring to _Netcdf4Dimid? What can that be used for?

Exactly! As far as I can tell, datasets with named dimensions have an attribute _Netcdf4Coordinates, e.g. :_Netcdf4Coordinates => Int32[1, 0], where the numbers correspond to the _Netcdf4Dimid values in the other datasets describing the named dimensions.

AFAICT, there is no way to know a priori which HDF5 datasets hold netCDF data and which describe named dimensions and such. By going through all of them and reading their metadata, it should be possible to match them up.
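Something like the following could do that matching (a hypothetical sketch; dimids is a made-up helper, and it assumes dimension-scale datasets carry the CLASS = "DIMENSION_SCALE" attribute visible in the h5dump above):

using JLD2

# Map each _Netcdf4Dimid to the name of the dataset defining that dimension.
function dimids(f, names)
    idmap = Dict{Int32,String}()
    for name in names
        attrs = Dict(JLD2.load_attributes(f, name))
        if get(attrs, :CLASS, "") == "DIMENSION_SCALE"
            idmap[attrs[:_Netcdf4Dimid]] = name
        end
    end
    return idmap
end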

I found another netCDF file that errors on loading with JLD2: https://github.com/arviz-devs/arviz_example_data/blob/594a6ad/data/centered_eight.nc?raw=true:

It's fixed now.

JonasIsensee commented 1 year ago

I just merged and tagged a release for the readhdf5 branch. The things discussed above should be available starting with version v0.4.24.