Unidata / netcdf-c

Official GitHub repository for netCDF-C libraries and utilities.
BSD 3-Clause "New" or "Revised" License
511 stars 263 forks source link

Cleanup of struct NC_Dispatch #1056

Open DennisHeimbigner opened 6 years ago

DennisHeimbigner commented 6 years ago

Proposed changes/cleanup of the NC_Dispatch structure

edhartnett commented 6 years ago

These are all good, except the last one.

There are lots of cases for user defined formats, where the user defines only the get/put_vara functions, and leaves the vars functions to the NCDEFAULT version, which just calls the vara function over and over again to make up the vars call. In fact, this is how the HDF4 layer works, and also the AB Dispatch layer (part of a separate library, the first of the external user-defined formats).

So the vara functions are very useful in the dispatch layer, IMO. They will be easier to write for user-defined formats than the vars functions, for most formats.

DennisHeimbigner commented 6 years ago

Good point about var/vars. We might also want to consider some more structured way to pass a void* parameter; perhaps something like the HDF5 properties mechanism.

edhartnett commented 6 years ago

There is a fundamental difference between netcdf properties and HDF5 property lists, which is that HDF5 property lists are exposed to the cold, brutal hands of the user, whereas the netcdf properties will only be touched by dispatch-layer netcdf programmers, never the end user.

Right now we have void *parameters in open and create. According to your proposed changes, we would also have them in a few other places. These amount to defacto properties lists for these functions.

How about defined structs?

So for example a struct NC_OPEN_PROPERTIES, which contained all the fields that would otherwise end up in a void *parameter in the dispatch nc_open().

New parameters can just be added to the struct without breaking any existing code.

edhartnett commented 6 years ago

Seems like nc_inq_format_extended() could be moved out of the dispatch table and implemented in libdispatch in a way that would work for all dispatch layers.

edhartnett commented 6 years ago

I suggest we make all dispatch functions that take ncid as a parameter instead get a pointer to the NC object for the file.

Currently we look up the file 3 times for a call to nc_put_vara_int().

1 - We look it up (and don't use it) in nc_put_vara_int() (and this lookup is removed in a pending PR I have up.) 2 - We look it up in NC_put_vara() (dvarput.c) so we can find the correct dispatch layer. 3 - We look it up in NC4_put_vars() (hdf5var.c) when the dispatch layer is called.

The reason is that we always pass ncid, but we already have to look up NC from ncid in order to identify the dispatch layer. So we should just pass *NC (which contains the ncid), instead of passing the ncid and looking up NC again.

As we have seen, some users open thousands of files at a time. So looking up the file twice for each operation is something we should avoid.

Unfortunately, this is a big change that will touch many, many code files at once.

DennisHeimbigner commented 6 years ago

I remember that I did this for a reason, but I no longer remember the reason. It may have had something to do with recursive calls into the API. Libdap2 code uses the netcdf-3 API and does not have the dispatch table.

Let me see if I can recall the rationale.

edhartnett commented 6 years ago

I ran into the recursion problem when working on PIO and netCDF. The answer is to call the nc3_* version of the functions, and not invoke the dispatch layer at all.

With the changes I am proposing, the libdap2 code should look up the NC and call the function in libsrc directly (passing pointer to NC and not ncid). The libdap2 code is only every going to want the classic version of the function, so there is no need to use the dispatch layer.

No doubt that is some effort but I don't think that looking up every file twice, for every operation, is a good solution. I would be happy to make the changes, under your supervision, if that would be helpful.