bug: assuming all state variables have unlimited dimension

hkershaw-brown commented 6 months ago

Making a separate issue for this since Robin has hit this while working on MOM6-MARBLE. Comment originally from https://github.com/NCAR/DART/issues/519

Assumptions in direct_netcdf_mod.f90:

If there is an unlimited dimension, every variable has the unlimited dimension.
The unlimited dimension is time, and we are interested in the latest time slice in a file.

In the cesm 2.3 clm, there is an unlimited dimension cohorts which is not used by any of the state variables (and, kind of strangely, not used at all)

So this line:

https://github.com/NCAR/DART/blob/1b76f3afa5978469d6a119710bb638c1c077ae20/assimilation_code/modules/io/direct_netcdf_mod.f90#L865 num_dims - 1 = 0

Gustavo's compiler is picking up this 1:0 and erroring out.

intel 19.1.1.21 (running on Cheyenne) just merrily continues, but the counts are incorrect. The count is set to 1 rather than the length of the dimension.

For fill_inflation_restart it does not matter if the state is read in incorrectly, since we just fill the output with (mean and sd (e.g. 1.0, 0.6). Side note here is we don't need to be calling read_state in fill_inflation_restart.

For filter, which does need to read the state, I think the read does not happen correctly (count is not the size of the variable) .

On write, only 'time' (lower case) can be the unlimited dimension. (This is another assumption in dart. It is a weakness in how we deal with unlimited dimensions (https://github.com/NCAR/DART/issues/359#issuecomment-1187818595)) For dart created files, dart won't add an unlimited dimension unless its name is 'time'. So creating and writing a file with fill inflation restart gets you a file with the correct info.

All the assumptions above are a bit of a hack.

I put a fix on https://github.com/NCAR/DART/tree/fix-unlimited_dim-read which checks the variable for unlimited dimension before adjusting the counts reading and writing. It is a narrow solution (it really just fixes the case where there is an unused unlimited dimension).

I'm not sure if the cohorts dimension will be used in the state, if it is then improving the state read/write to cope with various (and multiple) unlimited dimensions gets bumped up the priority list.

Let me know if this makes sense or not.

Originally posted by @hkershaw-brown in https://github.com/NCAR/DART/issues/519#issuecomment-1644469352

nancycollins commented 6 months ago

we could add a couple of namelist settings to be used in the state i/o code.

one could list dimension names to be ignored - e.g. if a variable is 4d in netcdf but really only 3d are used, the 4th dim should be ignored.

the other could list dimension names which are unlimited, so when creating a netcdf file and reading/writing it does the right thing.

the default for both could be "time" to be backwards compatible, but overwriteable and extensible for issues like this.

we could add another flag to say what to do with unlimited dims. the default is read the last entry, but we could have an option to stack the unlimited dim as an additional dimension in the state array, or indicate which index in the unlimited dim should be used.

(i'm sure i made a comment about this before, but i can't find it to reference. maybe it was pre-github.)

nancycollins commented 6 months ago

For fill_inflation_restart it does not matter if the state is read in incorrectly, since we just fill the output with (mean and sd (e.g. 1.0, 0.6). Side note here is we don't need to be calling read_state in fill_inflation_restart.

i think you're right. it does need the model time - could it just call that instead? i see the ensemble_handle is already allocated - will all the member numbers be set up correctly by other routines?

hkershaw-brown commented 6 months ago

Robin's MARBLE MOM6

netcdf marbl_params {
dimensions:
        Layer = 65 ;
        Time = 1 ;
variables:
        int64 Time(Time) ;
        double autotroph_settings\(1\)%kDOP(Layer) ;
        double autotroph_settings\(1\)%kNH4(Layer) ;
}

NCAR / DART

bug: assuming all state variables have unlimited dimension #614