MITgcm / xmitgcm

Read MITgcm mds binary files into xarray
http://xmitgcm.readthedocs.io
MIT License

open_mdsdataset is unable to open files generated by the MITgcm KPP package. #49

Closed dhruvbalwada closed 7 years ago

dhruvbalwada commented 7 years ago

I am posting this issue because I am unable to open files generated by the KPP package in the MITgcm model. Here is an example of the code that will reproduce this error.

The data files are stored in /swot/SUM05/dbalwada/channel_model_output/varying_res/20km/run_kpp_out_test on the Sverdrup server. If you don't have access to it, please let me know and I can email the files to you.

The command that I used to open the files is `ds = open_mdsdataset(data_dir, delta_t=1200, iters=[2594160], prefix=['U'], ignore_unknown_vars=False, geometry='cartesian')`. With this command, the U velocity files open without problems.

However, when I change the prefix to `KPPhbl`, as in `ds = open_mdsdataset(data_dir, delta_t=1200, iters=[2594160], prefix=['KPPhbl'], ignore_unknown_vars=False, geometry='cartesian')`,

I get the error:

    KeyError Traceback (most recent call last)
    in ()
          1 ds = open_mdsdataset(data_dir, delta_t=1200, iters = [2594160], prefix=['KPPhbl'],
    ----> 2 ignore_unknown_vars=False, geometry='cartesian')

    /home/dbalwada/.conda/envs/dhruvenv/lib/python2.7/site-packages/xmitgcm-0.2.0-py2.7.egg/xmitgcm/mds_store.pyc in open_mdsdataset(data_dir, grid_dir, iters, prefix, read_grid, delta_t, ref_date, calendar, geometry, grid_vars_to_coords, swap_dims, endian, chunks, ignore_unknown_vars, default_dtype, nx, ny, nz, llc_method)
        188 ignore_unknown_vars=ignore_unknown_vars,
        189 default_dtype=default_dtype,
    --> 190 nx=nx, ny=ny, nz=nz, llc_method=llc_method)
        191 ds = xr.Dataset.load_store(store)
        192

    /home/dbalwada/.conda/envs/dhruvenv/lib/python2.7/site-packages/xmitgcm-0.2.0-py2.7.egg/xmitgcm/mds_store.pyc in __init__(self, data_dir, grid_dir, iternum, delta_t, read_grid, file_prefixes, ref_date, calendar, geometry, endian, ignore_unknown_vars, default_dtype, nx, ny, nz, llc_method)
        435 for p in prefixes:
        436 # use a generator to loop through the variables in each file
    --> 437 for (vname, dims, data, attrs) in self.load_from_prefix(p, iternum):
        438 # print(vname, dims, data.shape)
        439 #Sizes of grid variables can vary between mitgcm versions. Check for

    /home/dbalwada/.conda/envs/dhruvenv/lib/python2.7/site-packages/xmitgcm-0.2.0-py2.7.egg/xmitgcm/mds_store.pyc in load_from_prefix(self, prefix, iternum)
        530 else:
        531 raise KeyError("Couln't find metadata for variable %s "
    --> 532 "and `ignore_unknown_vars`==False." % vname)
        533
        534 # maybe slice and squeeze the data

    KeyError: "Couln't find metadata for variable KPPhbl and `ignore_unknown_vars`==False."

Any help would be appreciated. Thank you. Dhruv

rabernat commented 7 years ago

The metadata is normally found in the ~data.diagnostics~ available_diagnostics.log file. Can you double-check that KPPhbl appears in ~data.diagnostics~ available_diagnostics.log?

(edit: the metadata is not the same as the .meta file...that alone doesn't contain enough info to build a variable.)

dhruvbalwada commented 7 years ago

No, KPP-related files are not mentioned in the data.diagnostics file.

The data.diagnostics file looks like this:

diagnostics for diffusivity tensor

    &diagnostics_list
    diag_mnc=.FALSE.,
    frequency(1) = 31104000.,
    filename(1) = 'surface_forcing',
    fields(1,1) = 'oceTAUX ', 'TFLUX ',
    &
    &DIAG_STATIS_PARMS
    &
    ~

It doesn't even mention the U, V, W, etc. files.

rabernat commented 7 years ago

Sorry, I wrote the wrong file name. I meant available_diagnostics.log.

rabernat commented 7 years ago

(possibly relevant documentation: http://xmitgcm.readthedocs.io/en/latest/usage.html#expected-files)

dhruvbalwada commented 7 years ago

available_diagnostics.log is not even generated. The KPP output is obtained by setting a flag in data.kpp, not in data.diagnostics. Because it is not a diagnostic variable, no available_diagnostics.log file is written.

rabernat commented 7 years ago

Ok, I think I understand. The output you are getting from KPP bypasses the diagnostics package completely, similar to the standard dump and tave output.

There are two solutions:

As you can see if you follow that link, the metadata from the standard "state files" had to be hard coded.
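To make the idea of hard-coded metadata concrete, here is a minimal sketch of what an entry for one KPP field might look like. It is only loosely modeled on a dims-plus-attributes layout. The table name `kpp_variables`, the helper `lookup_metadata`, and the dims, units, and description given for KPPhbl are all assumptions for illustration, not values taken from xmitgcm or confirmed against model output.

```python
from collections import OrderedDict

# Hypothetical metadata for one KPP output field: dimension names plus
# attributes. The dims, units, and description are assumptions and should be
# checked against an available_diagnostics.log before being relied on.
kpp_variables = OrderedDict(
    KPPhbl=dict(
        dims=["j", "i"],  # assumed 2D field at cell centers
        attrs=dict(
            standard_name="KPPhbl",
            long_name="KPP boundary layer depth",  # assumed description
            units="m",  # assumed units
        ),
    ),
)


def lookup_metadata(vname, *tables):
    """Return the first metadata entry found for `vname`, else raise KeyError."""
    for table in tables:
        if vname in table:
            return table[vname]
    raise KeyError("Couldn't find metadata for variable %s" % vname)
```

With a table like this, `lookup_metadata("KPPhbl", kpp_variables)` returns the entry, while an unknown name still raises `KeyError`, which keeps missing metadata visible instead of silently guessing.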

rabernat commented 7 years ago

This is basically the same issue as #5

dhruvbalwada commented 7 years ago

Yeah, okay, I see the problem now. I will try to add the metadata corresponding to the KPP-related files. Thank you. Dhruv

rabernat commented 7 years ago

I'm not sure exactly how you will do that without at least looking at an available_diagnostics.log file. It's kind of impossible to know what the variables mean just from their names. I would try running for one timestep with diagnostics enabled to get an available_diagnostics.log file with the KPP entries in it.

If you go this route, please submit your changes as a PR.
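
Once such a log exists, the KPP rows could be pulled out with something like the sketch below. The function name `parse_available_diagnostics` is hypothetical, and the column order (number, name, levels, mate, code, units, description) is an assumption about the log layout that should be verified against the actual file.

```python
def parse_available_diagnostics(text, prefix="KPP"):
    """Collect levels, code, units, and description for matching rows.

    Hypothetical helper. Assumes pipe-delimited rows in the order
    num | name | levs | mate | code | units | description.
    """
    entries = {}
    for line in text.splitlines():
        # Split into at most 7 fields so a '|' in the description survives.
        fields = [f.strip() for f in line.split("|", 6)]
        # Skip headers, separators, and anything without a numeric first column.
        if len(fields) < 7 or not fields[0].isdigit():
            continue
        name = fields[1]
        if name.startswith(prefix):
            entries[name] = dict(
                levels=fields[2],
                code=fields[4],
                units=fields[5],
                description=fields[6],
            )
    return entries
```

Reading the file with `open(path).read()` and passing the text in gives a dictionary keyed by diagnostic name, which can then be compared by hand against any metadata proposed for the PR.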