noaa-oar-arl / monetio

The Model and ObservatioN Evaluation Tool I/O package
https://monetio.readthedocs.io
MIT License

hysplit.open_dataset failing when cdump outputs have same sampling start time #149

Open amcz opened 7 months ago

amcz commented 7 months ago

This is a bug that causes loading of cdump information into the xarray dataset to fail in a rare use case. It affects the open_dataset and combine_dataset functions in the hysplit module.

The cdump file is structured so that each record indicates its sampling start and sampling end time. The open_dataset function uses one of these as the value of the time coordinate in the xarray DataArray in which it stores the cdump information. By default it uses the sampling start time.

Usually this works well. However, there are cases in which it fails.

As an example: A file was created in which the model was initialized with a pardump file and the sampling start time was set before the simulation start time.

The first model output had sampling start time = sampling end time = simulation start time = 12:00. The second model output had sampling start time = 12:00 and sampling end time = 13:00.

This resulted in two records with the same time coordinate and z values. Thus the merge of the datasets from these records failed.
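The collision can be illustrated without monetio or xarray. The sketch below is only an assumption-laden model of the situation: the tuple layout, the date, the level value of 100, and the helper name coordinate_keys are all made up for illustration and are not part of the hysplit module.

```python
from collections import Counter
from datetime import datetime

# Hypothetical stand-ins for the two cdump records described above:
# (sampling start, sampling end, level). The date and level are arbitrary.
records = [
    (datetime(2020, 1, 1, 12), datetime(2020, 1, 1, 12), 100),
    (datetime(2020, 1, 1, 12), datetime(2020, 1, 1, 13), 100),
]


def coordinate_keys(records, use_start=True):
    """Return the (time, level) key each record would get as a coordinate."""
    return [(start if use_start else end, level) for start, end, level in records]


# Count how many records share each key under the two choices of time coordinate.
start_counts = Counter(coordinate_keys(records, use_start=True))
end_counts = Counter(coordinate_keys(records, use_start=False))

duplicated_by_start = [key for key, n in start_counts.items() if n > 1]
duplicated_by_end = [key for key, n in end_counts.items() if n > 1]
```

With the sampling start time as the coordinate, both records map to the same (time, level) key, so duplicated_by_start has one entry. With the sampling end time they stay distinct, and duplicated_by_end is empty.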

This is a rare use case and would usually result from a model setup that is not optimal. For instance, the first output in the cdump file, with sampling from 12:00 to 12:00, is probably unwanted and unintended. The CONTROL file should be changed to set the sampling start to the simulation start, which would prevent this record from being written to the cdump file.

However, the open_dataset function should be able to handle or at least produce a cogent warning for such an occurrence.
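One possible shape for such a warning is sketched below. It is not monetio code: find_duplicate_sampling_starts is a hypothetical helper, and the (start, end, level) tuple layout is an assumption about what would be available while the records are being read.

```python
import warnings
from collections import defaultdict


def find_duplicate_sampling_starts(records):
    """Warn when several records share a sampling start time and level.

    `records` is an iterable of (start, end, level) tuples (assumed layout).
    Returns a dict mapping each duplicated (start, level) key to the list of
    record indices that share it; the dict is empty when there is no clash.
    """
    seen = defaultdict(list)
    for index, (start, _end, level) in enumerate(records):
        seen[(start, level)].append(index)

    duplicates = {key: idx for key, idx in seen.items() if len(idx) > 1}
    if duplicates:
        warnings.warn(
            "cdump records share the same sampling start time and level: "
            f"{sorted(duplicates.values())}. Consider using the sampling end "
            "time as the time coordinate, or check the sampling start in the "
            "CONTROL file.",
            UserWarning,
        )
    return duplicates
```

Called on the two records from the example, [("12:00", "12:00", 100), ("12:00", "13:00", 100)], this would emit one UserWarning and return {("12:00", 100): [0, 1]}. A check like this could run before the merge so the user sees the cause rather than a merge error.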