NOAA-CSL / MELODIES-MONET

MELODIES MONET - diagnostic tool for evaluating models against a variety of observations including surface, aircraft, and satellite data all within a common framework
https://melodies-monet.readthedocs.io
Apache License 2.0
18 stars 20 forks source link

The current code is creating lat and lon twice for cesm_fv #261

Open blychs opened 2 weeks ago

blychs commented 2 weeks ago

EDIT: Just in case, I wanted to make clear that all of this was tested in the develop branch

All of this issue stems from PR #259 and the discussion that followed. Here is the description of the problem and some possible solutions.

I am failing to run the docs/example/camchem.ipynb notebook. Here is an (abbreviated) traceback;

~/melodies-monet/MELODIES-MONET/melodies_monet/driver.py in ?(self, time_interval)
   1025                             model_obj = model_obj.isel(z=0).expand_dims('z',axis=1)
   1026                     except KeyError as e:
   1027                         raise Exception("MONET requires an altitude dimension named 'z'") from e
   1028                     # now combine obs with
-> 1029                     paired_data = model_obj.monet.combine_point(obs.obj, radius_of_influence=mod.radius_of_influence, suffix=mod.label)
   1030                     if self.debug:
   1031                         print('After pairing: ', paired_data)
   1032                     # this outputs as a pandas dataframe.  Convert this to xarray obj

...

ValueError: the new name 'longitude' conflicts

So, @rschwant or @zmoon , here is the problem and a few possible solutions. Let me know which one I should go with, and I will do it ASAP.

PROBLEM

The problem is that when the monetio _cesm_fv_mm.py opens the file, it creates the coordinates latitude and longitude from variables lat and lon of camchem, but it does not delete the original variables (lat and lon). Later, when MM performs the pairing, it requires the monet function combine_point from monet_accessor.py

This in turn calls _dataset_to_monet (also in monet_accessor.py), which calls _rename_to_monet_latlon().

There is where the problem lies. In line 163 of monet_accessor.py, it performs a check which is somewhat too simple.

148 def _rename_to_monet_latlon(ds):
 149     """Rename latitude/longitude variants to ``lat``/``lon``,
 150     returning a new xarray object.
 151
 152     Parameters
 153     ----------
 154     ds : xarray.DataArray or xarray.Dataset
 155     """
 156
 157     # To consider unstructured grid
 158     if ds.attrs.get("mio_has_unstructured_grid", False):
 159         check_list = ds.data_vars
 160     else:
 161         check_list = ds.coords
 162
 163     if "lat" in check_list:
 164         display(ds) # debug
 165         return ds.rename({"lat": "latitude", "lon": "longitude"})
 166     elif "Latitude" in check_list:
 167         return ds.rename({"Latitude": "latitude", "Longitude": "longitude"})
 168     elif "Lat" in check_list:
 169         return ds.rename({"Lat": "latitude", "Lon": "longitude"})
 170     elif "XLAT_M" in check_list:
 171         return ds.rename({"XLAT_M": "latitude", "XLONG_M": "longitude"})
 172     elif "XLAT" in check_list:
 173         return ds.rename({"XLAT": "latitude", "XLONG": "longitude"})
 174     else:
 175         return ds

As you can see, this assumes that if the variable lat is there, the variables longitude and latitude are not.

Possible solutions

First, we need to decide whether we want to deal with this at the MELODIES-MONET level, at the monetio level or at the monet level.

If we want to do it here (MM), I can simply check to see if the variables lat and latitude, lon and longitude, are there at the same time. If they are, we could print a warning and drop lat and lon.

If we want to do it at the monetio level, I could just get rid of lat and lon after creating latitude and longitude in _cesm_fv_mm.py, which would result in similar behaviour the the one present in the other models.

If we want to do it at the monet level, I could edit _rename_to_monet_latlon to check first if latitude and longitude exist, rather than directly attempting to rename it.

Of these solutions, I tried the one in MM, which works with the expected results

It might be a good idea to do the three of them.

Let me know what you think.

blychs commented 2 weeks ago

@dwfncar @rrbuchholz , I have already a version of the _cesm_fv_mm.py code that works in my fork of monetio. I'll wait until you give it a look before submitting a PR.

zmoon commented 1 week ago

@blychs I think dropping (or renaming) lat/lon is the correct approach.

I can review when you submit PR to monetio, but probably also

    dset.coords["longitude"] = (("y", "x"), lons)
    dset.coords["latitude"] = (("y", "x"), lats)

should be just

    dset["longitude"] = (("y", "x"), lons)
    dset["latitude"] = (("y", "x"), lats)

(or use dset = dset.assign(longitude=..., latitude=...)).