spencerahill / aospy

Python package for automated analysis and management of gridded climate data
Apache License 2.0

Data with coordinate variables (lat, lon, etc.) with names not included in internal_names #293

Closed jdossgollin closed 5 years ago

jdossgollin commented 5 years ago

This may be two separate questions, but longitudes are often treated differently across models. Specifically: 1) Is there a way to define alt_names for a coordinate so that we can deal with something which is X, lon, or longitude? 2) Is there a way to join longitudes that go from 0 to 360 with longitudes that go from -180 to 180 so that the outputs are standardized and everything works well with regions? This seems like something which would go in the model creation.

spencerkclark commented 5 years ago
  1. Is there a way to define alt_names for a coordinate so that we can deal with something which is X, lon, or longitude?

Indeed coordinates with similar but different names are so common that we have built a number of commonly-used internal names into aospy (in internal_names.py):

# All attributes associated with data's spatiotemporal grid.
GRID_ATTRS = OrderedDict(
    [(LAT_STR, ('lat', 'latitude', 'LATITUDE', 'y', 'yto', 'XLAT')),
     (LAT_BOUNDS_STR, ('latb', 'lat_bnds', 'lat_bounds')),
     (LON_STR, ('lon', 'longitude', 'LONGITUDE', 'x', 'xto', 'XLONG')),
     (LON_BOUNDS_STR, ('lonb', 'lon_bnds', 'lon_bounds')),
     (ZSURF_STR, ('zsurf', 'HGT')),
     (SFC_AREA_STR, ('area', 'sfc_area')),
     (LAND_MASK_STR, ('land_mask', 'LANDFRAC', 'XLAND')),
     (PK_STR, ('pk',)),
     (BK_STR, ('bk',)),
     (PHALF_STR, ('phalf',)),
     (PFULL_STR, ('pfull',)),
     (PLEVEL_STR, ('level', 'lev', 'plev')),
     (TIME_STR, ('time', 'XTIME')),
     (TIME_WEIGHTS_STR, ('time_weights', 'average_DT',)),
     (TIME_BOUNDS_STR, ('time_bounds', 'time_bnds')),
     (BOUNDS_STR, ('bounds', 'bnds', 'nv', 'nbnd')),
     (RAW_START_DATE_STR, ('raw_data_start_date',)),
     (RAW_END_DATE_STR, ('raw_data_end_date',))]
)

Unlike in the case of Var objects (which you can define however you would like, e.g. with your own custom alt_names), these are currently hard-coded into aospy (i.e. currently there is no easy way for the user to customize them without changing the source code directly). We would eventually like to change this -- see discussion in https://github.com/spencerahill/aospy/issues/182.

In the meantime, of the examples you listed, it looks like we are currently missing only 'X'. We would be happy to accept a pull request adding that as well.
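To illustrate why 'X' is not picked up even though 'x' is, here is a minimal sketch of a case-sensitive alternate-name lookup. It is not aospy's actual implementation: the dictionary keys ('lat', 'lon', 'time') are assumed stand-ins for the LAT_STR, LON_STR, and TIME_STR constants, and `find_internal_name` is a hypothetical helper written only for this example.

```python
# Subset of the alternate names listed above. The keys are assumed
# stand-ins for aospy's LAT_STR, LON_STR, and TIME_STR constants.
GRID_ALT_NAMES = {
    "lat": ("lat", "latitude", "LATITUDE", "y", "yto", "XLAT"),
    "lon": ("lon", "longitude", "LONGITUDE", "x", "xto", "XLONG"),
    "time": ("time", "XTIME"),
}


def find_internal_name(name, alt_names=GRID_ALT_NAMES):
    """Return the internal name whose alternates include `name`, else None.

    Hypothetical helper: the match is exact and case-sensitive.
    """
    for internal, candidates in alt_names.items():
        if name in candidates:
            return internal
    return None


print(find_internal_name("longitude"))
print(find_internal_name("X"))
```

Because the match is exact, 'longitude' resolves to the longitude entry while 'X' resolves to nothing until it is added to the tuple.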

Is there a way to join longitudes that go from 0 to 360 with longitudes that go from -180 to 180 so that the outputs are standardized and everything works well with regions?

I think #266 addresses this (though it's not in a released version of aospy yet). I'll let @spencerahill comment more since he is more familiar with the change.

jdossgollin commented 5 years ago

The first part of this is addressed by #296, though only (X, Y, T) are implemented and not any vertical coordinates (such as pressure coordinates), which would also be helpful.

The second part of this remains open.

spencerahill commented 5 years ago

Is there a way to join longitudes that go from 0 to 360 with longitudes that go from -180 to 180 so that the outputs are standardized and everything works well with regions? This seems like something which would go in the model creation

If I'm understanding your use case correctly, the new Longitude class and logic @spencerkclark referred to should take care of this. Since the v0.2.1 release that includes this is still 2+ weeks out (cf. comment in #294, sorry for the delay), perhaps in the meantime you could try installing from develop and see whether it does indeed take care of your needs. Does that work?
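For reference, the arithmetic behind reconciling the two longitude conventions can be sketched as follows. This is only the underlying modular arithmetic under the assumption that longitudes are plain numbers in degrees; it is not the Longitude class from #266, and the function names `to_pm180` and `to_0_360` are made up for this example.

```python
def to_pm180(lon):
    """Map a longitude in degrees onto the half-open interval [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0


def to_0_360(lon):
    """Map a longitude in degrees onto the half-open interval [0, 360)."""
    return lon % 360.0


# The same meridian expressed in either convention lands on one value.
model_a = [0.0, 90.0, 190.0, 350.0]   # data on a 0 to 360 grid
model_b = [-170.0, -10.0, 0.0, 90.0]  # data on a -180 to 180 grid

print(sorted(to_pm180(lon) for lon in model_a))
print(sorted(to_0_360(lon) for lon in model_b))
```

After converting, the values usually need to be re-sorted (as above) so that the longitude axis stays monotonic.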

spencerahill commented 5 years ago

not any vertical coordinates (such as pressure coordinates), which would also be helpful

Is your need simply to add an alternate name for the pressure variable? Is it for pressure-interpolated data, sigma coordinates, or some other coordinate?

jdossgollin commented 5 years ago

That should work -- everything I'm trying to work with at the moment has been interpolated onto pressure coordinates. And in fact, if time is marked as T and we want to avoid confusion there, maybe the best solution is just to do a rename in the preprocess function?

spencerahill commented 5 years ago

do a rename in the preprocess function

Great idea. Let us know if that doesn't work.

What is the name of the pressure coordinate in your data? That could be handled with a rename in your preprocess function as well, but in the future we could also add it to internal_names.

spencerahill commented 5 years ago

Sorry, I see from #296 you answered this question already: P for pressure level. I think we could go ahead and include that in #296.
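As a rough sketch of the rename-in-preprocess idea discussed above: the function below renames T and P to names that already appear in GRID_ATTRS ('time' and 'level'). It assumes the preprocess function receives an xarray Dataset plus keyword arguments and returns a Dataset, and it relies only on the Dataset's `variables`, `dims`, and `rename` members. The names `RENAMES`, `build_rename_map`, and `preprocess` are hypothetical, chosen for this example.

```python
# Hypothetical mapping from names in the raw files to names that are
# already listed in GRID_ATTRS ('time' and 'level').
RENAMES = {"T": "time", "P": "level"}


def build_rename_map(names_present, renames=RENAMES):
    """Keep only renames whose source exists and whose target is free."""
    present = set(names_present)
    return {
        old: new
        for old, new in renames.items()
        if old in present and new not in present
    }


def preprocess(ds, **kwargs):
    """Rename nonstandard coordinates on an xarray.Dataset-like object.

    Assumed signature: takes the Dataset plus keyword arguments and
    returns the (possibly renamed) Dataset.
    """
    names = set(ds.variables) | set(ds.dims)
    mapping = build_rename_map(names)
    # Skip the call entirely when there is nothing to rename.
    return ds.rename(mapping) if mapping else ds
```

Filtering the mapping first matters because renaming a name that is absent from the Dataset raises an error, and the same function then works for files that already use the standard names.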

jdossgollin commented 5 years ago

Hey @spencerkclark, do you think you might be able to help me figure out what's wrong with the CI for a PR I wrote on this issue? https://github.com/jdossgollin/aospy/branches then see build fail at more_internal_names_293. I think the problem is that I haven't correctly implemented your earlier suggestion to add a @pytest.mark.xfail to test_load_variable_does_not_warn in test/test_data_loader.py since everything else seems to be working OK. Thanks very much!

spencerkclark commented 5 years ago

Sure thing @jdossgollin -- could you do me a favor and go ahead and submit a PR from your branch? I think it will be easier for me to access the build log/review the code that way.