Closed jdossgollin closed 5 years ago
Glad to hear the contributing guide has been helpful!
Thanks for all the info -- if I understand correctly, I think this is a case for another addition to #296. An addition of `nbnds` here should hopefully do the trick.
That looks like it should work -- will try it locally and assess.
More conceptually, I think that, as pointed out in #182, we're unlikely to include all variables here, so a code change that (a) allowed the user to input that mapping, and/or (b) provided more sophisticated default grid mappings -- defining a "WRF" standard, a "GFDL" standard, etc., and then letting the user choose one of these defaults (or making a smart guess?) -- could be useful.
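To make idea (a) concrete, a user-supplied mapping could be as simple as a dict that gets filtered against what is actually present in the dataset and passed to `rename`. This is only a rough sketch; the mapping values and the `apply_name_map` helper are illustrative, not aospy's actual API:

```python
import numpy as np
import xarray as xr

# Hypothetical user-supplied mapping from dataset-specific names to
# internal names (values here are illustrative placeholders).
NAME_MAP = {"nbnds": "bounds", "time_bnds": "time_bounds"}


def apply_name_map(ds, name_map):
    """Rename only the dims/variables from name_map that exist in ds."""
    present = {k: v for k, v in name_map.items()
               if k in ds.dims or k in ds.variables}
    return ds.rename(present)


# A dataset with an NCEP-style "nbnds" dim and "time_bnds" variable.
ds = xr.Dataset(
    {"time_bnds": (("time", "nbnds"), np.zeros((2, 2)))},
    coords={"time": np.arange(2)},
)
out = apply_name_map(ds, NAME_MAP)
```

Filtering to names actually present keeps the same mapping usable across datasets that carry only a subset of the listed variables.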
In case anyone else is working with this data set, the following pre-processing function (for use with the data loader) is what I use:

```python
from aospy.internal_names import BOUNDS_STR  # aospy's internal bounds name


def preprocess_reanalysis(data_set, **kwargs):
    """Fix time units."""
    if kwargs["intvl_in"] == "daily":
        data_set["time"].attrs["units"] = "hours since 1800-01-01 00:00:0.0"
    data_set = data_set.assign_coords(nbnds=data_set["nbnds"]).rename(
        {"nbnds": BOUNDS_STR}
    )
    return data_set
```
Although I will update `nbnds` in #296, the `assign_coords` is still necessary here.
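As a self-contained check of the approach above, here is roughly how the rename behaves on a synthetic dataset shaped like the NCEP daily files. `BOUNDS_STR` is stubbed locally as an assumption; in aospy it comes from the package's internal names:

```python
import numpy as np
import xarray as xr

BOUNDS_STR = "bounds"  # stand-in for aospy's internal bounds-dim name


def preprocess_reanalysis(data_set, **kwargs):
    """Fix time units and rename the bounds dimension."""
    if kwargs.get("intvl_in") == "daily":
        data_set["time"].attrs["units"] = "hours since 1800-01-01 00:00:0.0"
    return data_set.assign_coords(nbnds=data_set["nbnds"]).rename(
        {"nbnds": BOUNDS_STR}
    )


# Synthetic stand-in for the NCEP-NCAR daily file's layout.
ds = xr.Dataset(
    {"time_bnds": (("time", "nbnds"), np.zeros((3, 2)))},
    coords={"time": np.arange(3), "nbnds": [0, 1]},
)
out = preprocess_reanalysis(ds, intvl_in="daily")
```

After the call, the `nbnds` dim and coordinate are renamed to `BOUNDS_STR`, and the `time` units attribute is overwritten for daily data.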
Perfect, thanks @jdossgollin. Just curious -- what does the traceback look like if you leave off `assign_coords`? We might consider that a bug.
> (a) allowed the user to input that mapping
This is what is implemented in #297 :)
> Just curious -- what does the traceback look like if you leave off `assign_coords`? We might consider that a bug.
I get the same `ValueError` as before (shown above) -- possibly because the `.diff(BOUNDS_STR)` only works if `BOUNDS_STR` is a coordinate of the data?
I see, so in the `ValueError` the problem seems to be in this line. Indeed I think this is a bug. My gut tells me that this has to do with the fact that `BOUNDS_STR` is a dimension without coordinates in your dataset (which should be totally fine); therefore once `squeeze` is called in the preceding line, all evidence that `BOUNDS_STR` was once in there is gone (so we do not need to drop it). Does wrapping the `drop` in an if statement fix things?
```python
if BOUNDS_STR in time_weights.variables:
    time_weights = time_weights.drop(BOUNDS_STR)
ds[TIME_WEIGHTS_STR] = time_weights
```
That yields `AttributeError: 'DataArray' object has no attribute 'variables'`. Perhaps you meant `dims`?

EDIT: making that change seems to make it work:
```python
if BOUNDS_STR in time_weights.dims:
    time_weights = time_weights.drop(BOUNDS_STR)
ds[TIME_WEIGHTS_STR] = time_weights
```
Oops sorry, I think I meant `time_weights.coords`. `squeeze` should take `BOUNDS_STR` out of `time_weights.dims`; if it had associated coordinates then it would hang around as a scalar coordinate (so we would need to drop it).
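The `dims`-vs-`coords` distinction can be seen in a few lines. A sketch with a generic `"bounds"` name, independent of aospy (`drop_vars` is the modern spelling of the `drop` call used in this thread):

```python
import numpy as np
import xarray as xr

# Weights with a size-1 "bounds" dimension that carries a coordinate.
da = xr.DataArray(
    np.ones((3, 1)),
    dims=("time", "bounds"),
    coords={"time": np.arange(3), "bounds": [0]},
)

squeezed = da.squeeze()
# squeeze removes "bounds" from .dims, but because the dimension had a
# coordinate, it survives as a scalar coordinate in .coords.
if "bounds" in squeezed.coords:
    squeezed = squeezed.drop_vars("bounds")
```

If the dimension had no coordinate (as in the NCEP dataset here), `squeeze` alone would erase it and the `drop` would be a no-op.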
Hmm -- is there a better way to do this than what I put above? Sorry if I'm missing something obvious.
I just meant this:

```python
if BOUNDS_STR in time_weights.coords:
    time_weights = time_weights.drop(BOUNDS_STR)
ds[TIME_WEIGHTS_STR] = time_weights
```
If I recall correctly, the reason we have this logic is to remove `BOUNDS_STR` if it is a scalar coordinate. I think `squeeze` prevents it from being a dimension at this stage.
Got it. I'll include this in an upcoming PR.
> (b) more sophisticated default grid mappings, which defines a "WRF" standard, a "GFDL" standard, etc and then allows the user to choose from one of these defaults (or makes a smart guess?) could be useful.
@jdossgollin definitely! This is what we want to implement down the road. I believe that has come up somewhere in our Issues in the past; regardless, I think #297 is a useful intermediate step.
Agreed -- this definitely makes it easy enough to implement everything I am trying to do now (or plan to do?).
Hi all, I'm working on the PRs (the xarray contrib guidelines are helpful -- am getting up to speed on that). In the meantime I have an issue working with data from NCEP-NCAR Reanalysis, i.e. the data at http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/ncep.reanalysis.dailyavgs/pressure/air.2015.nc.
The data has a `time_bnds` variable but nothing that looks like `annv` (the name given for the model boundaries in the example in the online documentation).

I'm trying to perform some calculations with this data but am running into some problems, which seem to come from https://github.com/spencerahill/aospy/blob/develop/aospy/utils/times.py#L339-342. In particular, a quick read of the source code suggests that the issue is that `ds[BOUNDS_STR]` doesn't exist. From a search of the source code repo, there doesn't seem to be a place where `ds[BOUNDS_STR]` is created.

If I'm right, that should be a fixable problem -- if not, I'll provide a MWE.
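For what it's worth, the mismatch is easy to see on a synthetic dataset shaped like the NCEP file. `BOUNDS_STR` is a stand-in value here; aospy's actual internal name may differ:

```python
import numpy as np
import xarray as xr

BOUNDS_STR = "bounds"  # stand-in; aospy defines its own internal name

# Shaped like the NCEP-NCAR file: time_bnds exists, but its second
# dimension is "nbnds", so nothing named BOUNDS_STR is present.
ds = xr.Dataset(
    {"time_bnds": (("time", "nbnds"), np.zeros((3, 2)))},
    coords={"time": np.arange(3)},
)
missing = BOUNDS_STR not in ds.variables and BOUNDS_STR not in ds.dims
```

With `missing` being true, any lookup of `ds[BOUNDS_STR]` in downstream code would fail, which matches the error described above.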
Thanks very much for any thoughts!