Open jthielen opened 6 years ago
Is it worth allowing overwriting rather than telling people to do:
my_data.attrs['standard_name'] = 'air_temperature'
I'm not sure...that definitely seems like a good approach for DataArrays, but since (I'd presume) this is most useful on Datasets, that approach would end up like
my_data['temperature_isobaric'].attrs['standard_name'] = 'air_temperature'
my_data['relative_humidity_isobaric'].attrs['standard_name'] = 'relative_humidity'
my_data['geopotential_height_isobaric'].attrs['standard_name'] = 'geopotential_height'
versus something less verbose such as
data.metpy.parse_cf(variables={'temperature_isobaric': 'air_temperature',
                               'relative_humidity_isobaric': 'relative_humidity',
                               'geopotential_height_isobaric': 'geopotential_height'})
I'd prefer the second, but what do you think?
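To make the comparison concrete, here is a minimal sketch of what the dictionary form could do under the hood. Everything named here is an assumption for illustration: fill_standard_names is a hypothetical helper, not an existing MetPy function, and FakeVariable is a stand-in that only mimics the .attrs dictionary of an xarray DataArray so the snippet runs on its own. Note that this version overwrites any standard_name that is already present.

```python
class FakeVariable:
    """Hypothetical stand-in for an xarray DataArray: it only carries .attrs."""

    def __init__(self, **attrs):
        self.attrs = dict(attrs)


def fill_standard_names(variables, names):
    """Set 'standard_name' on each named variable (hypothetical helper).

    `variables` is any mapping of variable name -> object with an `.attrs` dict.
    Returns the names from `names` that were not found in `variables`.
    """
    not_found = []
    for var_name, standard_name in names.items():
        if var_name in variables:
            # Overwrites an existing standard_name, if there is one.
            variables[var_name].attrs['standard_name'] = standard_name
        else:
            not_found.append(var_name)
    return not_found


data = {
    'temperature_isobaric': FakeVariable(units='K'),
    'relative_humidity_isobaric': FakeVariable(units='percent'),
}
not_found = fill_standard_names(data, {
    'temperature_isobaric': 'air_temperature',
    'relative_humidity_isobaric': 'relative_humidity',
    'geopotential_height_isobaric': 'geopotential_height',
})
```

Returning the names that were not found, rather than raising, is just one possible choice; a real implementation might prefer to warn or raise on a misspelled variable name.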
Do you have a current set of data where this feature is necessary?
In regards to the feature of systematic identification itself, it's mostly just the motivating example mentioned above at this point, but I could also see it opening up possibilities for calculations in the future if the user just passed a dataset, and the function could pull out what it needed.
In regards to filling in the standard_name, most sets of data I've been working with would need this, especially since most of the GRIB-converted data coming from THREDDS servers I've used are missing the standard_name attribute (this includes the NARR and Irma GFS examples in staticdata). Also, no surprise, but non-post-processed WRF output seems to lack it as well.
But, based on actually looking into this now and finding how common it is for datasets to be missing the standard_name attribute, would it be necessary to have programmatic ways of identifying the type of variable for this to be practical? Or is a different approach not based on standard_name needed? (If so either way, it seems like something that would take too much effort to be worked on right now.)
Corresponding to #860, it would seem useful to also be able to systematically identify variables from an xarray Dataset. A simple use-case would be something like what motivated this issue, #662, where we want to identify each of the components of the 3D wind field and then do some calculations on those. This also would likely be a prerequisite for #3 (whenever enough pieces are in place for that to be implemented).
An initial approach could be simply searching for the standard_name attribute and strictly adhering to the CF Standard Name list, while giving some option to the user to supply a dictionary to fill standard names where they are missing. However, would there be cases where we don't have a CF standard name for the quantity we want? Or, should there be some kind of automatic processing to fill in for missing standard_name attributes? But, then again, anything too much more flexible/complex would likely become even messier than systematic coordinate identification ended up being.