pydata / xarray

N-D labeled arrays and datasets in Python
https://xarray.dev
Apache License 2.0

can't save wrf Mercator projection attribute to new netcdf file #2252

Closed by morganeoneill 6 years ago

morganeoneill commented 6 years ago

I'm brand new (TODAY) to xarray, so I'm not going to use the right lingo... I want to save a wrf diagnostic variable 'pvo' (https://wrf-python.readthedocs.io/en/latest/diagnostics.html#diagnostic-table) to its own new netcdf file.

dataset = xr.Dataset({'pvo':pvo})
dataset.info()
Coordinates:
    XLONG    float32 -101.73564
    XLAT     float32 1.7164688
    Time     datetime64[ns] 2014-09-15T21:00:00
xarray.Dataset {
dimensions:
    bottom_top = 42 ;
    south_north = 243 ;
    west_east = 378 ;

variables:
    float32 XLONG(south_north, west_east) ;
    float32 XLAT(south_north, west_east) ;
    datetime64[ns] Time() ;
    float32 pvo(bottom_top, south_north, west_east) ;
        pvo:FieldType = 104 ;
        pvo:MemoryOrder = XYZ ;
        pvo:description = potential vorticity ;
        pvo:units = PVU ;
        pvo:stagger =  ;
        pvo:coordinates = XLONG XLAT ;
        pvo:projection = Mercator(stand_lon=-65.0, moad_cen_lat=23.722755432128906, truelat1=30.0, truelat2=60.0, pole_lat=90.0, pole_lon=0.0) ;
        pvo:_FillValue = 9.969209968386869e+36 ;
        pvo:missing_value = 9.969209968386869e+36 ;

Looks great, let's save it (am I doing this right?)

pvo.to_dataset().to_netcdf("analysis_d01_2014-09-15_21h.nc","a")

It breaks and complains about the projection attribute:

Traceback (most recent call last):
  File "matplotwrf.py", line 86, in <module>
    pvo.to_dataset().to_netcdf("analysis_d01_2014-09-15_21h.nc","a")
  File "/software/Anaconda3-5.1.0-el6-x86_64/envs/wrf-python/lib/python3.6/site-packages/xarray/core/dataset.py", line 1150, in to_netcdf
    compute=compute)
  File "/software/Anaconda3-5.1.0-el6-x86_64/envs/wrf-python/lib/python3.6/site-packages/xarray/backends/api.py", line 659, in to_netcdf
    _validate_attrs(dataset)
  File "/software/Anaconda3-5.1.0-el6-x86_64/envs/wrf-python/lib/python3.6/site-packages/xarray/backends/api.py", line 120, in _validate_attrs
    check_attr(k, v)
  File "/software/Anaconda3-5.1.0-el6-x86_64/envs/wrf-python/lib/python3.6/site-packages/xarray/backends/api.py", line 111, in check_attr
    'files'.format(value))
TypeError: Invalid value for attr: Mercator(stand_lon=-65.0, moad_cen_lat=23.722755432128906, truelat1=30.0, truelat2=60.0, pole_lat=90.0, pole_lon=0.0) must be a number string, ndarray or a list/tuple of numbers/strings for serialization to netCDF files

Do I have to chop up, rework or remove the projection attribute? Surely these diagnostic variables are immediately ready to save as complete netcdf variables, what am I doing wrong?

Thank you!

darothen-cc commented 6 years ago

Hi @morganeoneill, welcome to the world of xarray!

It looks like when you create the xarray Dataset, the "projection" attribute on pvo is being set as some sort of object, and the netCDF engine doesn't know how to write it to a file. I'm not familiar with wrf-python, so maybe it's a special map projection object that the package uses internally for some sort of plotting or other function?

Regardless, the easiest solution is just to drop that attribute. You can try something like:

del dataset['pvo'].attrs['projection']

Since your dataset has 2D coordinates for longitude and latitude (XLONG and XLAT), you may not need that projection information for anything. If it's something you definitely need to keep, you could try converting the wrf-python 'Mercator' object into a string or something similar. Strings are totally fine to save as attributes on Datasets or DataArrays, and should serialize to netCDF without any problem.
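If you'd rather keep the projection than delete it, here is a minimal sketch of the "turn it into a string" idea. It works on any dict-like attrs mapping and only needs the standard library. `FakeProjection` and `stringify_attrs` are hypothetical names made up for this illustration (neither is part of xarray or wrf-python), and the list of accepted types only loosely mirrors the error message above.

```python
import numbers


class FakeProjection:
    """Hypothetical stand-in for the wrf-python Mercator object."""

    def __init__(self, **params):
        self.params = params

    def __repr__(self):
        args = ", ".join(f"{k}={v}" for k, v in self.params.items())
        return f"Mercator({args})"


def _is_simple(value):
    # Strings and plain numbers are accepted as attribute values.
    return isinstance(value, (str, bytes, numbers.Number))


def stringify_attrs(attrs):
    """Return a copy of attrs with unsupported values replaced by str(value)."""
    cleaned = {}
    for key, value in attrs.items():
        if _is_simple(value):
            cleaned[key] = value
        elif isinstance(value, (list, tuple)) and all(_is_simple(v) for v in value):
            cleaned[key] = value
        elif hasattr(value, "dtype"):
            # Assumption: anything with a dtype is a NumPy array or scalar.
            cleaned[key] = value
        else:
            # Anything else (e.g. a projection object) becomes its string form.
            cleaned[key] = str(value)
    return cleaned


attrs = {
    "units": "PVU",
    "FieldType": 104,
    "projection": FakeProjection(stand_lon=-65.0, truelat1=30.0),
}
cleaned = stringify_attrs(attrs)
print(cleaned["projection"])
```

With a real DataArray, something like `pvo.attrs = stringify_attrs(pvo.attrs)` before calling `to_netcdf` should be enough, assuming the object's string form is what you want stored. Note that this keeps only the text; the file will not contain a usable projection object when you read it back.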

morganeoneill commented 6 years ago

Excellent, that's so simple! I don't know what I'd need it for, and I certainly don't need it for cursorily zooming around in ncview. Thanks a lot @darothen-cc, you're wicked fast!

fmaussion commented 6 years ago

Note that the XLAT and XLONG coordinates of WRF files are not sufficient to retrieve the WRF projection in the current format of your dataset.

wrf-python or salem need the original dataset attributes to parse the projection information (STAND_LON, TRUELAT1 & 2, etc). wrf-python should probably store its projection as a string rather than an object, though.
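One way to keep that information around is to carry the relevant global attributes over to the new dataset before saving. This is a rough sketch on plain dicts: `copy_projection_attrs` is a hypothetical helper, the values are made-up examples, and `MAP_PROJ` is my assumption about what a typical WRF output file contains (only STAND_LON and TRUELAT1/2 are named above).

```python
# STAND_LON, TRUELAT1 and TRUELAT2 come from the comment above;
# MAP_PROJ is an assumption about typical WRF global attributes.
PROJECTION_KEYS = ("MAP_PROJ", "STAND_LON", "TRUELAT1", "TRUELAT2")


def copy_projection_attrs(source_attrs, target_attrs, keys=PROJECTION_KEYS):
    """Copy the projection attributes found in source_attrs into target_attrs.

    Returns the list of keys that were actually copied.
    """
    copied = []
    for key in keys:
        if key in source_attrs:
            target_attrs[key] = source_attrs[key]
            copied.append(key)
    return copied


# Stand-ins for the global attrs of the original WRF file and the new dataset.
wrf_attrs = {"TITLE": "example output", "STAND_LON": -65.0,
             "TRUELAT1": 30.0, "TRUELAT2": 60.0}
new_attrs = {}
copied = copy_projection_attrs(wrf_attrs, new_attrs)
print(copied)
```

With xarray this would look something like `copy_projection_attrs(wrf_ds.attrs, dataset.attrs)`, where `wrf_ds` is the (hypothetical) original file opened with `xr.open_dataset`. Whether wrf-python or salem then recognise the new file depends on everything else they expect to find, so treat this as a starting point only.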

(more info about WRF projections)