miniufo / xgrads

Parse and read ctl and associated binary file commonly used by GrADS into xarray
https://xgrads.readthedocs.io/
MIT License
71 stars 27 forks source link

merging variables of different shapes will change their shapes #49

Open miniufo opened 4 months ago

miniufo commented 4 months ago

Emily reported this issue. For a dataset from WRF output, its ctl is:

dset ^wrfout_d03.dat
options  byteswapped
undef 1.e30
title  OUTPUT FROM WRF V3.6 MODEL
pdef  204 180 lcc  38.799  121.048  102.500   90.500  60.00000  30.00000  119.00000   5000.000   5000.000
xdef  591 linear  114.51905   0.02252252
ydef  400 linear   34.18147   0.02252252
zdef   19 levels  
1000.00000
 950.00000
 900.00000
 850.00000
 800.00000
 750.00000
 700.00000
 650.00000
 600.00000
 550.00000
 500.00000
 450.00000
 400.00000
 350.00000
 300.00000
 250.00000
 200.00000
 150.00000
 100.00000
tdef   85 linear 00Z02AUG2023      60MN      
VARS   16
T2             1  0  TEMP at 2 M (K)
U10            1  0  U at 10 M (m s-1)
V10            1  0  V at 10 M (m s-1)
TSK            1  0  SURFACE SKIN TEMPERATURE (K)
RAINC          1  0  ACCUMULATED TOTAL CUMULUS PRECIPITATION (mm)
RAINNC         1  0  ACCUMULATED TOTAL GRID SCALE PRECIPITATION (mm)
geopt         19  0  Geopotential (m2/s2)
height        19  0  Model height (km)
tk            19  0  Temperature (K)
tc            19  0  Temperature (C)
td            19  0  Dewpoint Temperature (C)
rh            19  0  Relative Humidity (%)
rh2            1  0  Relative Humidity at 2m (%)
u10m           1  0  Rotated wind component (m s-1)
v10m           1  0  Rotated wind component (m s-1)
slp            1  0  Sea Levelp Pressure (hPa)
ENDVARS

Note here that for some variables, like T2 or slp, they are variables that do not depend on lev. They are single-level variables. However, the output dataset from xgrads is:

<xarray.Dataset>
Dimensions:  (time: 85, lev: 19, y: 180, x: 204)
Coordinates:
  * time     (time) datetime64[ns] 2023-08-02 ... 2023-08-05T12:00:00
  * lev      (lev) float64 100.0 150.0 200.0 250.0 ... 850.0 900.0 950.0 1e+03
  * y        (y) float64 0.0 5e+03 1e+04 1.5e+04 ... 8.85e+05 8.9e+05 8.95e+05
  * x        (x) float64 0.0 5e+03 1e+04 ... 1.005e+06 1.01e+06 1.015e+06
Data variables: (12/16)
    T2       (time, lev, y, x) float32 dask.array<chunksize=(1, 19, 180, 204), meta=np.ndarray>
    U10      (time, lev, y, x) float32 dask.array<chunksize=(1, 19, 180, 204), meta=np.ndarray>
    V10      (time, lev, y, x) float32 dask.array<chunksize=(1, 19, 180, 204), meta=np.ndarray>
    TSK      (time, lev, y, x) float32 dask.array<chunksize=(1, 19, 180, 204), meta=np.ndarray>
    RAINC    (time, lev, y, x) float32 dask.array<chunksize=(1, 19, 180, 204), meta=np.ndarray>
    RAINNC   (time, lev, y, x) float32 dask.array<chunksize=(1, 19, 180, 204), meta=np.ndarray>
      ...
    td       (time, lev, y, x) >f4 dask.array<chunksize=(1, 1, 180, 204), meta=np.ndarray>
    rh       (time, lev, y, x) >f4 dask.array<chunksize=(1, 1, 180, 204), meta=np.ndarray>
    rh2      (time, lev, y, x) float32 dask.array<chunksize=(1, 19, 180, 204), meta=np.ndarray>
    u10m     (time, lev, y, x) float32 dask.array<chunksize=(1, 19, 180, 204), meta=np.ndarray>
    v10m     (time, lev, y, x) float32 dask.array<chunksize=(1, 19, 180, 204), meta=np.ndarray>
    slp      (time, lev, y, x) float32 dask.array<chunksize=(1, 19, 180, 204), meta=np.ndarray>
Attributes:
    comment:  TEMP at 2 M
    storage:  0
    title:    OUTPUT
    undef:    1e+30
    pdef:     lcc

in which it is clear that T2 or slp appears to be a lev-dependent variable (i.e., 19 levels).

This is because its lev code, after the variable name T2 or slp in ctl file, is 1, not 0. According to the convention of ctl file, the format of the variable records is as follows:

varname levs units description

The specific convention is: 1718159261025

So in this case, xgrads treats these T2 or slp variables as lev-dependent variables, and when all the variables are merged, T2 or slp will be broadcast into the same shape as those td or rd (truely lev-dependent variables).

The best way to resolve this issue is to change lev code from 1 to 0. But this is auto-generated by WRF postprecess. So I should say, WRF breaks the rule of ctl convention, not xgrads.

miniufo commented 4 months ago

This is not a problem. One can still use the T2 as a 4D variable. But only the first level (here is 1000 hPa) has the correct data, and other levels are actually all nans.