Open DanielAdriaansen opened 4 years ago
I know this doesn't isn't reusable for other calculations, but if 'P' is a 1D dimension of 'TD', maybe you could do:
-ds.sel(P=slice(pbot, ptop)).integrate('P') * 1 / rho / g
@DanielAdriaansen Thanks for opening. This is definitely something we want to enable now that we have a fairly large set of useful calculations. Not sure on a schedule for that, though...
Agreed, thanks for opening!
precipitable_water
still only works on 1D profiles. The main stopping point is MetPy's get_layer
functionality, which does not support gridded data. This is a prime candidate to be updated to use xarray in a future release, which would also allow things like SRH (https://github.com/Unidata/MetPy/issues/1258) and bulk shear to permit gridded data, where the only thing stopping them from being vectorized is get_layer
. Not that it is directly in the interest of this issue, but CAPE and other parcel profile-dependent calculations, however, would take a bit more work to support gridded data due to the vertical iteration needed (see https://github.com/Unidata/MetPy/issues/480).I guess another way is to use map blocks (must be chunked with dask) and then perform your loop inside so that it's kind of parallelized.
def xr_precipitable_water(ds):
da_prws = []
for lat in ds['lat']:
for lon in ds['lon']:
ds_sub = ds.sel(lat=lat, lon=lon)
ds_sub['prw'] = precipitable_water(
ds_sub['dpt'].values * units('degC'),
ds_sub['level'].values * units('hPa')
).m
da_prw = ds_sub['prw']
da_prws.append(da_prw.expand_dims(['lat', 'lon']))
da_prw = xr.combine_by_coords(da_prws)
return da_prw
ds = ds.chunk({'lat': 37, 'lon': 36})
template = ds.isel(level=0, drop=True)[['air']].rename({'air': 'prw'})
ds_prw = xr.map_blocks(xr_precipitable_water, ds, template=template).load()
Executed in 2 mins 6 seconds (tested on my personal machine with 4 cores)
Compared to the absolute serial case
da_prws = []
for lat in ds['lat']:
for lon in ds['lon']:
ds_sub = ds.sel(lat=lat, lon=lon)
ds_sub['prw'] = precipitable_water(
ds_sub['dpt'].values * units('degC'),
ds_sub['level'].values * units('hPa')
).m
da_prw = ds_sub['prw']
da_prws.append(da_prw.expand_dims(['lat', 'lon']))
da_prw = xr.combine_by_coords(da_prws)
Executed in 2 mins 35 seconds
Entire script:
from metpy.calc import precipitable_water, dewpoint_from_relative_humidity
from metpy.units import units
import xarray as xr
ds = xr.open_mfdataset('*.mon.mean.nc').sortby('lat')
ds['dpt'] = (
('level', 'lat', 'lon'),
dewpoint_from_relative_humidity(
ds['air'].values * units('degC'), ds['rhum'])
)
def xr_precipitable_water(ds):
da_prws = []
for lat in ds['lat']:
for lon in ds['lon']:
ds_sub = ds.sel(lat=lat, lon=lon)
ds_sub['prw'] = precipitable_water(
ds_sub['dpt'].values * units('degC'),
ds_sub['level'].values * units('hPa')
).m
da_prw = ds_sub['prw']
da_prws.append(da_prw.expand_dims(['lat', 'lon']))
da_prw = xr.combine_by_coords(da_prws)
return da_prw
ds = ds.chunk({'lat': 37, 'lon': 36})
template = ds.isel(level=0, drop=True)[['air']].rename({'air': 'prw'})
ds_prw = xr.map_blocks(xr_precipitable_water, ds, template=template).load()
da_prws = []
for lat in ds['lat']:
for lon in ds['lon']:
ds_sub = ds.sel(lat=lat, lon=lon)
ds_sub['prw'] = precipitable_water(
ds_sub['dpt'].values * units('degC'),
ds_sub['level'].values * units('hPa')
).m
da_prw = ds_sub['prw']
da_prws.append(da_prw.expand_dims(['lat', 'lon']))
da_prw = xr.combine_by_coords(da_prws)
I have an Xarray Dataset object and it has 3D variables of atmospheric data stored in it. Essentially, a collection of nx X ny columns of data. I am attempting to compute precipitable water using this 3D grid, but have been unsuccessful doing it any other way besides looping in nx and ny, storing the result from
precipitable_water()
in an np ndarray, and then adding it to my Xarray Dataset object. I tried various flavors of xarrayapply_ufunc()
, but I don't think I am advanced enough of a user to get it to broadcast theprecipitable_water()
function across my Dataset correctly. Or maybe it's not even possible.In MetPy, It would be nice to be able to pass these variables in and get back a 2D nx X ny grid of precipitable_water back into my Dataset, or as a separate DataArray object since the dimensions are reduced from 3D to 2D in this calculation.
Something like:
pw_da = mpcalc.precipitable_water(ds['P'],ds['TD'],bottom=pbot,top=ptop)
where pw_da would be a DataArray object since two Dataset objects were passed into the function.
I made the issue generic, since there may be other indices, or calculations that fit this model in MetPy that I haven't thought about where the user would want to distill 3D data into a 2D field of some sort.
After perusing current issues and PR's, I found #1490 , and this may help me but I am not entirely sure.
I originally posted a question on SO awhile back here: https://stackoverflow.com/questions/61255329/pythonic-way-to-compute-precipitable-water, where @dopplershift alluded to some assumed 1D behavior at least in
precipitable_water()
, so in addition to Xarray-type enhancements there is probably also some 1D to 3D changes needed to achieve this.