Open gutmann opened 5 years ago
Removing `preprocess=preproc` seems to fix this, but of course then it isn't casting to float32 and will likely need more memory. Is there a better way to do that in xarray now?
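For context, the motivation for the cast is memory: float32 takes exactly half the space of float64. A minimal NumPy illustration (the grid size is made up, not taken from the files above):

```python
import numpy as np

# A year of daily data on a hypothetical 180x360 grid, float64 by default
arr64 = np.zeros((365, 180, 360), dtype=np.float64)
arr32 = arr64.astype(np.float32)

print(arr64.nbytes // 2**20, "MiB ->", arr32.nbytes // 2**20, "MiB")
```

So dropping the cast roughly doubles the memory needed once the data is loaded.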
A very simple breaking case:
```python
import numpy as np
import xarray as xr

files = ['/gpfs/fs1/work/jhamman/storylines/storylines_test_data/downscaling/bcsd/cnrm-cm5_rcp45_r1i1p1/conus_c5.cnrm-cm5_rcp45_r1i1p1.daily.pr.1950.nc',
         '/gpfs/fs1/work/jhamman/storylines/storylines_test_data/downscaling/bcsd/cnrm-cm5_rcp45_r1i1p1/conus_c5.cnrm-cm5_rcp45_r1i1p1.daily.pr.1951.nc']

# This works fine
ds = xr.open_mfdataset(files)

def cast_to_float(ds):
    return ds.astype(np.float32)

preproc = cast_to_float

# This breaks
ds = xr.open_mfdataset(files,
                       preprocess=preproc,
                       engine='netcdf4').load()
```
This causes the same broadcast `ValueError` crash.
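Since `preprocess` is applied to each file's entire dataset, casting everything with `astype` also touches the bounds and coordinate variables. One possible workaround (a hypothetical sketch, untested against the files above) is a preprocess function that casts only the floating-point data variables and leaves bounds, integer, and coordinate variables alone:

```python
import numpy as np
import xarray as xr

def cast_float_vars(ds):
    # Hypothetical preprocess: cast only floating-point data variables
    # to float32; bounds/integer variables keep their original dtype.
    for name in list(ds.data_vars):
        if np.issubdtype(ds[name].dtype, np.floating):
            ds[name] = ds[name].astype(np.float32)
    return ds
```

This would then be passed as `preprocess=cast_float_vars` in place of `preproc` above.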
I've tried to run the test case described in the storylines workflow and I get an error I don't understand:
I copied the test case, `storylines_test_data`, over to my work dir and changed the `test_config.yml` working dir from flash to a directory in my scratch space. The key error seems to be: `ValueError in line 67 of /gpfs/fs1/work/gutmann/storylines/chains/downscaling.snakefile: cannot broadcast shape (6, 2) to shape (365, 6, 365)`
Line 67 is:

Looking at the files, all seem to have a primary variable that is (time, lat, lon), and all seem to have the same lat, lon, and bounds dimensions. All `bounds_lat`/`bounds_lon` variables seem to have the correct dimensions. The broadcast error above looks like there is a file with `bounds_latitude(time, latitude, time)`, but I don't see anything like that. Any ideas?
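The error itself is a plain NumPy broadcasting failure: broadcasting aligns shapes from the trailing dimension, and the trailing 2 in (6, 2) is incompatible with the trailing 365 in (365, 6, 365), so any operation that tries to expand a bounds-shaped array against a (time, lat, lon)-shaped one will fail this way. A minimal reproduction of just the broadcast, using the shapes from the error message:

```python
import numpy as np

# Shapes taken from the error message: a (6, 2) bounds-like array
# cannot be broadcast to (365, 6, 365) because the trailing
# dimensions (2 vs 365) do not match.
a = np.zeros((6, 2))
try:
    np.broadcast_to(a, (365, 6, 365))
except ValueError as err:
    print(err)
```

This suggests the problem is which variables end up being aligned together, not a malformed file per se.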