Closed wknoben closed 8 months ago
Tnx for pointing this out.
Yes, I am aware of this, we need at least a 3by3bytime variable to be able to move forward with the current code. Bart faced the same issue for very small cases and the solution was to subset the forcing with margins. I understand that might not be desirable to inflate the source nc file 9 times larger.
I can easily include option 1 for a graceful exit.
option 2 can be accommodated by taking in a native resolution of the gridded data that will help create the single grid without surrounding grids. The code can give a graceful exit while asking for esmr.source_nc_resoluton. In this case, if provided a single grid can be created. I think the creation of a single grid and following the current workflow in the code is a better approach than directly manipulating the remapping file. There might be cases where the target shapefile and the single grid may have no or partial overlap which need to be checked first by intersection.
I should give this more thought on how to handle 1 grid case but I am confident both 1 and 2 should be addressed in version 2.0.0 by end of this month!
Meanwhile, the option to move forward is:
1- buffer for subsetting of gridded data
2- manually creating the one grid shapefile and feed to easymore by source_shp
similar to an irregular case (may face some issues as the dimensions of lat and lon may be different from case 3 in which the lat and lon with have the same dimension name such as n
).
@wknoben if you pass me the target shapefile and the snippet to netcdf file with one grid, I can set up an example and test the changes to the source code. tnx in advance.
Here's the beginning of a function that pads an existing netcdf file with empty values. This means a 1x1xTime domain can be expended to a 3x3xTime domain, after which easymore should work.
def add_empty_grid_cells_around_single_cell_netcdf(file,
grid_spacing=0.25, grid_cells=3,
lat_dim='latitude', lon_dim='longitude', tim_dim='time'):
'''Takes an existing netCDF4 file with latitude and longitude dimensions of size 1, and adds np.nan cells around this'''
ds = xr.open_dataset(small_src_file)
# Get existing latitude and longitude values, and define the extended coordinates
old_lat = ds[lat_dim].values[0]
old_lon = ds[lon_dim].values[0]
new_lat = np.linspace(old_lat - grid_spacing, old_lat + grid_spacing, 3)
new_lon = np.linspace(old_lon - grid_spacing, old_lon + grid_spacing, 3)
# Create a grid_cells-by-grid_cells-by-time set of NaNs
new_data = np.empty( (len(ds[tim_dim]), grid_cells, grid_cells) ) # Note that dim order must match what's used below
new_data[:] = np.nan
# Create a new xarray dataset with the expanded grid and data
new_ds = xr.Dataset( coords={lon_dim: new_lon, lat_dim: new_lat, tim_dim: ds[tim_dim]})
for variable in ds.variables:
if variable not in [lat_dim, lon_dim, tim_dim]:
new_var = xr.DataArray(new_data, dims=(tim_dim, lat_dim, lon_dim), name=variable) # Note that order of dimensions here must match NaNs above
new_ds[variable] = new_var.copy() # This copy() is ESSENTIAL - without it, whenever you update the variable values below, you update these values WHEREVER this new_var is inserted in the dataset, i.e., everywhere
new_ds[variable].attrs = ds[variable].attrs
new_ds[variable].loc[ {tim_dim: ds[tim_dim], lon_dim: ds[lon_dim], lat_dim: ds[lat_dim]} ] = ds[variable][:] # Copy existing values
#assert all(new_ds[variable].isel(latitude=1,longitude=1) == ds[variable]), f'{variable} values not correctly copied' # slow
assert (new_ds[variable].isel(latitude=1,longitude=1) - ds[variable]).sum() == 0, f'{variable} values not correctly copied' # faster
return new_ds
Test case:
ERA5_1966-11_pressure_variables.zip
import numpy as np
import xarray as xr
small_src_file ='ERA5_1966-11_pressure_variables.nc'
ds = xr.open_dataset(small_src_file)
ds_ext = add_empty_grid_cells_around_single_cell_netcdf(small_src_file, grid_spacing=0.25)
(ds_ext.isel(latitude=1,longitude=1) == ds).all() # selects the center lat & lon cells of new dataset, then compares against old
@wknoben Version 2.0.0 currently on the main
branch should have this issue fixed. It should work with netcdf files with one grid, with 1 row or 1 column grids. The version is not yet on PyPI and needs local installation. Virtual env example is here.
Once you confirm, I will close this issue.
Note that loading version 2 is a bit different.
from easymore import Easymore # for version 2 and above
The rest of the code remains the same.
Use case:
User (for whatever reason) tries to remap a netcdf file that only has spatial dimensions of size 1 (i.e., a single latitude and longitude value; 1 grid cell).
Current behaviour:
EASYMORE crashes on calling
nc_remapper()
:Desired behavior:
Options:
self.source_nc
to see if at least one ofself.var_lat
andself.var_lon
is larger than 1. If not, gracefully exit and inform user that for this file remapping is not needed.