Closed cehbrecht closed 3 years ago
@Zeitsperre why do we need a special implementation for this? I would suppose it works on xarray ...
From our experience, it doesn't work right out of the box. You could try implementing a cleaner version without the handling but I think your results will be cut at the median line or the dateline, depending on the GCS of the netCDF data.
We might be able to use the xarray
method roll
. This allows you view the coordinate space from a different frame of reference:
http://xarray.pydata.org/en/stable/generated/xarray.DataArray.roll.html
Hence you could:
If I recall correctly, the reason why we didn't want to use the "roll" approach was because of the high processing load needed when processing large (>250GB) NetCDF files. It made much more sense to slice up the vector file to better suit the shape of the NetCDF than to change all the NetCDF data with roll.
@agstephens I've been implementing this but haven't opened a PR yet as I've spotted a few issues with what I've done and I'm looking into them. Here are the changes I've made https://github.com/roocs/clisops/compare/cross_prime_meridian_fix
The issues are:
https://github.com/roocs/clisops/blob/3c810ba98dbfb87fc0a6869de7810ece068802dc/clisops/utils/dataset_utils.py#L6 I'm not sure how well this will work with the test data as they have only a few longitudes e.g. 0, 250
https://github.com/roocs/clisops/blob/3c810ba98dbfb87fc0a6869de7810ece068802dc/clisops/utils/dataset_utils.py#L27-L29 Here I am checking whether the request is within the bounds of the dataset - however it is very strict and if I ask for a subset (0, 100) but the minimum longitude is 0.01 this tries to roll the dataset. The same issue might arise in the case of subsetting again after a first subset in a workflow.
https://github.com/roocs/clisops/blob/3c810ba98dbfb87fc0a6869de7810ece068802dc/clisops/utils/dataset_utils.py#L10 I'm calculating how to roll to end up with longitude -180 to 180, so I think I need to change this to be more generic.
Here's what I've come up with:
If there's only one longitude - return the dataset as it is and don't try to roll. Don't think this solves the test issue but does look at the possible case of one longitude.
Change the check to something like if (lon_min <= low or np.isclose(low, lon_min, atol=0.5)) and (lon_max >= high or np.isclose(high, lon_max, atol=0.5)):
Instead of rolling the first longitude value to -180, roll it to the lower bound of the requested subset longitude bounds e.g.
low, high = lon_bnds
first_element_value = low diff, offset = calculate_offset(lon, first_element_value) ds_roll = ds.roll(shifts={f"{lon.name}": offset}, roll_coords=False) ds_roll.coords[lon.name] = ds_roll.coords[lon.name] + diff return ds_roll
with
def calculate_offset(lon, first_element_value):
# get resolution of data
res = lon.values[1] - lon.values[0]
# calculate how many degrees to move by to have lon[0] of rolled subset as lower bound of request
diff = first_element_value - lon.values[0]
# work out how many elements to roll by to roll data by 1 degree
index = 1 / res
# calculate the corresponding offset needed to change data by diff
offset = int(diff * index)
return diff, offset
@ellesmith88: I've read through the code and it's looking great!
Here are my responses to your three points:
If there is only one longitude: it's a special case, which could probably be solved by:
roll
here because data array remains the sameChange the check to something like if (lon_min <= low or np.isclose(low, lon_min, atol=0.5)) and (lon_max >= high or np.isclose(high, lon_max, atol=0.5))
:
Instead of rolling the first longitude value to -180, roll it to the lower bound of the requested subset longitude bounds:
Great stuff.
Description
The
core.subset
module is missing the implementation for "crossing 0 degree meridian". https://github.com/roocs/clisops/blob/c493bc57ac46591ff9fe0fae77e58eb48e8e31df/clisops/core/subset.py#L114What I Did