Closed Mikejmnez closed 1 year ago
While at NOC I've rewritten and optimised the code to find sections (I'm pretty sure I eventually wanted to implement it in OceanSpy).
If it's of use, it's available here: https://github.com/JMMP-Group/nordic-seas-validation/blob/main/nsv/section_finder.py
It has been used on NEMO data only, so you might have to change a few things to make it work with MITgcm output (e.g., naming convention for Arakawa grids).
Actually, I realised you don't have access to that repo. I'll share the code next week!
Thanks @malmans2 ! I have also been writing optimized code following your suggestion some time ago to look at xoak
, and coincidentally the part that I needed to work on to improve it was this circle path option.
Looking forward to work on implementing your changes on oceanspy
.
Here is the code:
from dataclasses import dataclass
import cf_xarray # noqa: F401
import numpy as np
import xoak # noqa: F401
from xarray import DataArray, Dataset
@dataclass
class SectionFinder:
"""
Args:
ds_domain (Dataset): domain_cfg dataset
"""
ds_domain: Dataset
def __post_init__(self):
self._grids = {}
@property
def grids(self) -> dict:
"""Dictionary mapping each grid to a dataset with its coordinates"""
if not self._grids:
for grid in ("u", "v", "t", "f"):
ds = Dataset(
coords={
"lon": self.ds_domain[f"glam{grid}"],
"lat": self.ds_domain[f"gphi{grid}"],
}
)
ds = ds.squeeze(drop=True)
for key, value in ds.sizes.items():
ds[f"{key}_index"] = DataArray(range(value), dims=key)
self._grids[grid] = ds.cf.guess_coord_axis()
return self._grids
def nearest_neighbor(self, lons, lats, grid: str) -> Dataset:
"""
Given the coordinates defining a section, find the nearest points
on a model grid.
Args:
lons (1D array-like): Longitudes defining a section
lats (1D array-like): Latitudes defining a section
grid (string): Model grid `{"u", "v", "t", "f"}`
Returns:
Dataset: Dataset with model coordinates and indexes
"""
if not self.grids[grid].xoak.index:
self._grids[grid].xoak.set_index(("lat", "lon"), "sklearn_geo_balltree")
return self.grids[grid].xoak.sel(lat=DataArray(lats), lon=DataArray(lons))
nearest_neighbour = nearest_neighbor
def zigzag_section(self, lons, lats, grid: str) -> Dataset:
"""
Given the coordinates defining a section, find the correspoinding zigzag section
on a model grid.
Args:
lons (1D array-like): Longitudes defining a section
lats (1D array-like): Latitudes defining a section
grid (string): Model grid `{"u", "v", "t", "f"}`
Returns:
Dataset: Dataset with model coordinates and indexes
"""
def diff_and_inds_where_insert(ix, iy):
dx, dy = (np.diff(ii) for ii in (ix, iy))
inds = np.argwhere(np.abs(dx) + np.abs(dy) > 1).squeeze()
return dx, dy, inds
# Extract indexes
ds = self.nearest_neighbor(lons, lats, grid)
ix, iy = (ds[f"{i}_index"].data for i in ("x", "y"))
# Remove duplicates
mask = np.abs(np.diff(ix)) + np.abs(np.diff(iy)) == 0
ix, iy = (np.delete(ii, np.argwhere(mask)) for ii in (ix, iy))
# Initialize variables
dx, dy, inds = diff_and_inds_where_insert(ix, iy)
# Stop when inds is empty
while inds.size:
# Extract diffs
dx, dy = (di[inds] for di in (dx, dy))
# Insert new values
# Single diagonal step: move along y direction first
mask = np.abs(dx * dy) == 1
ix = np.insert(ix, inds + 1, ix[inds] + (dx / 2).astype(int))
iy = np.insert(
iy, inds + 1, iy[inds] + np.where(mask, dy, (dy / 2).astype(int))
)
# Prepare for next iteration
dx, dy, inds = diff_and_inds_where_insert(ix, iy)
# Find new lat/lon
ds = self.grids[grid].isel(
x=DataArray(ix, dims=ds.dims), y=DataArray(iy, dims=ds.dims)
)
return self.nearest_neighbor(ds["lon"], ds["lat"], grid)
def velocity_points_along_zigzag_section(self, lons, lats) -> dict:
"""
Given the coordinates defining a section, find the corrisponding velocity points
along a zigzag section (f-grid). Useful to compute accurate volume fluxes.
Args:
lons (1D array-like): Longitudes defining a section
lats (1D array-like): Latitudes defining a section
Returns:
dict: Dictionary mapping u/v grids to their coordinates and indexes.
"""
# ZigZag path along f
ds = self.zigzag_section(lons, lats, "f")
# Find dimension name
dim = list(ds.dims)[0]
ds[dim] = ds[dim]
# Compute diff and find max index of each pair
ds_diff = ds.diff(dim)
ds_roll = ds.rolling({dim: 2}).max().dropna(dim)
# Fill dictionary
ds_dict = {}
for grid in ("u", "v"):
# Apply mask
# Meridional: u - Zonal: v
ds = ds_roll.where(
ds_diff[f"{'y' if grid == 'u' else 'x'}_index"], drop=True
)
da_dim = ds[dim]
if not ds.sizes[dim]:
# Empty: either zonal or meridional
continue
# Find new lat/lon
ds = self.grids[grid].isel(
x=DataArray(ds["x_index"].astype(int), dims=dim),
y=DataArray(ds["y_index"].astype(int), dims=dim),
)
ds = self.nearest_neighbor(ds["lon"], ds["lat"], grid)
# Assign coordinate - useful to concatenate u and v after extraction
ds_dict[grid] = ds.assign_coords({dim: da_dim - 1})
return ds_dict
Example usage:
ds_domain = xr.open_mfdataset("domain_cfg.nc")
ds_tgrid = xr.open_mfdataset("output_grid_T.nc")
finder = SectionFinder(ds_domain)
stations = finder.nearest_neighbor(
lons=(lon0, lon1, lon2, ...),
lats=(lat0, lat1, lat2, ...),
grid="t"
)
section = ds_tgrid.isel({dim: stations[f"{dim}_index"] for dim in ("x", "y")})
A few comments:
xarray>2022.03.0
, I believe because of xarray's indexing refactor. It should be an easy fix though.domain_cfg.nc
to initialise the finder, which is NEMO's file that describes the Arakawa grids. Then you have a few functions to get the indexes of the section. zig_zag
functions are similar to OceanSpy's concept of mooring.Great, I'll start working on this later in the week.
closed by #354
Description
@ThomasHaine tried to compute a mooring section on LLC4320 using Kogur lats and lons from the documentation, and ran into the following error:
where
cut_od
is the cutout dataset using the same lats and lons. The data lives entirely onface=2
so there is actually no transformation happening prior to the mooring calculation.I looked into this and found where the error takes place.
subsample.mooring
computes and finds iteratively the locations of the mooring section, creating a continuous line passing through Center points. At each iteration, given two lats and lons on the grid, oceanspy generates an intermediate value withoceanspy.utils.great_circle_path
(when using spherical coordinates) and proceeds to find its nearest neighbor. Then repeats this procedure until there is no gaps and thus the section is continuous.The issue happens in the specific example below, in which oceanspy needs to fill the gaps inbetween the (lat0, lon0) shown as a blue dot, and (lat1, lon1) shown as red dot. These two were computed iteratively by oceanspy using the Kogur section. But the next iteration fails.
From the figure, there are clearly two additional intermediate grid points to be found between the two given coordinates points. Oceanspy finds intermediate points on a sphere with the following code:
The issue is that
ospy.utils.great_circle_path
fails to find any intermediate value. This is, boththis_Y
andthis_X
are empty. Hence the error above.The relevant code is in
subsample.py
lines 745-757.Note that
dist=1.8410331034092235
is arbitrarily divided by a factor of 2.1. In the source codesubsample.py
lines 756 there is a comment that states thatDivide dist by 2.1 to make sure that returns 3 points
.I checked, and
dxC
anddyC
are about half of the value ofdist
. I tried dividingdist
by factors of 10 and larger, but that didn't solve the problem.I haven't tested this example with ECCO. I am not sure if this bug can happen in ECCO and therefore in all LLC-grids, or if its a matter of resolution + grid.
If the issue is related to proximity between coordinate (end) points + complex geometry, then we can probably get the inbetween grid point by moving from
(lat1, lon1)
towards(lat0, lon0)
one grid point at a time. Since we know their location in index space.Something like