jbusecke commented 2 years ago

As part of #408 I have been thinking a lot about face connections and the associated padding of arrays.

Here, I would like to propose two things:

1) An expansion of xgcm's capabilities to apply any operation on both orthogonal and tangential vector components, and include reverse face connections. 2) A simplified API which makes the use of helper methods like interp_2d_vector obsolete.

1) Expanded padding functionality

Introduction

In the following I will assume that a dataset with face connections is given by an array of 2+ dimensional data that is stacked along an additional face dimension. Vector components on each face indicate the axis directionality of a quantity on each face.

To operate on such a dataset we need to enable 'exchanges' between faces, which are given by face connections. In practice this means we have to pad various sides of each face with the appropriate data from another face. Operations can then be applied to each face, and after trimming the padding result in the same shape output, but the operations respect the connectivity of the faces.

Methodology

In practice, this means that we iterate through each face (target), and then again iterate through all target faces, identify the data used for padding (source), modify this and concatenate it with the target.

As I pointed out previously for tracer fields, there are only 4 general cases that need to be considered:

For scalar data we only need to worry about flipping the source data appropriately before padding. We either flip the data along the axis of concatenation (orthogonal flip) or along the 'other' axis (tangential flip).

Considerations for vector components

This logic is however not enough to accommodate vector data. For vectors data, two additional things need to be taken into account. If the connection is switching axes, the vector components need to be exchanged (e.g. pad values of v of one face to the u data of another). Secondly, in certain cases the sign of the components needs to be changed. I think I have captured the relevant cases for orthogonal and tangential vector components with the blue arrows in this picture:

Blank diagram-5

This gives a clear pattern: The sign of tangential vectors has to be changed if a tangential flip is performed, and the sign of orthogonal vectors has to be changed if an orthogonal flip is performed.

If this actually incorporates all possible scenarios I can integrate this relatively easily in my existing padding logic, if I can infer these things in there: a) Is the dataarray passed to the padding a vector component b) Is the vector component orthogonal/tangential c) What is the vector partner for the input

We have most of these informations when we go through the helper grid methods like Grid.interp_2d_vector, but I am not entirely happy with this API design. Since we are refactoring a lot of the internals I would suggest a reconsideration of the API.

2) API changes

Ultimately I want users to be able to do something like grid.interp(u) for a dataset with complex grid topology. For that we need to store/provide the above information somewhere. I see a couple of options here.

Storing all information in the grid object
```
grid = Grid(..., vector_components = {'X':[u, u_trans, dic_x_trans], 'Y':[v, v_trans, dic_y_trans]},...)
grid.interp(u)
```
This would provide the necessary information at the point of grid creation, but might lead to issues if the user changes names of data_arrays.

Providing all information to the grid method

grid.interp(u, vector_axis='X', vector_partner={'Y':v})

Adapt the signature of current helper functions for the grid methods
```
u_interp,v_interp = grid.interp({'X':u,'Y':v})
```

Ultimately we might choose to enable all of these, but I wanted to gather some feedback from everyone here. I am also curious if there are CF conventions that might help us here (cc @dcherian).

In general it would be very valuable to hear from folks if you think that I have adequately captured all possible cases here (cc @rabernat) and what you all think about possible API changes.

rabernat commented 2 years ago

Julius I just read through this carefully. Amazing work! I really do think you have captured all of the important cases. The tangential vector exchange in particularly is not currently supported (see #117), so this refactor also provides an opportunity to improve our functionality significantly.

I am strongly opposed to "Storing all information in the grid object". This really breaks the current abstractions of xgcm.

Instead, we need to have methods that accept and return vectors. I think the type Dict[str, xr.Datarray], where the keys are axis names, is sufficient. This is what is currently implemented. Alternatively, we could use an xr.Dataset.

TomNicholas commented 2 years ago

I think the type Dict[str, xr.Datarray], where the keys are axis names, is sufficient. This is what is currently implemented. Alternatively, we could use an xr.Dataset.

As they both xr.Dataset and Dict[str, xr.Datarray] ultimately follow the Mapping Protocol, we might be able to write code that accepts either.

TomNicholas commented 2 years ago

Instead, we need to have methods that accept and return vectors.

@rabernat do you mean "we need to have additional methods that only accept and return vectors (e.g. interp_vector)", or "we need to upgrade our standard methods (e.g. interp) to also accept and return vectors". I think the API suggestions above were motivated by wanting to have the latter.

rabernat commented 2 years ago

The latter

jbusecke commented 2 years ago

Thanks for the feedback @rabernat. I think I do see your point here:

I am strongly opposed to "Storing all information in the grid object". This really breaks the current abstractions of xgcm.

I think I will focus on making option 3 work then (but returning a dict not a tuple, not sure why I wrote the returns that way🤷‍♂️).

jbusecke commented 2 years ago

Actually I am still curious if any of the conventions like sgrid/CF provide the needed information for vector fields in the metadata. If that is the case, I would be very motivated to build an API that could potentially accept a single dataarray and infer the needed information from such metadata. This would lean more into the direction of option 2 above.

jbusecke commented 2 years ago

One more understanding question for @rabernat. We currently only allow to move vector components to 'center'. Should this remain restricted? From the understanding above, I do not see why it would be a problem to have both vector components defined e.g. on a tracer point. Am I missing something?

dcherian commented 2 years ago

I didn't figure out how to decode sgrid but I think it mostly talks about relative positions of points (https://sgrid.github.io/sgrid/)

jbusecke commented 2 years ago

Yeah I looked through it and did not find anything either...

TomNicholas commented 2 years ago

@jbusecke and I were just chatting and decided that we like the idea of using an API like this for acting on vectors

grid.interp({"X", da_u}, axis="X", other_component={"Y", da_v}),

where the Dict[str, DataArray] type indicates "this quantity is a single component of a vector, and its the component pointing parallel to axis [str]".

This API meets Julius' requirements:

a) Is the dataarray passed to the padding a vector component b) Is the vector component orthogonal/tangential c) What is the vector partner for the input

We know (a) because we check if the input is a Dict or not, we know (b) from looking at the axis str + the information in the grid, and we pass (c) explicitly.

This API also neatly differentiates from the case of interpolating a scalar variable, meaning that we can dispatch internally, and would be able to use the same grid.interp method for both cases, instead of grid.interp_scalar and grid.interp_vector. (The other_component argument would just not be valid if the user passed a scalar.)

(This syntax also has the pleasing property that it could even work for ND vectors, even though I know it's highly unlikely we ever need to implement face connection logic for that.)

I was wondering if we should go even further and define a simple Vector dataclass which means the syntax would be

grid.interp(Vector(da_u, "X"), axis="X", other_component=Vector(da_v, "Y"))?

This is more explicit, but I think is overkill unless there was a need for methods we defined on a Vector class (e.g. .project)?

TomNicholas commented 2 years ago

This API would also need to be propagated down to grid.pad, because we will need the option to pad variable a using a piece from variable b. This doesn't have to be public API though.

Another question is whether or not we want to support Dataset as a valid argument type to these functions (and to apply_as_grid_ufunc in general). xarry.apply_ufunc does accept Datasets, but I think all it does it map the ufunc over every data variable. If we did want to support passing Datasets then we would need to be quite careful with our type checking.

(We decided that supporting interp-ing multiple components of a vector at once was probably not really needed, as you could always just do your operation in two lines component-by-component.)

jbusecke commented 2 years ago

Hey folks,

I just wanted to document some of the things possible while implementing this feature.

I am using ECCO as an example here:

from intake import open_catalog
import xarray as xr
import numpy as np

cat = open_catalog("https://raw.githubusercontent.com/pangeo-data/pangeo-datastore/master/intake-catalogs/ocean.yaml")
ds = cat.ECCOv4r3.to_dask()

# just surface and one time step
ds = ds.isel(k=0, time=0)

and set up the grid with proper face connections

face_connections = {'face':
                    {0: {'X':  ((12, 'Y', False), (3, 'X', False)),
                         'Y':  (None,             (1, 'Y', False))},
                     1: {'X':  ((11, 'Y', False), (4, 'X', False)),
                         'Y':  ((0, 'Y', False),  (2, 'Y', False))},
                     2: {'X':  ((10, 'Y', False), (5, 'X', False)),
                         'Y':  ((1, 'Y', False),  (6, 'X', False))},
                     3: {'X':  ((0, 'X', False),  (9, 'Y', False)),
                         'Y':  (None,             (4, 'Y', False))},
                     4: {'X':  ((1, 'X', False),  (8, 'Y', False)),
                         'Y':  ((3, 'Y', False),  (5, 'Y', False))},
                     5: {'X':  ((2, 'X', False),  (7, 'Y', False)),
                         'Y':  ((4, 'Y', False),  (6, 'Y', False))},
                     6: {'X':  ((2, 'Y', False),  (7, 'X', False)),
                         'Y':  ((5, 'Y', False),  (10, 'X', False))},
                     7: {'X':  ((6, 'X', False),  (8, 'X', False)),
                         'Y':  ((5, 'X', False),  (10, 'Y', False))},
                     8: {'X':  ((7, 'X', False),  (9, 'X', False)),
                         'Y':  ((4, 'X', False),  (11, 'Y', False))},
                     9: {'X':  ((8, 'X', False),  None),
                         'Y':  ((3, 'X', False),  (12, 'Y', False))},
                     10: {'X': ((6, 'Y', False),  (11, 'X', False)),
                          'Y': ((7, 'Y', False),  (2, 'X', False))},
                     11: {'X': ((10, 'X', False), (12, 'X', False)),
                          'Y': ((8, 'Y', False),  (1, 'X', False))},
                     12: {'X': ((11, 'X', False), None),
                          'Y': ((9, 'Y', False),  (0, 'X', False))}}}
grid = Grid(
    ds, 
    coords={'X':{'center':'i', 'left':'i_g'}, 'Y':{'center':'j','left':'j_g'}},
    face_connections=face_connections
)

we can now use the internal padding logic to pad the array. Lets start with a tracer field

t_padded = pad(
    ds.THETA,
    grid,
    {'X':(20,20), 'Y':(20,20)},
    boundary='fill',
)
t_padded.plot(
    col='face',
    col_wrap=4,
    vmax=20,
    vmin=-2,
)

This works like before, but you are able to pad with more than just one cell (here with 20 on each side!).

But that is not so interesting, lets move on to the padding of vector components.

First, lets try to treat the vector component as a tracer (which is overly naive)

u = ds.UVELMASS
v = ds.VVELMASS
u_padded_naive = pad(
    u,
    grid,
    {'X':(10,10),'Y':(10,10)},
    boundary='fill'
)
u_padded_naive.plot(
    col='face',
    col_wrap=4,
    robust=True
)

Well this (expectedly) looks wrong. You can see a clear disconnect on several faces (where the wrong component of velocity is used for padding).

Lets use our new vector component API

u = ds.UVELMASS
v = ds.VVELMASS

u = u.reset_coords(drop=True).drop(['j', 'i_g'])
v = v.reset_coords(drop=True).drop(['j_g', 'i'])

v_padded = pad(
    {'Y':v},
    grid,
    {'X':(10,10),'Y':(10,10)},
    other_component={'X':u},
    boundary='fill',
)
v_padded.plot(
    col='face',
    col_wrap=4,
    robust=True
)

This looks pretty good! Each face looks like its padded with visually resonable values on each side.

Couple of things remain here:

[ ] The corners are not filled with actual values, but according to the boundary parameter. This will prevent any stencil operation that does not look like a 'cross' to return bad values in the corner. I think we can live with that for now, but this might be an issue for more adanced grid ufuncs later.
[ ] There is still some things in the current logic that need to be ironed out.
- [ ] few failing tests
- [ ] Need a better handling(resetting) of coordinates (see my example above).
- [ ] Calling pad results in a bunch of /srv/conda/envs/notebook/lib/python3.9/site-packages/dask/array/core.py:450: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison o = func(*args, **kwargs) warnings. I am not sure where these come from yet, and if I should be worried?
[ ] This currently only works when calling the pad function, and we need to enable passing dict objects (+additional keyword args) to the grid methods. I think this might be something that we should do before merging this here?

rabernat commented 2 years ago

Great work Julius!

The corners are not filled with actual values, but according to the boundary parameter.

That's a fine compromise for now. Hypothetically it is possible to fill these by finding the neighboring face along the diagonal corner direction, but I don't think that's a high priority. This is already so much better than what we have.

I am not sure where these come from yet, and if I should be worried?

No idea about this warning.

This currently only works when calling the pad function, and we need to enable passing dict objects (+additional keyword args) to the grid methods

The alternative, closer to what we have today in xgcm, is to have separate methods for vectors, as we do today, e.g. diff_2d_vector and interp_2d_vector. I would kind of prefer to keep that API. Do you think it is possible?

jbusecke commented 2 years ago

The alternative, closer to what we have today in xgcm, is to have separate methods for vectors, as we do today, e.g. diff_2d_vector and interp_2d_vector. I would kind of prefer to keep that API. Do you think it is possible?

Ah ok good comment. So there are a few levels here:

The above API is the most general. You can do pretty much 'whatever you want' with a single vector component (as long as you provide the 'other_component'). I like this as the base, since it does not make any assumptions or preselects any cases.

I do however see the appeal in being able to pass a full vector, and get a full vector back (like in diff_2d_vector etc). If we decide to keep diff_2d_vector and interp_2d_vector, it should be trivial to just call the base API from within these functions. In fact that is what I have planned to implement here for full backwards compatibility! We did however already announce that these would be deprecated. But I guess we could also keep them!

A third (harder) possibility would be a way to pass a 'full' vector to the grid methods:

grid.diff({'X':u, 'Y':v}, ...)

and have this return a two element dict. HOWEVER, this gets complicated when we think about the 'axis' along which diff is executed. The ..._2d_ methods make some assumptions (u,v on a c grid, differences/interps taken on x for u, and y for v), but encoding something like that seems hard.

I guess we could do something like

axis={'X':'X', 'Y':'Y'}
grid.diff({'X':u, 'Y':v}, axis)

This would give the user the ability to e.g. compute vorticity components by passing axis={'X':'Y', 'Y':'X'}, but this seems really convoluted to me, especially given the alternative of writing

du_dy = grid.diff({'X':u}, 'Y', other_component={'Y',v})
dv_dx = grid.diff({'Y':v}, 'X', other_component={'X',u})

That is not that much more verbose, but more readable IMO.

Thoughts on whether I am overlooking something, or if we should keep the ..._2d_... methods after all?

Bottom line: I am very convinced that I want the vector padding to work on the most basic (e.g. grid.diff/interp) level, and I think the API for that is ok. On the higher level implementation I am very open to what we should do.

jbusecke commented 2 years ago

cc @TomNicholas

TomNicholas commented 2 years ago

@rabernat when you say this

The alternative, closer to what we have today in xgcm, is to have separate methods for vectors, as we do today, e.g. diff_2d_vector and interp_2d_vector. I would kind of prefer to keep that API. Do you think it is possible?

Are you asking to keep the *_2d_vector method API for backwards compatibility reasons? Because keeping that old API seems inconsistent with your earlier comment here.

I personally think that this API is neat, general enough, and distinct enough between scalars and vectors:

du_dy = grid.diff({'X':u}, 'Y', other_component={'Y',v})
dv_dx = grid.diff({'Y':v}, 'X', other_component={'X',u})

But I want to be clear what the argument is for using different choices of API.

rabernat commented 2 years ago

Yeah I have clearly been very inconsistent in my comments. Thanks for catching this!

I am fine with what I said before--deprecating those methods. We will just need extra logic to try to detect vector inputs in the remaining methods.

Sorry for the confusion.

jbusecke commented 2 years ago

Thanks for clarifying @rabernat. I talked to @TomNicholas yesterday and I think we both think that the sort of API you implemented in the ..._2d_vector_... functions should be put into implementations of higher order vector calculus operators like #187. For those it would make absolute sense to have an API like

div = grid.div({'X':u, 'Y':v, 'Z':w}, ...)

or

{'X':grad_x, 'Y':grad_y} = grid.grad(da, ['X','Y'])

The internal logic here would then go through one of several custom grid_ufuncs to make it more efficient!

jbusecke commented 2 years ago

Awesome. Now that #459 is merged, all I need to add is some docs on how to work with vector input. Should that be a root section in the docs or a subsection?

TomNicholas commented 2 years ago

I think make a new page for vectors.

jbusecke commented 2 years ago

👍

xgcm / xgcm

Extended functionality for vector data with complex grid topologies #410

1) Expanded padding functionality

Introduction

Methodology

Considerations for vector components

2) API changes