ocean-eddy-cpt / gcm-filters

Diffusion-based Spatial Filtering of Gridded Data
https://gcm-filters.readthedocs.io/
Other
37 stars 24 forks source link

Tutorial on vector laplacian #74

Closed NoraLoose closed 3 years ago

NoraLoose commented 3 years ago

This adds a tutorial on how to do viscosity-based filtering with a vector Laplacian, picking up #49.

Note: The tutorial reveals the full pain that is caused by the fact that our filters are not able to handle different dimensions on a staggered grid. See the tutorial subsections Removing padding and swapping dimensions and Swapping back dimensions re-adding the padding (which take up most of the tutorial). Do we want to put the user through this pain, or do we want to integrate gcm-filters with xgcm sooner rather than later? I know what we decided at our last zoom meeting, but looking at this tutorial may change people's opinion. Happy to hear feedback. Maybe @jbusecke and @rabernat have opinions?

review-notebook-app[bot] commented 3 years ago

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

codecov-commenter commented 3 years ago

Codecov Report

Merging #74 (42dec0d) into master (1365cbe) will not change coverage. The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master      #74   +/-   ##
=======================================
  Coverage   98.47%   98.47%           
=======================================
  Files           7        7           
  Lines         719      719           
=======================================
  Hits          708      708           
  Misses         11       11           
Flag Coverage Δ
unittests 98.47% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.


Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 1365cbe...42dec0d. Read the comment docs.

hmkhatri commented 3 years ago

@NoraLoose Thanks for the tutorial.

dxT , dyT, dxBu, dyBu are generally provided as a part of the model grid. Most of the users would not require to compute these arrays. Also, in NW2 output, number of grid points are different on q and t points, len(xq) = len(xh) + 1 and len(yq) = len(yh) + 1. For many model outputs, the number of grid points is the same on q and t points. So, padding removal would not be required.

For swapping dimensions, is it possible to handle this operation inside gcm-filters for now? The names of the coordinates can be supplied as a list while calling the apply_to_vector. Renaming and swapping can be performed internally, so the users would not have to deal with it.

NoraLoose commented 3 years ago

@hmkhatri thanks for your comments!

Also, in NW2 output, number of grid points are different on q and t points, len(xq) = len(xh) + 1 and len(yq) = len(yh) + 1. For many model outputs, the number of grid points is the same on q and t points. So, padding removal would not be required.

I agree that if len(xq) = len(xh) and len(yq) = len(yh), the situation becomes quite a bit easier (since we don't have to deal with removing/adding padding). Are you suggesting to get rid off the complicated regional NeverWorld2 example, and instead work with global MOM6 data, where len(xq) = len(xh) and len(yq) = len(yh) are satisfied?

Integration with xgcm would be nice because xgcm understands the differences between the two cases we are discussing (positions outer vs. right).

For swapping dimensions, is it possible to handle this operation inside gcm-filters for now?

As it is implemented now, this is not handled inside gcm-filters. All inputs need to have the same dimensions:

https://github.com/ocean-eddy-cpt/gcm-filters/blob/ca4795055c119d857f628ef3f2ba46fc1654785e/gcm_filters/filter.py#L410-L411

Do you see a straightforward way to change this?

hmkhatri commented 3 years ago

@NoraLoose Yes, I suggest that it would be better to avoid NW2 example and just use a global output that has len(xq) = len(xh) and len(yq) = len(yh), which is the norm for most of the climate outputs.

Even in MOM6, we generally can ignore the additional columns used for padding in xq, yq, see here.

Regarding the second point, I agree having xgcm type handling of coordinates would be ideal. Given that, we just need to change the coordinate names to xh, yh in all grid distance variables for gcm-filters to work, it is easier to perform this operation within gcm-filters before any filtering computations are performed.

For example, you used the following in the example notebook, dxCu = xr.DataArray(ds_static.dxCu.isel(xq=slice(1,None)), coords={'yh':ds.yh,'xh':ds.xh}, dims=('yh','xh'))

This operation only changes the coordinates and it could be done inside gcm-filters without a need for users to do this.

NoraLoose commented 3 years ago

Yes, I suggest that it would be better to avoid NW2 example and just use a global output that has len(xq) = len(xh) and len(yq) = len(yh), which is the norm for most of the climate outputs. Even in MOM6, we generally can ignore the additional columns used for padding in xq, yq, see here.

I like this idea because it would make for a much cleaner tutorial. I'm working on a global example with non-symmetric output (len(xq) = len(xh)).

It would still be nice to provide a quick user guide on what to do if you have symmetric output (len(xq) = len(xh) + 1) like NW2 output, maybe at the end of the tutorial. @hmkhatri would you be interested in working together on such a tutorial subsection? In the link you sent, you recommend the user to write MOM6 output in non-symmetric mode in the first place, correct? Maybe we can come up with some easy offline alternatives (which can operate lazily).

This operation only changes the coordinates and it could be done inside gcm-filters without a need for users to do this.

Could be an option. Should we open a separate issue to discuss this?

hmkhatri commented 3 years ago

I like this idea because it would make for a much cleaner tutorial. I'm working on a global example with non-symmetric output (len(xq) = len(xh)).

It would still be nice to provide a quick user guide on what to do if you have symmetric output (len(xq) = len(xh) + 1) like NW2 output, maybe at the end of the tutorial. @hmkhatri would you be interested in working together on such a tutorial subsection? In the link you sent, you recommend the user to write MOM6 output in non-symmetric mode in the first place, correct? Maybe we can come up with some easy offline alternatives (which can operate lazily).

sure, I can contribute to the tutorial on how to handle output on the symmetric grid.

This operation only changes the coordinates and it could be done inside gcm-filters without a need for users to do this.

Could be an option. Should we open a separate issue to discuss this?

I guess opening a separate issue would be useful.

NoraLoose commented 3 years ago

The tutorial now uses non-symmetric MOM6 data from a global 1/10 degree simulation. (Thanks @gustavo-marques for providing the model output!) Working with non-symmetric model output massively simplifies the tutorial.

@hmkhatri, would you like to review the updated version? At the end, I also started a section "Dealing with symmetric model output". Do you have ideas for a user guide in such a case (e.g., for NeverWorld2 output)?

review-notebook-app[bot] commented 3 years ago

View / edit / reply to this conversation on ReviewNB

hmkhatri commented on 2021-06-02T20:51:16Z ----------------------------------------------------------------

It would nice to mention that the swapping is only done because gcm-filters cannot distinguish between lat, lons at different grid positions. The swapping does not mean that data is actually moved to xh, yh grid points.


review-notebook-app[bot] commented 3 years ago

View / edit / reply to this conversation on ReviewNB

hmkhatri commented on 2021-06-02T20:51:17Z ----------------------------------------------------------------

It is easy to convert symmetric data to non-symmetric data. We can use the following method and use the same approach as shown in the filtering examples above.

ds = ds.isel(xq = slice(1,241), yq=slice(1,561))

ds_static = ds_static.isel(xq = slice(1,241), yq=slice(1,561))


hmkhatri commented 3 years ago

@NoraLoose The tutorial looks good to me. I have made small suggestions (see comments above).

NoraLoose commented 3 years ago

I have implemented your suggestions, @hmkhatri. Thanks so much for your comments - they helped improving the tutorial a lot.

Anyone else wants to review this tutorial?

rabernat commented 3 years ago

I will give it a look now.

review-notebook-app[bot] commented 3 years ago

View / edit / reply to this conversation on ReviewNB

rabernat commented on 2021-06-03T21:05:07Z ----------------------------------------------------------------

I would move this comment above the code cell. It's an important but subtle point.


review-notebook-app[bot] commented 3 years ago

View / edit / reply to this conversation on ReviewNB

rabernat commented on 2021-06-03T21:05:07Z ----------------------------------------------------------------

Could we just use scalars? Does it require an array?


NoraLoose commented on 2021-06-03T21:49:24Z ----------------------------------------------------------------

As implemented now, the vector Laplacian requires arrays for kappa_iso and kappa_aniso.

https://github.com/ocean-eddy-cpt/gcm-filters/blob/ca4795055c119d857f628ef3f2ba46fc1654785e/gcm_filters/kernels.py#L351-L352

This implementation follows the way kappa_w and kappa_s are handled in the IRREGULAR_WITH_LAND (PR #28, still open). In https://github.com/ocean-eddy-cpt/gcm-filters/pull/28#issuecomment-839332923, it is also discussed to make kappa_w and kappa_s optional arguments, with default 1. We could do the same here for kappa_iso and kappa_aniso.