JiaweiZhuang / xESMF

Universal Regridder for Geospatial Data
http://xesmf.readthedocs.io/
MIT License
269 stars 49 forks source link

Conservative regridding of DataArray, N+1 dim issue #14

Closed NicWayand closed 6 years ago

NicWayand commented 6 years ago

Is there a temporary workaround to regridding DataArrays using the conservative method that avoids the issue raised here (https://github.com/pydata/xarray/issues/1475) of DataArrays not being able to store N+1 lat_b/lon_b variables? Drop down to numpy arrays?

JiaweiZhuang commented 6 years ago

You can use either numpy arrays or xarray.Dataset. A Dataset (not DataArray) can hold variables with different dimensions. There is an example in my preliminary cubedsphere package. In that example, you can use ds.isel(tile=i) as the input/output grid object for xesmf.Regridder().

NicWayand commented 6 years ago

Thank you that is working now using a Dataset.

However, I am getting the below ESMF_REGRID error (in PET0.ESMF_LogFile) when using the conservative option to regrid some GFDL output on a tri-polar grid to a stereo projection over the north pole. The bilinear options works fine. I have double checked the lat_b and lon_b, and they look right. Have you run into this ESMF error before? Or have any suggestions on what to try? (I will try to set up a repeatable example shortly).

20180226 143244.758 ERROR            PET0 ESMF_Regrid.F90:357 ESMF_RegridStore Invalid argument  - Internal subroutine call returned Error
20180226 143244.870 ERROR            PET0 ESMF_FieldRegrid.F90:1611 ESMF_FieldRegridStoreNX Invalid argument  - Internal subroutine call returned Error
20180226 143244.870 ERROR            PET0 ESMF_FieldRegrid.F90:656 ESMF_FieldRegridStoreFile Invalid argument  - Internal subroutine call returned Error
20180226 143244.870 ERROR            PET0 ESMF_Field_C.F90:1115 f_esmf_regridstorefile Invalid argument  - Internal subroutine call returned Error
JiaweiZhuang commented 6 years ago

No I haven't encountered this error before. Is the "tri-polar grid" still quadrilateral?

JiaweiZhuang commented 6 years ago

Also, are you using GFDL ocean models? I thought all GFDL atmospheric models use the cubedsphere grid...

NicWayand commented 6 years ago

Yes, I am looking at sea ice output (grid info: http://nomads.gfdl.noaa.gov/CM2.X/oceangrid.html) and the grid is quadrilateral.

JiaweiZhuang commented 6 years ago

OK I've seen FIG. 7 in Murray (1996) and know how the grid looks like. Maybe ESMF doesn't like the 2 singularities in the bipolar grid. Does the error still exist if you discard the 2 singularities? Say, subset the coordinate by lon[1:-1, :] (or by lon[:, 1:-1], depending on the dimension order)

NicWayand commented 6 years ago

Ok, I think your are right, the conservative option works when I try to remove the singularities.

Here is a reproducible snippet: https://github.com/NicWayand/regrid_test/blob/master/Test_Conservative_Regridding.ipynb

``

JiaweiZhuang commented 6 years ago

Glad that it works. The singularities are on the land so they should be masked anyway and shouldn't affect your results?

Closing this issue. Feel free to reopen if any bug occurs.

NicWayand commented 6 years ago

Well, when I remove the top row to remove the singularity points it removes the whole row, so there is a gap across the "seam" between the two poles, which is not good. I don't know of a way to just remove the singularities.

Reading a bit in the ESMF documentation, it should be able to handle the singularities at poles, but maybe an option needs to be specified?

JiaweiZhuang commented 6 years ago

Well, when I remove the top row to remove the singularity points it removes the whole row, so there is a gap across the "seam" between the two poles, which is not good. I don't know of a way to just remove the singularities.

Shouldn't only the leftmost/rightmost "row" is removed, which is almost as small as a single cell?

Taken from Figure 4.6 in the technical guide:

screen shot 2018-02-27 at 1 54 56 pm

Also, in your code you seem to crop ds_target too much. Its size is bigger than ds_in but you use the same slice for both Dataset. Could that be a problem?

Reading a bit in the ESMF documentation, it should be able to handle the singularities at poles, but maybe an option needs to be specified?

I believe the "polar singularity" there means the longitude can be periodic. It knows to connect -180 and 180. But if the longitude actually has the same value along the entire row then ESMF will get confused.

NicWayand commented 6 years ago

The actual grid I have is a combination of the regular grid (below 65N) and the bipolar grid (as pictured in Fig 4.6). Below is the ds_in grid. Red is the full grid. Blue dots are the top row removed.

image

Yes my slicing was too big for both datasets. I have updated the code to show just slicing the top "row" of ds_in.

I actually don't understand why I need to slice the stereo projected ds_target. If there are no input cells over the singularities/poles, I would expect xesmf to map missing to the new grid at those locations (or otherwise handle it fine). However, it only works when I remove the surrounding "edge" cells... which is strange.

I checked the the longitudes are not the same, so should "wrap" fine.

JiaweiZhuang commented 6 years ago

Ah I see! Although your grid has multiple "tiles", it is stored as a single 2D numpy array! I was thinking about cubedsphere-like layout that multiple tiles are stored separately as multiple arrays. Then it is totally expected that ESMF will fail with the two singularities, as they are singularities in the middle of the array that will screw up the grid searching. On the other hand, singularities at corners (like the normal polar singularity) should be fine.

If you break your grid into two separate arrays, one for the bipolar tile and one for the normal lat-lon grid, I believe the conservative regridding will work just fine. If the error still occurs, you can just remove one row in the bi-polar tile. The removed cells will be localized near the singularities and will not span over a long line. If you slice the original 2-tile grid, the normal lat-lon tile will be affected.

NicWayand commented 6 years ago

Thanks for the suggestion! Upon splitting up the grid I realized I incorrectly made my lat_b and lon_b, which may be the underlying issue. The problem is I start with bounds per grid cell (i.e. lat_corners = (Ngrids, 4 corners) and need to convert it to the N+1 format xesmf requires. I thought this step would be trivial but its hard to generalize as each of my different input/target grids have different assumptions about the orientation of the "4 corners". Is there any way to pass in the (Ngrids, 4 corners) format of bounds to xesmf by chance?

JiaweiZhuang commented 6 years ago

Is there any way to pass in the (Ngrids, 4 corners) format of bounds to xesmf by chance?

This would be very tricky. I know that the CF convention uses latbnd(n,m,4), but such specification is quite inconvenient in practice.

  1. plt.pcolormesh() expects lat_b(n+1,m+1), not latbnd(n,m,4)
  2. ESMF/ESMPy expects lat_b(n+1,m+1), not latbnd(n,m,4)
  3. latbnd(n,m,4) has ~75% redundant information that is not guaranteed to be consistent with the rest ~25%. Using it for function input is not safe. The function might pick-up whatever 25% depending on the scheme of choice (see below for an example scheme). If there is some error in the unused 75%, it will be hard to notice.

To be safe, I would suggest calculating lat_b(n+1,m+1) from latbnd(n,m,4) by hand and double checking the result. Something like

As you said, there are many possible orientations, so the bound indices I use here (0, 1, 2, 3) are probably wrong and need tweaking. Automating that is out of the scope of xESMF; xgcm might be more relevant (#13). But I believe it is much safer to hand-tweak.

NicWayand commented 6 years ago

Thanks @JiaweiZhuang, I hope xgcm will handle this conversion in the future!

That is the way I am calculating it now. With correct bounds, the conservative option IS running on the bipolar grid... but its not handeling the seam between the two poles correctly (values of sea ice concentration range from 0 to 1, but its setting cells within this seam to values > 1 for some reason. Maybe this is related to #15?

Conservative: image

Bilinear also has issues in the "seam" image

Nearest source to destination works as I expected it to: image

https://github.com/NicWayand/regrid_test/blob/master/Test_Tri_Poler_Regridding.ipynb

JiaweiZhuang commented 6 years ago

Hmm this is weird... Do you have the FDLFLOR_gridinfo.nc and stereo_gridinfo.nc used in your notebook?

JiaweiZhuang commented 6 years ago

Also the tile you are regridding is the bipolar grid? Then you should set periodic=False as its boundary is not periodic. It is much similar to the two polar tiles in the cubed-sphere grid, or the rasm example in xESMF tutorial, which just spans over the pole but is actually a regional tile.

NicWayand commented 6 years ago

Sorry, I uploaded the required nc files and the esio.py module. Everything should be there to run https://github.com/NicWayand/regrid_test/blob/master/Test_Tri_Poler_Regridding.ipynb now. But let me know if something is missing.

Hmm the GFDL model is global, I am just regridding to the north pole region, so I thought because the longitude wraps around the earth it is periodic? Regardless, you get missing "seams" when periodic=False (as in the updated notebook).

Thanks for your help!

JiaweiZhuang commented 6 years ago

@NicWayand

I've fixed your problem in this notebook (really took me a while!): https://github.com/JiaweiZhuang/regrid_test/blob/master/debug_bipolar_grid.ipynb

The problem is your source grid is not a legal, single-tile quadrilateral grid. First, the polar tile had better be extracted out. Second, the polar tile itself has a buggy layout -- the numpy array contains 2 disconnected parts. By rearranging the two part correctly there is no missing "seam" any more. My notebook has detailed explanations.

In general, the grid object fed to xesmf.Regridder() must be a well-defined, monotonically increasing, single-tile quadrilateral grid. A rule of thumb is if it can be plotted correctly with plt.pcolormesh, then xESMF should be OK with it too. If the grid coordinate array has a messy layout, such as containing multiple disconnected tiles, then the grid search would be messed-up.

NicWayand commented 6 years ago

Thank you so much @JiaweiZhuang!! That is a strange way to lay out the grid, there most be some computational reason why it was so. I was able to use the same method on the lat/lon bounds and got the conservative regridding working as expected, so closing this issue.