Closed JiaweiZhuang closed 6 years ago
It remains to be seen whether the numpy implementation of sparse matrix multiplication will be slower than the Fortran implement in ESMPy. But calling a lot of xarray.DataArray.transpose()
doesn't seem to be efficient, too.
I'll let @rokuingh provide more information on the implementation plan for retrieving and working with weights. I don't want to put my foot in my mouth. :smile:
A couple things:
ESMPy requires the input data to have the shape [Nlat, Nlon, N1, N2,...].
ESMPy using Fortran ordering since everything underneath is built on it. Hence when using ncdump
and you see (time, level, lat, lon)
in ESMPy this is (lon, lat, level, time)
. We're brainstorming how to address this, and I won't lie that is very confusing sometimes. I know this is different from in-memory stuff, but I just wanted to bring it up.
Instead of using xarray's transpose() function to match ESMPy's expection, we can write the sparse matrix multiplication directly in numpy, taking care of dimension broadcasting.
A number of times we've encountered issues in user code related to dimension ordering. I don't have a good sense how xarray's transpose
works, but it is often the case that folks want to be using something more like numpy.swapaxes
as opposed to transpose
.
It remains to be seen whether the numpy implementation of sparse matrix multiplication will be slower than the Fortran implement in ESMPy.
My guess is that ESMF is faster. Whether the performance gain is enough to overcome issues in usability is a different question!
A question for you, do you have a sense how array copies work with the xESMF
implementation? I'm pretty sure there is a deep copy going from xarray to ESMPy. Is there another deep copy from ESMPy back to xarray? I'm mostly just curious...don't have a good sense of xarray internals.
Dimension ordering has been the Achilles heel of ESMPy, I suppose that is the price of building on a Fortran package, we are hoping to resolve this soonest!
ESMPy using Fortran ordering since everything underneath is built on it. Hence when using ncdump and you see (time, level, lat, lon) in ESMPy this is (lon, lat, level, time). We're brainstorming how to address this, and I won't lie that is very confusing sometimes. I know this is different from in-memory stuff, but I just wanted to bring it up.
It seems all numpy arrays that point to ESMPy data fields are Fortran-ordered. np.isfortran()
always gives True
. In principle you would expect (time, level, lat, lon)
for C-ordered numpy arrays and (lon, lat, level, time)
for Fortran-ordered ones. Anyway, that would not be a problem anymore if the sparse matrix multiplication is done by numpy directly.
I don't have a good sense how xarray's transpose works, but it is often the case that folks want to be using something more like numpy.swapaxes as opposed to transpose.
I think that's equivalent to np.swapaxes. Again I would want to avoid swapping axes at all, if possible.
A question for you, do you have a sense how array copies work with the xESMF implementation? I'm pretty sure there is a deep copy going from xarray to ESMPy. Is there another deep copy from ESMPy back to xarray? I'm mostly just curious...don't have a good sense of xarray internals.
I am doing deep copy in both directions. I don't think the resulting array is safe if it just points to destfield.data
. (1) It will be overwritten by the next regridding operation if I reuse the regrid object and the underlying field. (2) It will be released by destfield.destroy()
, although it looks like another array. I believe pointers could be very confusing for average Python users.
I actually don't understand why ESMPy uses Field
(containing data and additional dimensions), not Grid
(only the coordinate itself), to construct the Regrid
object. Why we need to call regrid = ESMF.Regrid(sourcefield, destfield)
, not just regrid = ESMF.Regrid(sourcegrid, destgrid)
? @bekozi @rokuingh
My guess is that ESMF is faster. Whether the performance gain is enough to overcome issues in usability is a different question!
I am not sure, considering the current heavy use of memory copies and array rearrangements. We'll need to do a benchmark once regridding weights are visible.
I actually don't understand why ESMPy uses Field (containing data and additional dimensions), not Grid (only the coordinate itself), to construct the Regrid object. Why we need to call regrid = ESMF.Regrid(sourcefield, destfield), not just regrid = ESMF.Regrid(sourcegrid, destgrid)?
This is just how ESMF works underneath which ESMPy attempts to mirror as much as possible. I agree it would be more convenient in cases where only weights are needed to use grids/meshes directly. Here is the underlying ESMF call that ESMPy wraps: http://www.earthsystemmodeling.org/esmf_releases/last_built/ESMF_refdoc/node5.html#SECTION050365900000000000000.
My sincere apologies for the delay on getting you the weight file write code. We have a snapshot tag that includes writing weights to file:
git clone -b ESMF_7_1_0_beta_snapshot_35 https://git.code.sf.net/p/esmf/esmf esmf
The only change to your code should be the addition of a filename argument to the regrid call:
rh = ESMF.Regrid(srcfield, dstfield, filename='/tmp/weights.nc', ...)
Please let me know if you have any questions or find any issues! Though tested, note this is considered "development code".
@bekozi Thanks! I am not worried about any code&API changes because I plan to only interface with the weight file. As long as the format of weights.nc
is stable everything would be fine.
Is that snapshot also available on conda? I only tried to build ESMF a long time ago and forget how to do that now... Or could you just send me a sample weight file? Just lat-lon to lat-lon would be great.
Is that snapshot also available on conda?
It is now. I sat bolt upright last night realizing I forgot to build one. :smile:
conda create -n esmpy -c nesii/label/dev-esmf -c conda-forge esmpy==7.1.0.dev35
@bekozi Thanks! But I got an error when using the filename
keyword. Please see this gist.
The test environment is docker-miniconda3
Thanks for the reproducer. This is a very obscure bug related to Python 3. The code will work in Python 2.7. Will that be okay for a bit? I'll be looking into the Python 3 failure - it's somehow connected to ctypes
.
That wasn't as bad as I predicted. We were not testing a new enough ctypes
version so missed this bug. I tested using docker-miniconda3
and the code works. I'll work on getting a new snapshot. In the meantime, if you want to use Python 3, you can patch the esmpy
source code:
Index: src/addon/ESMPy/src/ESMF/interface/cbindings.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- src/addon/ESMPy/src/ESMF/interface/cbindings.py (revision 6f48610bd5243c1c33cfcd4e6ae9c2788c6c4a56)
+++ src/addon/ESMPy/src/ESMF/interface/cbindings.py (revision 9e1e6544f77b6ae10af92c65ef7ac9b2b761beb6)
@@ -1984,9 +1984,13 @@
raise TypeError('dstMaskValues must have dtype=int32')
dstMaskValues_i = ESMP_InterfaceInt(dstMaskValues)
+ # Need to create a C string buffer for Python 3.
+ b_filename = filename.encode('utf-8')
+ b_filename = ct.create_string_buffer(b_filename)
+
rc = _ESMF.ESMC_FieldRegridStoreFile(srcField.struct.ptr,
dstField.struct.ptr,
- filename,
+ b_filename,
srcMaskValues_i,
dstMaskValues_i,
ct.byref(routehandle),
@bekozi Thanks for looking into this! I am OK with Python2.7 for now.
@bekozi The col
and row
variables seem to have wrong values. Please see this updated gist.
Also, can it also write out grid information as in the Fortran version? area_a
and area_b
will be particularly useful.
Hmmm...I was able to reproduce the issue on my end. Thanks for the code. It doesn't appear the weight file write is appropriately tested on the ESMPy side. I ran the grids through the CLI app and the indexing aligns, so the issue is definitely in the Python interface. I'll see what I can find...may take a bit since I didn't code the original implementation. Sorry for the trouble.
@bekozi That's fine! Take your time.
Hi, @JiaweiZhuang. A new snapshot with a conda build is available that addresses the weight file write. I added a test for your specific case. Again, please let me know if you find any issues!
@bekozi Thanks so much! I've checked that both bilinear and conservative algorithms are working correctly. Now I will have a lot to update in xESMF!
There are two additional issues, though. They are not urgent but kind of confuse me. 1) ESMPy doc says that it can calculate grid area but I can't make it work. See ESMPy_grid_area.ipynb 2) The periodic boundary condition doesn't seem to work. See ESMPy_periodic.ipynb. I can manually extend one more grid point to fix this problem, but I want to make sure I didn't misunderstand the API doc.
The two issues are actually not related to this thread or to xESMF itself. I am willing to continue the discussion here but please do tell me if you want to move to other place to discuss ESMPy-specific things.
Glad to hear it. :sweat_smile:
We can move this discussion over to the ESMF support list. I'll create a ticket and will email you. I'll also create a ticket for getting the grid information into the ESMPy-generated weight file.
Close this issue because ESMPy can now output regridding weights. See #6 for an example.
Will reopen this if there's any bug about weights writing.
Most regridding schemes are linear, i.e. the output data field is linearly dependent on the input data field. Any linear transform can be viewed as a matrix-vector multiplication y = A*x, where A is a matrix (typically sparse) containing regridding weights, and x, y are input and output data fields flatten to 1D.
In practice, linear regridding schemes are broken into two steps:
regrid = ESMF.Regrid(...)
.destfield = regrid(sourcefield, destfield)
, whereregrid
was created in the previous step.However, ESMPy's
regrid
object is like a black box. It knows how to perform regridding but there's no way to explicitly view the regridding weights (the matrix A).In the Fortran version of ESMF, the function
ESMF_RegridWeightGen
dumps regridding weights to NetCDF files. The content of the file is shown in "12.8 Regrid Interpolation Weight File Format" section in ESMF documention (link). The matrix A is stored in a sparse matrix form by variablesrow
,col
andS
in that NetCDF file. But in ESMPy there's no function equivalent toESMF_RegridWeightGen
.Being able to view the regridding weights in the Python-level will solve many troubles:
transpose()
function to match ESMPy's expection, we can write the sparse matrix multiplication directly in numpy, taking care of dimension broadcasting.dask.array
easily and natively. Otherwise, we need to let each dask worker call the underlying Fortran routine separately to regrid each hyberslab -- sounds like a very ugly solution.@bekozi seems to have some Python tools for
ESMF_RegridWeightGen
. Maybe we could start with that.