pangeo-data / xESMF

Universal Regridder for Geospatial Data
http://xesmf.readthedocs.io/
MIT License

xe.Regridder for conservative regridding for very hi-res grid not completing #211

Closed jwynnsmith closed 1 year ago

jwynnsmith commented 1 year ago

Hi, I have a very high-resolution (8962 x 9040) lat-lon grid that I want to conservatively regrid to a 1.25 lon x 1.00 lat grid. See the code below. I have run this code on a couple of HPC systems, and even after 7 days of wall-clock time the job does not finish. I am also trying tips provided here:

My first question: is the code below efficient enough and, if not, where can I improve it? Second: at the Regridder line below, how can I print the grid index iteration? I would like to at least see how far the function has progressed in the regridding process.

Regrid code

import numpy as np
import xarray as xr
import pandas as pd
import glob
import xesmf as xe
import ESMF
ESMF.Manager(debug=False)
from netCDF4 import Dataset
from calendar import monthrange

nc = Dataset('source_grid_file.nc')
file_lats = nc['grid_center_lat'][:]

file_lons = nc['grid_center_lon'][:]

lats = np.linspace(np.nanmin(file_lats), np.nanmax(file_lats), 8962)
lons = np.linspace(np.nanmin(file_lons), np.nanmax(file_lons), 9040)

lats_b = np.linspace(np.nanmin(file_lats) - 0.01796907, np.nanmax(file_lats) + 0.01796907, 8963)
lons_b = np.linspace(np.nanmin(file_lons) - 0.018092355, np.nanmax(file_lons) + 0.018092355, 9041)

ff_data = np.genfromtxt('./mean_ff_with_flash_counts_2018_to_2020.csv', delimiter=',')

-------- Use xarray to make input grid

ds_in = xr.Dataset(
    {
        "lat": (["y"], lats),
        "lon": (["x"], lons),
        "lat_b": (["y_b"], lats_b),
        "lon_b": (["x_b"], lons_b),
        "glm_ff": (["y", "x"], ff_data),
    }
)

-------- Obtain range of longitude and latitude of the output file (regridded grid)

ds_out = xe.util.grid_2d(-156.25, 5.625, 1.25, -81, 81, 1)

---------- Apply xESMF Regridder

regridder = xe.Regridder(ds_in, ds_out, "conservative_normed", ignore_degenerate=True)

huard commented 1 year ago

I can't spot any problem with your code, apart from the fact that the weights are going to take a large amount of memory.
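To put a rough number on that memory cost, here is a back-of-the-envelope sketch. It assumes the weights are stored as a sparse matrix in coordinate form (one int64 row index, one int64 column index and one float64 value per entry) and that each fine source cell overlaps between one and four coarse target cells. Both are assumptions for illustration, not measurements, and the temporary memory used while the weights are being generated may be considerably larger than the final matrix.

```python
# Rough size of the final sparse weight matrix (all figures are assumptions).
n_src = 8962 * 9040          # number of source cells, from the grid in the question
bytes_per_entry = 8 + 8 + 8  # assumed layout: int64 row, int64 col, float64 weight


def weight_bytes(overlaps_per_src_cell):
    """Estimated bytes if every source cell overlaps this many target cells."""
    return n_src * overlaps_per_src_cell * bytes_per_entry


for k in (1, 2, 4):
    print(f"{k} overlap(s) per source cell: {weight_bytes(k) / 1e9:.1f} GB")
```

Under these assumptions the stored weights alone come to a few gigabytes, so it is worth checking the memory available per node before launching the job.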

You may want to compute the weights only as a first step (see https://xesmf.readthedocs.io/en/latest/large_problems_on_HPC.html), save them to disk, then regrid on only one time step, while monitoring memory usage and CPU time.
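One way to do the monitoring part is a small timing and memory wrapper built only on the Python standard library. This is a minimal sketch: the `monitor` helper is a hypothetical name, it relies on the Unix-only `resource` module, and the workload shown is a stand-in for the real weight computation or single-time-step regridding.

```python
import resource
import time
from contextlib import contextmanager


@contextmanager
def monitor(label):
    """Report wall time, CPU time and peak resident memory for the enclosed block."""
    t0, c0 = time.perf_counter(), time.process_time()
    stats = {}
    try:
        yield stats
    finally:
        stats["wall_s"] = time.perf_counter() - t0
        stats["cpu_s"] = time.process_time() - c0
        # ru_maxrss is the process-wide peak so far: kilobytes on Linux, bytes on macOS.
        stats["peak_rss"] = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        print(
            f"{label}: wall={stats['wall_s']:.2f}s "
            f"cpu={stats['cpu_s']:.2f}s peak_rss={stats['peak_rss']}"
        )


# Stand-in workload; in practice this block would hold the weight computation
# or the regridding of a single time step.
with monitor("dummy step") as stats:
    total = sum(i * i for i in range(100_000))
```

Because `ru_maxrss` is a peak for the whole process, running each stage in a fresh process gives the cleanest per-stage numbers.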

Let us know how it goes.

jwynnsmith commented 1 year ago

It turns out that, since my hi-res grid is lat-lon and I am regridding it to a coarser lat-lon grid, it was easy enough to do the conservative regridding with cdo:

cdo remapcon,outgrid.txt -selname,var1 -setgrid,ingrid.txt mean_ff_with_flash_counts_2018_to_2020.nc glm_annual_regrid.nc

outgrid.txt contains the coarse grid description and ingrid.txt contains the hi-res grid parameters.
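The contents of the two grid files were not posted. As an illustration only, the sketch below generates a CDO grid description for a regular lon-lat grid from its cell edges and step sizes, using the same numbers as the `xe.util.grid_2d` call earlier in the thread. The helper name `lonlat_griddes` and the choice to round the cell count up when the span is not a whole multiple of the step are assumptions, so the result may differ from the file actually used.

```python
import math


def lonlat_griddes(lon0_b, lon1_b, dlon, lat0_b, lat1_b, dlat):
    """Return CDO grid description text for a regular lon-lat grid given its cell edges."""
    # Rounding up is an assumption, chosen so the grid covers the whole requested range.
    xsize = math.ceil((lon1_b - lon0_b) / dlon)
    ysize = math.ceil((lat1_b - lat0_b) / dlat)
    lines = [
        "gridtype = lonlat",
        f"xsize    = {xsize}",
        f"ysize    = {ysize}",
        f"xfirst   = {lon0_b + dlon / 2}",  # centre of the first cell in longitude
        f"xinc     = {dlon}",
        f"yfirst   = {lat0_b + dlat / 2}",  # centre of the first cell in latitude
        f"yinc     = {dlat}",
    ]
    return "\n".join(lines) + "\n"


text = lonlat_griddes(-156.25, 5.625, 1.25, -81, 81, 1)
print(text)
```

The printed text could be saved to a file and passed to `remapcon`, after checking the cell counts and first-cell centres against the intended target grid.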

The hi-res lat-lon grid was the result of binning flash counts on the Geostationary Lightning Mapper (GLM) curvilinear grid to a 2 km lat-lon grid.

If I need to go from a curvilinear to a lat-lon grid, as I did with the GLM flash extent density gridded product, I will use the conservative normed regridding in xESMF, which worked well in that particular case.

huard commented 1 year ago

Ok, will close this, feel free to reopen later on.