Open simonreise opened 5 months ago
The issue is interpolate_na can be much slower than rio.fill.fillnodata
interpolate_na
rio.fill.fillnodata
Simple example:
Reading B9 Landsat band
with rioxarray.open_rasterio(r"path/to/file/B9.tif", chunks = True, lock = True) as tif: b9 = tif.load().chunk('auto') nodata = b9.rio.nodata
Then trying to fill nodata using pure rasterio
rasterio
%%time mask = np.where(b9.data == nodata, 0, 1) b9 = b9.compute() b9.data = rio.fill.fillnodata(b9.data, mask, max_search_distance=250) b9 = b9.chunk('auto')
CPU times: total: 22.9 s Wall time: 1min 33s
1 min is a pretty reasonable time
But if we try to fill nodata using rioxarray interpolate_na
%%time b9 = b9.persist() b9 = b9.rio.interpolate_na()
CPU times: total: 11min 2s Wall time: 27min 25s
Computation is extremely slow (maybe I am doing something wrong?)
The first idea how rio.fill.fillnodata can work with xarray / dask arrays is using xarray.apply_ufunc.
xarray.apply_ufunc
%%time mask = xarray.where(b9.data == nodata, 0, 1) b9 = xarray.apply_ufunc(rio.fill.fillnodata, b9, mask, dask = 'allowed', keep_attrs = 'override', kwargs = {'max_search_distance': 250})
CPU times: total: 29.3 s Wall time: 1min 40s
%%time mask = xarray.where(b9.data == nodata, 0, 1) b9 = xarray.apply_ufunc(rio.fill.fillnodata, b9, mask, dask = 'parallelized', keep_attrs = 'override', kwargs = {'max_search_distance': 250}) b9 = b9.persist()
CPU times: total: 38 s Wall time: 1min 25s
But maybe there are better solutions.
I had thought about using that. Not opposed to using that as an engine for interpolation.
The issue is
interpolate_na
can be much slower thanrio.fill.fillnodata
Simple example:
Reading B9 Landsat band
Then trying to fill nodata using pure
rasterio
CPU times: total: 22.9 s Wall time: 1min 33s
1 min is a pretty reasonable time
But if we try to fill nodata using rioxarray
interpolate_na
CPU times: total: 11min 2s Wall time: 27min 25s
Computation is extremely slow (maybe I am doing something wrong?)
The first idea how
rio.fill.fillnodata
can work with xarray / dask arrays is usingxarray.apply_ufunc
.CPU times: total: 29.3 s Wall time: 1min 40s
CPU times: total: 38 s Wall time: 1min 25s
But maybe there are better solutions.