pangeo-data / xESMF

Universal Regridder for Geospatial Data
http://xesmf.readthedocs.io/
MIT License

Regridder error: cannot reshape array of size {} into shape {} #315

Open tmurakami7 opened 7 months ago

tmurakami7 commented 7 months ago

Specific error: ValueError: cannot reshape array of size 67184640000 into shape (360, 720, 360, 720).

if self.size != reduce(operator.mul, shape, 1):
    raise ValueError(
        "cannot reshape array of size {} into shape {}".format(self.size, shape)
    )

Result:

huard commented 7 months ago

Thanks for the detailed report and fix. Would you be willing to submit a Pull Request?

nilsleh commented 4 days ago

I just ran into the same issue: ValueError: cannot reshape array of size 6803400000 into shape (np.int32(100), np.int32(100), np.int32(9860), np.int32(69))

malmans2 commented 4 days ago

Same here - it might be a NumPy 2.0-related issue that is not caught by CI

malmans2 commented 4 days ago

To give some context: I encountered a similar issue recently, and downgrading numpy resolved it. I plan to open a ticket about this, but I don't have an MRE at the moment.

@nilsleh, if it's easy for you to test this and it’s the same problem, maybe you have an MRE?

nilsleh commented 4 days ago

Sure, here is an example .nc file on Google Drive, and the code below:

import xarray as xr
import xesmf as xe
import numpy as np

dataset = xr.open_dataset("example_data.nc")

lats = dataset.latitude.values
lons = dataset.longitude.values

grid_lats = np.linspace(lats.min(), lats.max(), 100)
grid_lons = np.linspace(lons.min(), lons.max(), 100)

target_grid = xr.Dataset({'lat': (['lat'], grid_lats),
                          'lon': (['lon'], grid_lons)})

regridder = xe.Regridder(dataset, target_grid, 'bilinear')
regridded_ds = regridder(dataset)

regridded_ds['time'] = dataset['time']

yields:

ValueError: cannot reshape array of size 6803400000 into shape (np.int32(100), np.int32(100), np.int32(9860), np.int32(69))

I also tried changing the data type, but then the sparse package does not seem to recognize the types.

malmans2 commented 4 days ago

Not sure what's causing the issue, but I can confirm that downgrading numpy fixes it.

aulemahal commented 4 days ago

This issue was caused by changes to NumPy's data type promotion (see https://numpy.org/doc/stable/numpy_2_0_migration_guide.html#changes-to-numpy-data-type-promotion and see PR).

With NumPy 2.0, the shape tuple holds int32 values, as shown above, and a size like 6803400000 cannot be represented because it exceeds int32's maximum (2^31 - 1). Before, the shape values were automatically cast to Python's int (so effectively int64).
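For anyone who wants to see the arithmetic, here is a minimal sketch of the overflow using only the standard library. It simulates 32-bit wraparound with ctypes instead of calling NumPy, so it illustrates the size check failing rather than reproducing the exact xESMF or sparse code path. `wrap_int32` is a hypothetical helper introduced for this example, and NumPy's own behavior on scalar overflow may differ (for example, it can emit a warning).

```python
import ctypes
import operator
from functools import reduce

# Shape values taken from the error message reported above.
shape = (100, 100, 9860, 69)

# Largest value a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1

# Exact product, computed with Python's arbitrary-precision integers.
true_size = reduce(operator.mul, shape, 1)


def wrap_int32(value):
    """Hypothetical helper: truncate `value` to a signed 32-bit integer."""
    return ctypes.c_int32(value).value


# Same product, but truncated to 32 bits after every multiplication,
# which is what happens when the arithmetic stays in int32.
wrapped_size = reduce(lambda acc, n: wrap_int32(acc * n), shape, 1)

print("exact size:        ", true_size)
print("exceeds int32 max: ", true_size > INT32_MAX)
print("32-bit product:    ", wrapped_size)
print("size check passes: ", wrapped_size == true_size)
```

Since the 32-bit product no longer equals the real array size, a check of the form `self.size != product_of_shape` raises even though the shape is valid.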

Until a release of xESMF with a fix is available, users can keep using NumPy 2.0 with this hacky workaround:

import xarray as xr
import xesmf as xe
import numpy as np

dataset = xr.open_dataset("example_data.nc")

lats = dataset.latitude.values
lons = dataset.longitude.values

grid_lats = np.linspace(lats.min(), lats.max(), 100)
grid_lons = np.linspace(lons.min(), lons.max(), 100)

target_grid = xr.Dataset({'lat': (['lat'], grid_lats),
                          'lon': (['lon'], grid_lons)})

regridder = xe.Regridder(dataset, target_grid, 'bilinear')

# Fix for numpy 2 while waiting for xESMF 0.8.7
regridder.shape_in = tuple(map(int, regridder.shape_in))
regridder.shape_out = tuple(map(int, regridder.shape_out))

regridded_ds = regridder(dataset)

regridded_ds['time'] = dataset['time']