barronh / pseudonetcdf

PseudoNetCDF like NetCDF except for many scientific format backends
GNU Lesser General Public License v3.0
76 stars 35 forks source link

Using getproj() with numpy >= 2.0 #149

Open colleenbaublitz opened 1 month ago

colleenbaublitz commented 1 month ago

Using the getproj() function, encountered the following error:

Traceback (most recent call last): File "$PATH/Make_IASI_NH3.py", line 75, in proj = gf.getproj(withgrid=True) File "/home/username/satellite/lib/python3.9/site-packages/PseudoNetCDF/core/_files.py", line 136, in getproj return getproj(self, withgrid=withgrid) File "/home/username/satellite/lib/python3.9/site-packages/PseudoNetCDF/coordutil.py", line 596, in getproj return pyproj.Proj(proj4str, preserve_units=preserve_units) File "/home/username/satellite/lib/python3.9/site-packages/pyproj/proj.py", line 109, in init self.crs = CRS.from_user_input(projparams, kwargs) File "/home/username/satellite/lib/python3.9/site-packages/pyproj/crs/crs.py", line 501, in from_user_input return cls(value, kwargs) File "/home/username/satellite/lib/python3.9/site-packages/pyproj/crs/crs.py", line 348, in init self._local.crs = _CRS(self.srs) File "pyproj/_crs.pyx", line 2378, in pyproj._crs._CRS.init pyproj.exceptions.CRSError: Invalid projection: +proj=lcc +lat_1=np.float64(33.0) +lat_2=np.float64(45.0) +lat_0=40.0 +lon_0=-97.0 +y_0=1728000.0 +x_0=2556000.0 +a=6370000.0 +b=6370000.0 +to_meter=12000.0 +no_defs +type=crs: (Internal Proj Error: proj_create: Error 1027 (Invalid value for an argument): lcc: Invalid value for lat_1 and lat_2: |lat_1 + lat_2| should be > 0)

Barron's solution:

First, the quick fix:

proj4str = '+proj=lcc +lat_0=40 +lon_0=-97 +lat_1=33 +lat_2=45 +x_0=2556000 +y_0=1728000 +R=6370000 +to_meter=12000 +no_defs'
cno = pycno.cno(proj4str)

OR

import cmaqsatproc as csp
gf = csp.open_griddesc('12US1')
cno = gf.csp.cno

Second, the bigger issue:

It looks like this is related to the new version of numpy released last month. The PseudoNetCDF code is using the builtin “repr” function to format the components of the projection. In all previous numpy versions, the repr of np.float64(33.) = '33.0'. (e.g., 1.26.4). In numpy v2, the repr has changed to np.float64(33.0). This breaks the construction of the string. So, right now, PseudoNetCDF is not compatible with numpy v2.

You can update your environment by running python -m pip install 'numpy<2'. Then, it should work… There are several other packages we use that are not yet numpy 2.0 compliant, so it may be best to use a 1.x for now.

barronh commented 1 month ago

This is related to PseudoNetCDF/coordutil.py:getproj4_from_cf_var. Line 584 relies on the repr function.

Starting in v1.14 and updated in v1.22, numpy has a "legacy" print option that allows use of 1.x style printing.

import numpy as np
print(f'numpy v{np.__version__}')
# Output: numpy v2.0.0
a = np.float64(33.0)
for legacy in [False, '1.25']:
  np.set_printoptions(legacy=legacy)
  print((np.get_printoptions()['legacy'], repr(a), f'{a}', f'{a!s}', f'{a!r}', f'{a!a}', '{}'.format(a)))
# Outputs:
# (False, 'np.float64(33.0)', '33.0', '33.0', 'np.float64(33.0)', 'np.float64(33.0)', '33.0')
# ('1.25', '33.0', '33.0', '33.0', '33.0', '33.0', '33.0')

While repr has changed, f-strings with no ! or with !s or '{}'.format(a) all operate consistently from version to version. So, this leads me to suggest updates either using str.format or f-strings.

import numpy as np
np.set_printoptions(legacy=False)
mapstr_bits = dict(proj='lcc', lat_0=np.float64(40), lon_0=np.float64(-97), lat_1=np.float64(33), lat_2=np.float64(45), x_0=np.float64(2556000), y_0=np.float64(1728000), R=np.float64(6370000), to_meter=np.float64(12000))
mapstr_old = ' '.join(['+%s=%s' % (k, v if isinstance(v, str) else repr(v)) for k, v in mapstr_bits.items()])
mapstr_new1 = ' '.join(['+{}={}'.format(k, v) for k, v in mapstr_bits.items()])
mapstr_new2 = ' '.join([f'+{k}={v}' for k, v in mapstr_bits.items()])
print('# repr:', mapstr_old)
# repr: +proj=lcc +lat_0=np.float64(40.0) +lon_0=np.float64(-97.0) +lat_1=np.float64(33.0) +lat_2=np.float64(45.0) +x_0=np.float64(2556000.0) +y_0=np.float64(1728000.0) +R=np.float64(6370000.0) +to_meter=np.float64(12000.0)
print('# str.format:', mapstr_new1)
# str.format: +proj=lcc +lat_0=40.0 +lon_0=-97.0 +lat_1=33.0 +lat_2=45.0 +x_0=2556000.0 +y_0=1728000.0 +R=6370000.0 +to_meter=12000.0
print('# f-string', mapstr_new2)
# f-string +proj=lcc +lat_0=40.0 +lon_0=-97.0 +lat_1=33.0 +lat_2=45.0 +x_0=2556000.0 +y_0=1728000.0 +R=6370000.0 +to_meter=12000.0

Because str.format is more backward compatible, I would recommend that as a fix.

barronh commented 1 month ago

Other v2 related issues are around recfromtxt, which is removed from numpy in v2. These will be fixed concurrently.

barronh commented 1 month ago

There is a set of fixes that passes all regression tests with numpy v2 and v1.26 in Python3.9 in branch bugfix/proj-numpyv2.

This also addressed the deprecation of tostring, several other uses of repr, and several uses of recfromtxt.