pism / pism

Repository for the Parallel Ice Sheet Model (PISM)
https://pism.io/
GNU General Public License v3.0

Unable to read from a compressed NetCDF file with permuted dimensions #523

Closed ckhroulev closed 8 months ago

ckhroulev commented 1 year ago

Description

PISM uses NetCDF's nc_get_varm_double() to read transposed data, i.e. variables whose dimension order does not match PISM's in-memory storage order. NetCDF has a very old bug in nc_{get,put}_varm_*() (see the bug report "nc_put_varm_double hangs during collective parallel I/O"). To avoid hanging, PISM switches to the independent parallel access mode using nc_var_par_access().

However, according to the NetCDF documentation:

In netcdf-c-4.7.4 or later, using hdf5-1.10.2 or later, the zlib, szip, fletcher32, and other filters may be used when writing data with parallel I/O. The use of these filters require collective access. Turning on the zlib (deflate) or fletcher32 filter for a variable will automatically set its access to collective if the file has been opened for parallel I/O. Attempts to set access to independent will return NC_EINVAL.

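As a result, reading transposed data works if the file is not compressed, but fails with a confusing error message (see below) if it is: the NC_EINVAL returned by the attempt to switch to independent access surfaces as "NetCDF: Invalid argument".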

PISM version

Any recent version, e.g. 2.0.

To Reproduce

$ pismr -verbose 1 -eisII A -y 1000 -o foo.nc
$ ncpdq -a x,y -O -4 -L9 foo.nc foo.nc
$ pismr -i foo.nc
Reading configuration parameters (pism_config) from file '/home/user/local/build/pism/pism_config.nc'.
PISMR (basic evolution run mode) stable v2.0.6-11-g47bdc82e5 committed by Constantine Khrulev on 2023-11-16 13:30:44 -0900
* Run time: [1001-01-01 00.000h, 2001-01-01 00.000h]  (1000.000 years, using the '365_day' calendar)
# Allocating the geometry evolution model...
# Allocating an iceberg remover (part of a calving model)...
# Allocating a stress balance model...
* Initializing bed smoother object with 5.000 km half-width ...
# Allocating an energy balance model...
# Allocating a subglacial hydrology model...
# Allocating a basal yield stress model...
# Allocating a bedrock thermal layer model...
# Allocating a bed deformation model...
# Allocating a surface process model or coupler...
  - Option atmosphere.given.file is not set. Trying the input file 'foo.nc'...
  - Option surface.given.file is not set. Trying the input file 'foo.nc'...
# Allocating sea level forcing...
# Allocating an ocean model or coupler...
initializing 2D fields from NetCDF file 'foo.nc'...
PISM ERROR: NetCDF: Invalid argument
            while reading variable 'lat' from 'foo.nc'
            while reading variable 'lat' from 'foo.nc'

Expected behavior

Running pismr -i foo.nc should succeed.

Additional context

Note that the nc_{get,put}_varm_*() functions are deprecated.

I need to remove PISM's code calling nc_get_varm_double(): we should read data "as is" using nc_get_vara_double() and transpose it in memory.
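
A minimal sketch of the "transpose it in memory" step, for illustration only. It is not the code from the actual fix, and the helper name transpose_2d and its parameters are hypothetical. It assumes a 2D variable has already been read "as is" into a contiguous row-major buffer (for example by nc_get_vara_double(), using the dimension order found in the file) and copies it into a second buffer with the two dimensions swapped:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not PISM code): copy a row-major array laid out as
 * src[n_slow][n_fast] into dst[n_fast][n_slow], i.e. swap the two dimensions.
 *
 * In the scenario described above, src would hold the data exactly as stored
 * in the file, and dst would use the order expected in memory.
 */
static void transpose_2d(const double *src, double *dst,
                         size_t n_slow, size_t n_fast)
{
    /* Out-of-place transpose: the two buffers must be distinct. */
    assert(src != NULL && dst != NULL && src != dst);

    for (size_t i = 0; i < n_slow; ++i) {
        for (size_t j = 0; j < n_fast; ++j) {
            /* Element (i, j) of src becomes element (j, i) of dst. */
            dst[j * n_slow + i] = src[i * n_fast + j];
        }
    }
}
```

Because the read itself no longer depends on nc_get_varm_double(), this approach would presumably not need the switch to independent access, so it should also work with compressed files that require collective access.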

ckhroulev commented 8 months ago

Fixed in the dev branch by bebecad.