CDAT / cdms


cdms2 cannot open some (big) netCDF files #238

Open doutriaux1 opened 6 years ago

doutriaux1 commented 6 years ago

The file is on acme1 and is REALLY big (we suspect that might be the issue).

ncdump seems to work

ncdump -h /p/user_pub/work/E3SM/1_0/1950-Control/0_25deg_atm_18-6km_ocean/ocean/native/model-output/mon/ens1/v1/mpaso.hist.am.timeSeriesStatsMonthly.0001-01-01.nc  | more
netcdf mpaso.hist.am.timeSeriesStatsMonthly.0001-01-01 {
dimensions:
    nMerHeatTransBinsP1 = 181 ;
    nMocStreamfunctionBinsP1 = 181 ;
    Time = UNLIMITED ; // (1 currently)
    StrLen = 64 ;
    nCells = 3693225 ;
    nVertLevels = 80 ;
    nEdges = 11135652 ;
    nVertLevelsP1 = 81 ;
[snip]

but not cdms2

import cdms2
f = cdms2.open("/p/user_pub/work/E3SM/1_0/1950-Control/0_25deg_atm_18-6km_ocean/ocean/native/model-output/mon/ens1/v1/mpaso.hist.am.timeSeriesStatsMonthly.0001-01-01.nc")

/export/doutriaux1/miniconda2/envs/nightly_gcc_linux/lib/python2.7/site-packages/cdms2/dataset.pyc in openDataset(uri, mode, template, dods, dpath, hostObj)
    369 
    370             # The file exists
--> 371             file1 = CdmsFile(path, "r")
    372             if libcf is not None:
    373                 if hasattr(file1, libcf.CF_FILETYPE):

/export/doutriaux1/miniconda2/envs/nightly_gcc_linux/lib/python2.7/site-packages/cdms2/dataset.pyc in __init__(self, path, mode, hostObj, mpiBarrier)
   1096             _fileobj_ = Cdunif.CdunifFile(path, mode)
   1097         except Exception as err:
-> 1098             raise CDMSError('Cannot open file %s (%s)' % (path, err))
   1099         self._file_ = _fileobj_   # Cdunif file object
   1100         self.variables = {}

CDMSError: Cannot open file /p/user_pub/work/E3SM/1_0/1950-Control/0_25deg_atm_18-6km_ocean/ocean/native/model-output/mon/ens1/v1/mpaso.hist.am.timeSeriesStatsMonthly.0001-01-01.nc (No error)

In [3]: cdms2.Cdunif.CdunifFile("/p/user_pub/work/E3SM/1_0/1950-Control/0_25deg_atm_18-6km_ocean/ocean/native/model-output/mon/ens1/v1/mpaso.hist.am.timeSeriesStatsMonthly.0001-01-01.nc")
CDMS I/O error: Determining type of file /p/user_pub/work/E3SM/1_0/1950-Control/0_25deg_atm_18-6km_ocean/ocean/native/model-output/mon/ens1/v1/mpaso.hist.am.timeSeriesStatsMonthly.0001-01-01.nc; must specify dictionary (control) file
---------------------------------------------------------------------------
IOError                                   Traceback (most recent call last)
<ipython-input-3-2366dfaf7b37> in <module>()
----> 1 cdms2.Cdunif.CdunifFile("/p/user_pub/work/E3SM/1_0/1950-Control/0_25deg_atm_18-6km_ocean/ocean/native/model-output/mon/ens1/v1/mpaso.hist.am.timeSeriesStatsMonthly.0001-01-01.nc")

IOError: No error

doutriaux1 commented 6 years ago

originally reported by Sterling

doutriaux1 commented 6 years ago

using cdms2 from cdat8

jypeter commented 6 years ago

@doutriaux1 what do you mean by really big (even if I can see there are many points)? How many GB? Can you also post the output of ncdump -k mpaso.hist.am.timeSeriesStatsMonthly.0001-01-01.nc? Checking the netCDF type (as recognized by ncdump) this way helps in some cases.

doutriaux1 commented 6 years ago

@jypeter apparently it needs a newer version of netCDF. See also the issue linked below, which contains a link to the data (~75 GB):

https://github.com/conda-forge/libnetcdf-feedstock/issues/42

jypeter commented 6 years ago

That's ridiculously huge indeed! I think ppl are almost certain to run into lots of problems with (too) big files anyway

dnadeau4 commented 6 years ago

@jypeter this link is an April Fools' Day joke from 2012. The problem is due to a new signature "C,D,F,5" in the netCDF file.

dnadeau4 commented 6 years ago

ncdump -k mpaso.hist.am.timeSeriesStatsMonthly.0001-01-01.nc
cdf5
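
For anyone who wants to check a file without ncdump: the cdf5 kind reported above corresponds to the first four bytes of the file, "C", "D", "F" followed by the byte value 5. Below is a minimal sketch that reads those leading bytes directly. The function name sniff_netcdf_kind and the returned labels are made up for illustration and are not part of cdms2 or Cdunif. The HDF5 check only looks at offset 0 and cannot tell netCDF-4 from netCDF-4 classic model, so treat it as a rough hint.

```python
import sys

# Leading bytes of each on-disk format. The labels are illustrative only;
# they loosely follow what `ncdump -k` prints.
_SIGNATURES = (
    (b"CDF\x01", "classic"),
    (b"CDF\x02", "64-bit offset"),
    (b"CDF\x05", "cdf5"),
    (b"\x89HDF\r\n\x1a\n", "netCDF-4 (HDF5)"),
)


def sniff_netcdf_kind(path):
    """Return a label based on the file's first bytes, or None if unrecognized.

    Hypothetical helper: only the first 8 bytes are read, so this works
    the same on a 75 GB file as on a tiny one.
    """
    with open(path, "rb") as f:
        head = f.read(8)
    for magic, label in _SIGNATURES:
        if head.startswith(magic):
            return label
    return None


if __name__ == "__main__":
    # Usage: python sniff.py file1.nc file2.nc ...
    for p in sys.argv[1:]:
        print(p, sniff_netcdf_kind(p))
```

If this returns "cdf5" and the installed netCDF library predates CDF-5 support, a failure to open like the one above would be expected.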

dnadeau4 commented 6 years ago

This is fixed, but I will keep it open. I need to write a test for it.