COSIMA / cosima-cookbook

Framework for indexing and querying ocean-sea ice model output.
https://cosima-recipes.readthedocs.io/en/latest/
Apache License 2.0

Repeated indexing failure on file #242

Closed aidanheerdegen closed 3 years ago

aidanheerdegen commented 3 years ago

There is a problem when indexing a file

https://accessdev.nci.org.au/jenkins/blue/organizations/jenkins/COSIMA%2FCC-Database%20Build/detail/CC-Database%20Build/302/pipeline#log-431

ERROR:root:Error indexing /g/data/ik11/outputs/access-om2-01/01deg_jra55v13_ryf9091/output1079/ocean/ocean_daily_3d_u.nc: NetCDF: HDF error

and it repeatedly fails, but there is nothing apparently wrong with the file.

Accessing the metadata as is done in the indexing step seems to work fine:

import netCDF4

f = '/g/data/ik11/outputs/access-om2-01/01deg_jra55v13_ryf9091/output1079/ocean/ocean_daily_3d_u.nc'

with netCDF4.Dataset(f, "r") as ds:
    # Per-variable metadata: name, dimensions, chunking and attributes
    for v in ds.variables.values():
        print(v.name, v.dimensions, v.chunking())
        for att in v.ncattrs():
            print(v.getncattr(att))
    # Global (file-level) attributes
    for att in ds.ncattrs():
        print(ds.getncattr(att))

produces this with no error:

xu_ocean ('xu_ocean',) [3600]
ucell longitude
degrees_E
X
yu_ocean ('yu_ocean',) [2700]
ucell latitude
degrees_N
Y
st_ocean ('st_ocean',) [75]
tcell zstar depth
meters
Z
down
st_edges_ocean
st_edges_ocean ('st_edges_ocean',) [76]
tcell zstar depth edges
meters
Z
down
time ('time',) [512]
time
days since 1900-01-01 00:00:00
T
NOLEAP
NOLEAP
time_bounds
nv ('nv',) [2]
vertex number
none
N
u ('time', 'st_ocean', 'yu_ocean', 'xu_ocean') [1, 7, 300, 400]
i-current
m/sec
[-10.  10.]
-1e+20
-1e+20
time: mean
average_T1,average_T2,average_DT
geolon_c geolat_c
sea_water_x_velocity
average_T1 ('time',) [512]
Start time for average period
days since 1900-01-01 00:00:00
1e+20
1e+20
average_T2 ('time',) [512]
End time for average period
days since 1900-01-01 00:00:00
1e+20
1e+20
average_DT ('time',) [512]
Length of average period
days
1e+20
1e+20
time_bounds ('time', 'nv') [1, 2]
time axis boundaries
days
1e+20
1e+20
ocean_daily_3d_u.nc
ACCESS-OM2-01
mosaic
1
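
A single clean read does not rule out an intermittent failure, so the same metadata access could be repeated a few times and any errors collected. This is only a minimal sketch: `probe_repeatedly` and `read_metadata` are hypothetical helper names, not part of cosima-cookbook, and the attempt count and delay are arbitrary.

```python
import time


def probe_repeatedly(path, reader, attempts=5, delay=1.0):
    """Call reader(path) several times and collect any error messages.

    Returns a list of (attempt_number, message) tuples, empty if every
    attempt succeeded.
    """
    errors = []
    for attempt in range(1, attempts + 1):
        try:
            reader(path)
        except Exception as exc:  # record every failure, whatever its type
            errors.append((attempt, f"{type(exc).__name__}: {exc}"))
        if attempt < attempts:
            time.sleep(delay)
    return errors


def read_metadata(path):
    """Touch the same metadata as the snippet above, without printing."""
    import netCDF4  # imported lazily so probe_repeatedly has no hard dependency

    with netCDF4.Dataset(path, "r") as ds:
        for v in ds.variables.values():
            v.chunking()
            for att in v.ncattrs():
                v.getncattr(att)
        for att in ds.ncattrs():
            ds.getncattr(att)
```

Calling `probe_repeatedly(f, read_metadata)` would return an empty list if every attempt succeeds. Failures on only some attempts would point towards a transient problem rather than a damaged file.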

AndyHoggANU commented 3 years ago

OK, very weird. Not quite sure what we can do here?

aidanheerdegen commented 3 years ago

I'll do some testing now

aidanheerdegen commented 3 years ago

Indexed fine when I did a test with just a few directories. I wondered if there was an issue with the sheer number of files in that experiment, but it is half that of some of the other experiments:

nfiles experiment
12671  01deg_jra55v13_ryf9091
25198  01deg_jra55v140_iaf_cycle2
26168  01deg_jra55v140_iaf
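
For reference, a table like the one above could be produced by counting files on disk. This is a rough sketch only: it assumes the count is of `*.nc` files under each experiment directory, which may not match how the numbers above were obtained, and `count_netcdf_files` and `format_counts` are hypothetical helper names.

```python
from pathlib import Path


def count_netcdf_files(experiment_dirs):
    """Return {experiment_name: number of *.nc files found recursively}."""
    counts = {}
    for exp in experiment_dirs:
        exp = Path(exp)
        counts[exp.name] = sum(1 for p in exp.rglob("*.nc") if p.is_file())
    return counts


def format_counts(counts):
    """Format the counts as a two-column table, smallest experiment first."""
    lines = ["nfiles experiment"]
    for name, n in sorted(counts.items(), key=lambda kv: kv[1]):
        lines.append(f"{n:<6} {name}")
    return "\n".join(lines)
```

Passing the experiment directories to `count_netcdf_files` and printing `format_counts(...)` gives one row per experiment, which makes it easy to see whether the failing experiment is unusually large.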

claireyung commented 3 years ago

Hey Aidan, I checked on VDI and it looks like the output 01deg_jra55v13_ryf9091/output1079/ocean/ocean_daily_3d_u.nc is now listed in the COSIMA Cookbook database and seems to load, as far as I can tell. I can't check my computation without Gadi, but I think that resolves this issue (though it's still weird that it happened).

aidanheerdegen commented 3 years ago

Thanks for pointing that out @claireyung

You're right, it does seem to be indexed correctly. Not quite sure why it is now working, but I'll close this ticket.