plesager / ece3-postproc

Suite of processing tools for EC-Earth3 output

icediags not working after shaconemo updates #29

Closed etiennesky closed 5 years ago

etiennesky commented 6 years ago

Hi, I ran into this problem with the recent trunk on CCA and MN4. Somehow, the _icemod.nc output file is quite different from the other files. This causes cdficediags to fail:

/usr/local/apps/cdftools/3.0/bin/cdficediags a13s_1m_18500101_18501231_icemod.nc -lim3
 NetCDF: Invalid dimension ID or name                                            
  Exact dimension name x not found in a13s_1m_18500101_18501231_icemod.nc
cp icediags.nc /scratch/ms/spesiccf/c3et/a13s/18500101/fc0/runtime/ece3post/post/mon/Post_1850/a13s_1850_icediags.nc
cp: cannot stat `icediags.nc': No such file or directory

I did not see this on our other platform, mn4. I think Klaus also found the same issue and reported it in the ece2cmor3 tracker. https://github.com/EC-Earth/ece2cmor3/issues/174

klauswyser commented 6 years ago

Indeed, I had to rename the x_grid_T/y_grid_T dimensions in the icemod file to x/y, see https://github.com/plesager/barakuda/pull/1/commits/fe2ea424763cb1a60df34b3784f7e2bd3ec64e3c

The reason is that with the newest file_def there are fields on the T-grid, on the U-grid and on the V-grid in the same icemod file, and xios therefore makes the grid dimension names unique. Uwe tried removing the ice velocities; in that case you get the x/y dimensions as before, but because we need ice velocities for CMIP6 we cannot do that. Therefore we now have a quick fix in Barakuda that renames the x_grid_T/y_grid_T dimensions. Something similar is needed in emop (I have a local edit, not committed) and most likely also for hiresclim2.
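
To check whether a given icemod file is affected before renaming anything, one could inspect the dimension names in its header. The sketch below is only an illustration: `list_dims` and `needs_rename` are hypothetical helper names, and it assumes the header comes from `ncdump -h` in the usual CDL layout (a `dimensions:` section followed by `variables:`).

```shell
#!/bin/sh
# Hypothetical helpers, not part of ece3-postproc or Barakuda.

# Read a CDL header (as printed by `ncdump -h`) on stdin and print the
# dimension names, one per line.
list_dims() {
    awk '
        /^dimensions:/ { in_dims = 1; next }   # start of the dimensions section
        /^[a-z]+:/     { in_dims = 0 }         # any later section (variables:, data:)
        in_dims && /=/ {
            sub(/^[ \t]+/, "")                 # drop leading indentation
            sub(/[ \t]*=.*/, "")               # drop " = size ;" and anything after
            print
        }
    '
}

# Succeed if the header on stdin has x_grid_T but no plain x dimension,
# i.e. the file uses the grid-specific names described above.
needs_rename() {
    dims=$(list_dims)
    echo "$dims" | grep -qx 'x_grid_T' && ! echo "$dims" | grep -qx 'x'
}
```

A possible use would be `ncdump -h ${froot}_icemod.nc | needs_rename && echo "rename needed"`, so that files from older file_def versions are left untouched.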

etiennesky commented 6 years ago

Thanks @klauswyser working on this fix right now, will do a larger MR shortly.

etiennesky commented 6 years ago

Hi @klauswyser unfortunately it does not work for me on CCA, using nco/4.6.1 and also nco/4.6.7

c3et@cca-login4:/scratch/ms/spesiccf/c3et/tmp_ecearth3/tmp/hireclim2_a13s_TVuYwT> ncrename -d .x_grid_T,x -d .y_grid_T,y a13s_1m_18500101_18501231_icemod.nc tmp1.nc
ncrename: In total renamed 0 attributes, 2 dimensions, 0 groups, and 0 variables
c3et@cca-login4:/scratch/ms/spesiccf/c3et/tmp_ecearth3/tmp/hireclim2_a13s_TVuYwT> ncdump -h tmp1.nc
ncdump: tmp1.nc: NetCDF: HDF error

plesager commented 6 years ago

@etiennesky this looks very much like an issue with the version of netcdf you are using. I use an alias:

type ncdump
ncdump is aliased to `/opt/cray/netcdf/4.3.0/bin/ncdump'

etiennesky commented 6 years ago

Hi @plesager my ncdump points to the same one as yours. Other netcdf commands (cdo, etc.) also fail on that same file, and I get the same problem on our BSC machines.

The following works for me:

ncks -3 ${froot}_icemod.nc ${froot}_icemod_tmp.nc
ncrename -O -d .x_grid_T,x -d .y_grid_T,y ${froot}_icemod_tmp.nc ${froot}_icemod.nc
rm -f ${froot}_icemod_tmp.nc

plesager commented 6 years ago

OK, from the nco manual:

Caveat lector: Unfortunately from 2007–present (February, 2018) the netCDF library (versions 4.0.0–4.6.0) contains bugs or limitations that sometimes prevent NCO from correctly renaming coordinate variables, dimensions, and groups in netCDF4 files. (To our knowledge the netCDF library calls for renaming always work well on netCDF3 files so one workaround to many netCDF4 issues is convert to netCDF3, rename, then convert back).

Going through netCDF3 is probably the safest and most portable thing we can do. But we should convert back to netCDF4-classic so that compression is available (we can use $cdo copy for that instead of just cp on the line right after calling $cdftoolsbin/cdficediags).
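
The round trip described here (convert to netCDF3, rename, convert back) can be sketched as a small shell function. This is an illustration under assumptions, not the code that was committed: `rename_ice_dims` is a hypothetical name, the temporary file names are made up, and the last step uses ncks rather than cdo. The format flag for the final file is a parameter because, as the thread shows, `-7` (netCDF4-classic) is not accepted by older ncks versions while `-4` is.

```shell
#!/bin/sh
# Hypothetical helper: rename x_grid_T/y_grid_T to x/y via a netCDF3 copy.
#   $1  netCDF file to fix in place
#   $2  ncks format flag for the final file (default: -4)
rename_ice_dims() {
    file=$1
    fmt=${2:--4}
    tmp3=${file%.nc}_tmp.nc       # netCDF3 copy of the input
    tmp_ren=${file%.nc}_tmp2.nc   # netCDF3 copy with renamed dimensions

    # 1. convert to netCDF3, 2. rename (the leading dot makes a missing
    # dimension a non-error), 3. convert back to the requested format.
    ncks -3 -O "$file" "$tmp3" &&
    ncrename -O -d .x_grid_T,x -d .y_grid_T,y "$tmp3" "$tmp_ren" &&
    ncks "$fmt" -O "$tmp_ren" "$file"
    status=$?

    rm -f "$tmp3" "$tmp_ren"
    return $status
}
```

For example, `rename_ice_dims ${froot}_icemod.nc -7` would request netCDF4-classic where the installed ncks supports it. Whether ncrename itself succeeds still depends on the NCO and netCDF library versions in use.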

plesager commented 6 years ago

@etiennesky, on cca, I cannot use ncrename from nco/4.6.7 on the netCDF3 file. But if I switch to the older nco/4.3.7, then it works. Was your test on a BSC machine?

plesager commented 6 years ago

Fixed in 687a7ad.

klauswyser commented 6 years ago

Sorry for not replying earlier. I had the very same issue with ncrename on our CRAY but not on our Linux cluster. For Barakuda I have solved it with cdo: Barakuda (and probably hiresclim2) actually makes a copy of the NEMO files in your tmpdir with rsync (this is needed to adjust some variable names, a legacy from older NEMO versions and cdftools). For the icemod and SBC files, I have replaced rsync with cdo -f nc, and then everything just works fine.
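
A minimal sketch of that staging step, under stated assumptions: `stage_nemo_file` is a hypothetical name, the `*_icemod.nc` and `*_SBC.nc` patterns are guesses at how the two file types are recognised, and `copy` is assumed as the cdo operator since the comment only gives `cdo -f nc`.

```shell
#!/bin/sh
# Hypothetical helper: copy one NEMO output file into the working directory.
#   $1  source file
#   $2  destination file
stage_nemo_file() {
    src=$1
    dst=$2
    case $(basename "$src") in
        *_icemod.nc|*_SBC.nc)
            # Rewrite as classic netCDF so later renaming does not hit
            # the netCDF4 renaming problems.
            cdo -f nc copy "$src" "$dst" ;;
        *)
            # All other files are copied unchanged.
            rsync -a "$src" "$dst" ;;
    esac
}
```

The cost of this design is that the icemod and SBC files are fully rewritten instead of copied byte for byte, which is slower but avoids a separate format-conversion step before the rename.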

etiennesky commented 6 years ago

Hi, I will test this fix on our MN4 machine as well as on CCA. On MN4 I had tested only the fix, not the complete hiresclim2 (with ncrename and so on). And I did not run into your problems on CCA: with only my fix applied, hiresclim2 ran fine.

etiennesky commented 6 years ago

Hi again @plesager I see that you put the fix inside the if (( $newercdftools )) block, which means that the error might appear when using $newercdftools. Have you tested it?

plesager commented 6 years ago

Correct. Not tested, I need to switch machines for that. But further testing shows that the timeseries tool is also broken. So I am testing something else altogether. Stay tuned...

etiennesky commented 6 years ago

FYI I had put the previous lines before the $newercdftools block and it worked for me on CCA.

plesager commented 6 years ago

See last commit.

plesager commented 6 years ago

Indirect consequence of switching to netCDF3 format: I am not able to apply cdo -f nc4 to the resulting file, and hence no compression is available. Although this may be specific to my installation (I got an HDF5 write error), I pushed a simple fix (88cf794) that lets me run cdo -f nc4c -z zip on the file with renamed dimensions. It should not affect anybody else.

Correction: correct fix is commit e923e0b

etiennesky commented 6 years ago

currently nco/4.3.7 is loaded, but ncks -7 is not supported... we should either use nco/4.6.7 or save to netcdf4 format (not netcdf4-classic)

    ncks -7 -O ${froot}_icemod_tmp2.nc ${froot}_icemod.nc
    rm -f ${froot}_icemod_tmp.nc ${froot}_icemod_tmp2.nc
fi
cdo setmissval: Processed 8879136 values from 7 variables over 12 timesteps ( 0.39s )
ncks: invalid option -- '7'

etiennesky commented 6 years ago

Somehow I cannot re-open this issue @plesager

plesager commented 5 years ago

we should either use nco/4.6.7 or save to netcdf4 format (not netcdf4-classic)

nco/4.6.7 is not an option, since ncrename does not work. Let's try the full netcdf4 format.

plesager commented 5 years ago

Replacing -7 with -4 works on rhino.
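
If both old and new NCO versions have to be supported, the choice between the two flags could also be made at run time. The sketch below is one way to do it, not what was committed: `to_netcdf4` is a hypothetical name, and it assumes that an ncks without `-7` support exits with a non-zero status, as the "invalid option" message above suggests.

```shell
#!/bin/sh
# Hypothetical helper: write $1 to $2 as netCDF4-classic if this ncks
# supports -7, otherwise fall back to full netCDF4.
to_netcdf4() {
    # The error output of the first attempt is hidden; note that any other
    # failure of the -7 call also triggers the fallback.
    ncks -7 -O "$1" "$2" 2>/dev/null || ncks -4 -O "$1" "$2"
}
```

It would be called as `to_netcdf4 ${froot}_icemod_tmp2.nc ${froot}_icemod.nc`, in place of the hard-coded ncks line.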

etiennesky commented 5 years ago

Why use the netcdf4 format for this variable, when other variables are in netcdf3 format (if I am not mistaken)? Since we are already running cdo setmissval on the file, an additional ncks is just a waste of time.

This fix works for us on marenostrum4 and cca

-    $cdo setmissval,0 ${froot}_icemod_tmp.nc ${froot}_icemod_tmp2.nc
-    ncks -7 -O ${froot}_icemod_tmp2.nc ${froot}_icemod.nc
-    rm -f ${froot}_icemod_tmp.nc ${froot}_icemod_tmp2.nc
+    $cdo setmissval,0 ${froot}_icemod_tmp.nc ${froot}_icemod.nc
+    rm -f ${froot}_icemod_tmp.nc

plesager commented 5 years ago

I tried your way, and got again the HDF5 error I mentioned above:

Error (cdf_put_att_text) : NetCDF: String match to name in use
HDF5-DIAG: Error detected in HDF5 (1.8.18) thread 0:
  #000: H5A.c line 1640 in H5Aexists(): not a location
    major: Invalid arguments to routine
    minor: Inappropriate type
  #001: H5Gloc.c line 195 in H5G_loc(): invalid group ID
    major: Invalid arguments to routine
    minor: Bad value
Error (cdf_close) : NetCDF: HDF error
cdf_put_att_text : ncid = 131072 varid = -1 att = _NCProperties text = version=1|netcdflibversion=4.4.1.1|hdf5libversion=1.8.18

This is happening (and I have no idea why) in the $cdozip line in this block:

# ** ice diagnostics
tempf=$(mktemp $SCRATCH/tmp_ecearth3/tmp/hireclim2_nemo_XXXXXX)
$cdo selvar,iiceconc,iicethic ${froot}_icemod.nc $tempf
$cdozip splitvar $tempf ${out}_
rm -f $tempf

klauswyser commented 5 years ago

You could try saving $tempf in plain netCDF (2?) format:

$cdo -f nc selvar,iiceconc,iicethic ${froot}_icemod.nc $tempf

plesager commented 5 years ago

Tried but no success. I've tried several workarounds. Bottom line: my cdo cannot output (whatever the operator, even a simple copy) ${froot}_icemod.nc into nc4 or nc4c format after ncrename has been applied on its dimensions. That means that the following fails:

- $cdo setmissval,0 ${froot}_icemod_tmp.nc ${froot}_icemod.nc
+ $cdo -f nc4 setmissval,0 ${froot}_icemod_tmp.nc ${froot}_icemod.nc

I tried another version of cdo (1.9.3), which is installed on the system, and now ifs_monthly.sh crashes because of the different output from cdo showtime... (Never-ending story; when do we rewrite the whole thing in Python?) For lack of time, I do not have many choices: either we use ncks to reformat the file, or we do not compress the output (two occurrences):

- $cdozip splitvar $tempf ${out}_
+ $cdo splitvar $tempf ${out}_
...
-    $cdozip selvar,iiceconc,iicethic ${froot}_icemod.nc ${out}_ice.nc    
+    $cdo selvar,iiceconc,iicethic ${froot}_icemod.nc ${out}_ice.nc

Any preference?

plesager commented 5 years ago

Went with the ncks solution. See commit 0da525106f689e5b263c95e14e2e418719ea33cc