ACCESS-NRI / accessdev-Trac-archive

Archive accessdev Trac contents as issues
Apache License 2.0

MOM should use netCDF-4 with compression #360

Closed penguian closed 5 years ago

penguian commented 6 years ago

resolution_fixed | by mrd599@nci.org.au


At the moment both restart and history files are in netCDF-3 format:

% ncdump -k /short/p66/dhb599/archive/av630/restart/ocn/ocean_temp_salt.res.nc-02010630
classic

%  ncdump -k /short/p66/dhb599/archive/av630/history/ocn/ocean_month.nc-02010630 
64-bit offset

Using netCDF4 with even the lowest compression level reduces the size by roughly 55% to 70% for the two files tried here.

% nccopy -k 3 -d 1 ocean_month.nc-02010630 ocean_month.nc4

reduces size from 3605 MB to 1054 MB

% nccopy -k 3 -d 1 ocean_temp_salt.res.nc-02010630 ocean_temp_salt.res.nc4

reduces size from 82 MB to 37 MB.
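
As a quick check on the figures above, the reductions can be worked out directly. This is only a sketch, assuming a POSIX shell with awk; pct_reduction is a hypothetical helper name, not part of any netCDF tool.

```shell
# Percentage size reduction given an original and a compressed size.
# Both sizes must be in the same unit (MB here).
pct_reduction() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", 100 * (before - after) / before }'
}

pct_reduction 3605 1054   # ocean_month history file: prints 70.8
pct_reduction 82 37       # ocean_temp_salt restart file: prints 54.9
```

So the history file shrinks by about 71% and the restart file by about 55%.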


Issue migrated from trac:360 at 2024-01-31 18:33:52 +1100

penguian commented 6 years ago

@martin.dix@anu.edu.au commented


It should be simple to change the model and mppnccombine to write compressed netCDF4. Alternatively, the files could just be rewritten in a post-processing step. Using netCDF4-classic with compression should be transparent to MOM when it reads restart files.

Note that this compression is lossless.
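
netCDF-4 compression uses zlib's DEFLATE, the same algorithm gzip uses, so the lossless property can be illustrated without any netCDF tools. The sketch below is only an analogy under that assumption: it round-trips an ordinary file through gzip at level 1 and checks that the bytes are unchanged. The helper name deflate_roundtrip_ok and the sample data are made up for illustration.

```shell
# Round-trip a file through DEFLATE (via gzip) and compare with the original.
# Returns 0 only if the decompressed bytes match the input exactly.
deflate_roundtrip_ok() {
    gzip -1 -c "$1" | gzip -d -c | cmp -s - "$1"
}

sample=$(mktemp)
# Hypothetical sample data: 10000 lines of slowly varying numbers.
awk 'BEGIN { for (i = 0; i < 10000; i++) printf "%.6f\n", 20 + i / 1000 }' > "$sample"

if deflate_roundtrip_ok "$sample"; then
    echo "round trip is byte-identical"
fi
rm -f "$sample"
```

For real model output, the equivalent check is a variable-by-variable comparison of the original and compressed netCDF files.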

penguian commented 6 years ago

@aidan.heerdegen@anu.edu.au commented


This has been done already, and used routinely in production runs in ocean/ice simulations.

FMS can already write netCDF4 files; it just needs to be enabled at compile time. In the MOM5 compile script

https://github.com/mom-ocean/MOM5/blob/43986e20236cc74b83cd5ee09ebd798f4da40824/exp/MOM_compile.csh

it is exposed as the command line option --use_netcdf4. Once you have an executable that writes netCDF4 files, the following options added to input.nml set the compression parameters:

 &mpp_io_nml
    deflate_level = 5
    shuffle = 1
/

I recommend using shuffle: it can give up to a 10% improvement "for free". Similarly, I found a deflate level of 4 or 5 to be the sweet spot for compression, but your mileage may vary.

This will compress both the restarts and the outputs. If your outputs are tiled, they will be written out again by mppnccombine, so it needs compression support as well. We have a version with compression enabled here:

https://github.com/mom-ocean/MOM5/blob/43986e20236cc74b83cd5ee09ebd798f4da40824/src/postprocessing/mppnccombine/mppnccombine.c

It requires command line options to enable compression. The full option list is -n4 -z -d 5, but this is equivalent to just -z, since the default deflate level is 5, shuffle is on by default, and netCDF4 output should be enabled if compression is specified.

penguian commented 6 years ago

@martin.dix@anu.edu.au commented


Coupled model suites load the default netcdf module, which is version 4.2.1.1.

MOM_compile.csh in access_cm2_drivers has

set cppDefs  = ( "-Duse_netCDF -Duse_netCDF3 -Duse_libMPI -DACCESS -DACCESS_CM" )

Modified this to have -Duse_netCDF4 in place of -Duse_netCDF3.

Built ~access/access_cm2/utils/mppnccombine_nc4 from the version described above and modified mppcombine.sh to use ~access/access-cm2/utils/mppnccombine_nc4 -n4 -z -v -r ...

Use rev 544 of access-cm2-drivers to get these updates.

penguian commented 6 years ago

@martin.dix@anu.edu.au commented


Tested in branch nc4 of suite u-aq959. Output from one-month runs:

% ls -l aq959/history/ocn/
total 954436
-rw-r-----  1 mrd599 p66  26809712 Mar  9 15:40 ocean_daily.nc-00010131
-rw-r-----  1 mrd599 p66 950492160 Mar  9 15:40 ocean_month.nc-00010131
-rw-r-----+ 1 mrd599 p66     28024 Mar  9 15:36 ocean_scalar.nc-00010131
% ls -l aq959-nc4/history/ocn/
total 258252
-rw-r-----  1 mrd599 p66  11830684 Mar  9 15:10 ocean_daily.nc-00010131
-rw-r-----  1 mrd599 p66 252472829 Mar  9 15:11 ocean_month.nc-00010131
-rw-r-----+ 1 mrd599 p66    132629 Mar  9 15:03 ocean_scalar.nc-00010131

% du -h aq959/restart/ocn/
1.1G    aq959/restart/ocn/
% du -h aq959-nc4/restart/ocn/
320M    aq959-nc4/restart/ocn/
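
A comparison like the listings above can be tabulated per file. The following is a minimal sketch, assuming a POSIX shell with awk and that the two directories hold files with matching names; compare_sizes is a hypothetical helper, not part of the suite.

```shell
# For every file in directory $1 that also exists in directory $2, print:
#   name  original_bytes  compressed_bytes  compressed/original
compare_sizes() {
    for f in "$1"/*; do
        name=$(basename "$f")
        [ -f "$2/$name" ] || continue
        orig_bytes=$(wc -c < "$f" | tr -d ' ')
        comp_bytes=$(wc -c < "$2/$name" | tr -d ' ')
        [ "$orig_bytes" -gt 0 ] || continue   # avoid dividing by zero
        awk -v n="$name" -v a="$orig_bytes" -v b="$comp_bytes" \
            'BEGIN { printf "%s %d %d %.2f\n", n, a, b, b / a }'
    done
}

# Directory names taken from the listings above.
compare_sizes aq959/history/ocn aq959-nc4/history/ocn
```

A ratio above 1 means the file grew. In the listing above that is the case for the small ocean_scalar file (132629 bytes against 28024), while the daily and monthly files shrink substantially.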

nccmp shows that the files are identical apart from format:

nccmp -s -w format -d aq959/history/ocn/ocean_daily.nc-00010131 aq959-nc4/history/ocn/ocean_daily.nc-00010131
DIFFER : FILE FORMATS : NC_FORMAT_64BIT <> NC_FORMAT_NETCDF4_CLASSIC
Files "aq959/history/ocn/ocean_daily.nc-00010131" and "aq959-nc4/history/ocn/ocean_daily.nc-00010131" are identical.

Suite changes are https://code.metoffice.gov.uk/trac/roses-u/changeset?reponame=&new=71115%40a%2Fq%2F9%2F5%2F9%2Fnc4%2Fapp&old=71108%40a%2Fq%2F9%2F5%2F9%2Fnc4%2Fapp

Essentially, update revision in app/fcm_make_drivers/rose-app.conf and add a new namelist mpp_io_nml to the MOM runtime configuration.

penguian commented 5 years ago

@martin.dix@anu.edu.au changed status from new to closed

penguian commented 5 years ago

@martin.dix@anu.edu.au set resolution to fixed