COSIMA / access-om2

ACCESS-OM2 global ocean - sea ice coupled model configurations.

Remove static fields from CICE output #201

Open aekiss opened 4 years ago

aekiss commented 4 years ago

All CICE history outputs currently include the following identical static grid data, which is a waste of space, particularly for daily outputs which are one file per day.

    float TLON(nj, ni) ;
        TLON:long_name = "T grid center longitude" ;
        TLON:units = "degrees_east" ;
        TLON:missing_value = 1.e+30f ;
        TLON:_FillValue = 1.e+30f ;
    float TLAT(nj, ni) ;
        TLAT:long_name = "T grid center latitude" ;
        TLAT:units = "degrees_north" ;
        TLAT:missing_value = 1.e+30f ;
        TLAT:_FillValue = 1.e+30f ;
    float ULON(nj, ni) ;
        ULON:long_name = "U grid center longitude" ;
        ULON:units = "degrees_east" ;
        ULON:missing_value = 1.e+30f ;
        ULON:_FillValue = 1.e+30f ;
    float ULAT(nj, ni) ;
        ULAT:long_name = "U grid center latitude" ;
        ULAT:units = "degrees_north" ;
        ULAT:missing_value = 1.e+30f ;
        ULAT:_FillValue = 1.e+30f ;
        ULAT:comment = "Latitude of NE corner of T grid cell" ;
    float NCAT(nc) ;
        NCAT:long_name = "category maximum thickness" ;
        NCAT:units = "m" ;
    float tmask(nj, ni) ;
        tmask:long_name = "ocean grid mask" ;
        tmask:coordinates = "TLON TLAT" ;
        tmask:comment = "0 = land, 1 = ocean" ;
        tmask:missing_value = 1.e+30f ;
        tmask:_FillValue = 1.e+30f ;
    float blkmask(nj, ni) ;
        blkmask:long_name = "ice grid block mask" ;
        blkmask:coordinates = "TLON TLAT" ;
        blkmask:comment = "mytask + iblk/100" ;
        blkmask:missing_value = 1.e+30f ;
        blkmask:_FillValue = 1.e+30f ;
    float tarea(nj, ni) ;
        tarea:long_name = "area of T grid cells" ;
        tarea:units = "m^2" ;
        tarea:coordinates = "TLON TLAT" ;
        tarea:missing_value = 1.e+30f ;
        tarea:_FillValue = 1.e+30f ;
    float uarea(nj, ni) ;
        uarea:long_name = "area of U grid cells" ;
        uarea:units = "m^2" ;
        uarea:coordinates = "ULON ULAT" ;
        uarea:missing_value = 1.e+30f ;
        uarea:_FillValue = 1.e+30f ;
    float dxt(nj, ni) ;
        dxt:long_name = "T cell width through middle" ;
        dxt:units = "m" ;
        dxt:coordinates = "TLON TLAT" ;
        dxt:missing_value = 1.e+30f ;
        dxt:_FillValue = 1.e+30f ;
    float dyt(nj, ni) ;
        dyt:long_name = "T cell height through middle" ;
        dyt:units = "m" ;
        dyt:coordinates = "TLON TLAT" ;
        dyt:missing_value = 1.e+30f ;
        dyt:_FillValue = 1.e+30f ;
    float dxu(nj, ni) ;
        dxu:long_name = "U cell width through middle" ;
        dxu:units = "m" ;
        dxu:coordinates = "ULON ULAT" ;
        dxu:missing_value = 1.e+30f ;
        dxu:_FillValue = 1.e+30f ;
    float dyu(nj, ni) ;
        dyu:long_name = "U cell height through middle" ;
        dyu:units = "m" ;
        dyu:coordinates = "ULON ULAT" ;
        dyu:missing_value = 1.e+30f ;
        dyu:_FillValue = 1.e+30f ;
    float HTN(nj, ni) ;
        HTN:long_name = "T cell width on North side" ;
        HTN:units = "m" ;
        HTN:coordinates = "TLON TLAT" ;
        HTN:missing_value = 1.e+30f ;
        HTN:_FillValue = 1.e+30f ;
    float HTE(nj, ni) ;
        HTE:long_name = "T cell width on East side" ;
        HTE:units = "m" ;
        HTE:coordinates = "TLON TLAT" ;
        HTE:missing_value = 1.e+30f ;
        HTE:_FillValue = 1.e+30f ;
    float ANGLE(nj, ni) ;
        ANGLE:long_name = "angle grid makes with latitude line on U grid" ;
        ANGLE:units = "radians" ;
        ANGLE:coordinates = "ULON ULAT" ;
        ANGLE:missing_value = 1.e+30f ;
        ANGLE:_FillValue = 1.e+30f ;
    float ANGLET(nj, ni) ;
        ANGLET:long_name = "angle grid makes with latitude line on T grid" ;
        ANGLET:units = "radians" ;
        ANGLET:coordinates = "TLON TLAT" ;
        ANGLET:missing_value = 1.e+30f ;
        ANGLET:_FillValue = 1.e+30f ;

Unfortunately I can see no way to just output all the static data to a separate file, as we do with MOM. But many of these can be obtained from the grid.nc input, e.g. /g/data/ik11/inputs/access-om2/input_08022019/cice_01deg/grid.nc so if that were copied to the output directory they could be omitted from the history files.
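A minimal sketch of the "copy grid.nc to the output directory" idea, in case it helps discussion. The helper name `copy_grid_once` and the destination directory are hypothetical; only the grid.nc path quoted above comes from this thread.

```python
import shutil
from pathlib import Path


def copy_grid_once(grid_file, output_dir):
    """Copy the CICE grid file into output_dir unless a copy is already there.

    Returns the path of the copy inside output_dir.
    """
    grid_file = Path(grid_file)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    target = output_dir / grid_file.name
    if not target.exists():
        # copy2 also preserves timestamps, which helps trace provenance later
        shutil.copy2(grid_file, target)
    return target


if __name__ == "__main__":
    # Source path as quoted above; the destination is a hypothetical example.
    src = "/g/data/ik11/inputs/access-om2/input_08022019/cice_01deg/grid.nc"
    if Path(src).exists():
        print(copy_grid_once(src, "archive/output000/ice/OUTPUT"))
```

Because the copy is skipped when the target already exists, calling this once per run would leave a single grid.nc beside the history files rather than one per output file.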

The ones not in grid.nc are

    float NCAT(nc) ;
        NCAT:long_name = "category maximum thickness" ;
        NCAT:units = "m" ;
    float tmask(nj, ni) ;
        tmask:long_name = "ocean grid mask" ;
        tmask:coordinates = "TLON TLAT" ;
        tmask:comment = "0 = land, 1 = ocean" ;
        tmask:missing_value = 1.e+30f ;
        tmask:_FillValue = 1.e+30f ;
    float blkmask(nj, ni) ;
        blkmask:long_name = "ice grid block mask" ;
        blkmask:coordinates = "TLON TLAT" ;
        blkmask:comment = "mytask + iblk/100" ;
        blkmask:missing_value = 1.e+30f ;
        blkmask:_FillValue = 1.e+30f ;
    float dxt(nj, ni) ;
        dxt:long_name = "T cell width through middle" ;
        dxt:units = "m" ;
        dxt:coordinates = "TLON TLAT" ;
        dxt:missing_value = 1.e+30f ;
        dxt:_FillValue = 1.e+30f ;
    float dyt(nj, ni) ;
        dyt:long_name = "T cell height through middle" ;
        dyt:units = "m" ;
        dyt:coordinates = "TLON TLAT" ;
        dyt:missing_value = 1.e+30f ;
        dyt:_FillValue = 1.e+30f ;
    float dxu(nj, ni) ;
        dxu:long_name = "U cell width through middle" ;
        dxu:units = "m" ;
        dxu:coordinates = "ULON ULAT" ;
        dxu:missing_value = 1.e+30f ;
        dxu:_FillValue = 1.e+30f ;
    float dyu(nj, ni) ;
        dyu:long_name = "U cell height through middle" ;
        dyu:units = "m" ;
        dyu:coordinates = "ULON ULAT" ;
        dyu:missing_value = 1.e+30f ;
        dyu:_FillValue = 1.e+30f ;

and I think these could just be omitted. Any objections?

aekiss commented 4 years ago

An alternative to omitting static grid data would be to concatenate the individual daily files into one file per month, so that there'd be only one copy of the static data per month, rather than ~30. This doesn't save quite as much space, but it's close if daily data is output. This is what we did in 01deg_jra55v13_iaf with the script concat_ice_dailies-ALL.sh which submits concat_ice_dailies.sh for each month in the archive that contains unconcatenated daily outputs (see ~aek156/raijin_home/aek156/payu/01deg_jra55v13_iaf/).

However, this would need to be integrated into the sync workflow triggered by the postscript entry in config.yaml, so that the concatenation takes place before the output is synced to ik11. This is kind of tricky, so I'd prefer to just omit the static data and keep the daily outputs unconcatenated.
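For reference, the per-month grouping behind such a concatenation could be sketched as below. This is not the concat_ice_dailies.sh script mentioned above: the daily file naming (`iceh.YYYY-MM-DD.nc`), the output file name and the use of NCO's `ncrcat` are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Assumed daily file naming, e.g. iceh.2000-01-31.nc; adjust to match the run.
DAILY_PATTERN = re.compile(r"^iceh\.(\d{4})-(\d{2})-(\d{2})\.nc$")


def group_dailies_by_month(filenames):
    """Map 'YYYY-MM' to the sorted list of daily files for that month."""
    groups = defaultdict(list)
    for name in filenames:
        match = DAILY_PATTERN.match(name)
        if match:  # skip monthly files and anything else that does not match
            year, month, _day = match.groups()
            groups[f"{year}-{month}"].append(name)
    return {key: sorted(files) for key, files in sorted(groups.items())}


def concat_command(month, files):
    """Build an ncrcat command joining files along the record (time) dimension."""
    # The output file name is hypothetical.
    return ["ncrcat", *files, f"iceh-daily.{month}.nc"]


if __name__ == "__main__":
    names = [
        "iceh.2000-01-02.nc",
        "iceh.2000-01-01.nc",
        "iceh.2000-02-01.nc",
        "iceh.2000-01.nc",  # a monthly file, ignored by the pattern
    ]
    for month, dailies in group_dailies_by_month(names).items():
        print(" ".join(concat_command(month, dailies)))
```

With a record concatenator such as `ncrcat`, variables without a time dimension are written once per output file, which is where the saving from concatenation would come from.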

aidanheerdegen commented 4 years ago

You could also do the concatenation with an archive or run user command/script:

https://payu.readthedocs.io/en/latest/config.html#postprocessing

Using run is attractive, as the paths to the files you want to concatenate are always the same. I don't know how long it takes to run; if it's only a few seconds it would be OK.

aekiss commented 4 years ago

Thanks for the tip - sounds like that could work.

aekiss commented 4 years ago

I tried removing these fields in a 0.25deg run. It reduced the file size by only 7MB per monthly file (135MB down to 128MB, i.e. a 5% reduction), I guess because they are very compressible fields. They'd add up to ~5GB over 60 years at 0.25deg. At 0.1deg it could be ~30GB over 60 years with monthly outputs, and ~900GB over 60 years with daily output. ~This isn't much, and although I don't know of anyone who uses them (mainly because I haven't checked) it seems harmless enough just to leave them in?~

angle anglet dxt dxu dyt dyu hte htn tarea tmask uarea
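A rough back-of-envelope check of these estimates is sketched below. It assumes the per-file saving scales with the number of grid cells, i.e. by (0.25/0.1)^2, that a daily file carries the same static data as a monthly one, and that leap days can be ignored. These are assumptions for illustration rather than a record of how the figures above were obtained.

```python
MB_PER_GB = 1000

# Measured at 0.25deg: monthly file size before and after removing the fields
size_before_mb = 135
size_after_mb = 128
saving_per_file_025_mb = size_before_mb - size_after_mb
percent_reduction = 100 * saving_per_file_025_mb / size_before_mb

years = 60
monthly_files = 12 * years
daily_files = 365 * years  # leap days ignored

# Assumption: static-field size scales with the number of grid cells
cell_ratio = (0.25 / 0.1) ** 2
saving_per_file_01_mb = saving_per_file_025_mb * cell_ratio

monthly_025_gb = saving_per_file_025_mb * monthly_files / MB_PER_GB
monthly_01_gb = saving_per_file_01_mb * monthly_files / MB_PER_GB
daily_01_gb = saving_per_file_01_mb * daily_files / MB_PER_GB

print(f"reduction per 0.25deg monthly file: {percent_reduction:.1f}%")
print(f"0.25deg monthly, {years} years: {monthly_025_gb:.1f} GB")
print(f"0.1deg monthly, {years} years: {monthly_01_gb:.1f} GB")
print(f"0.1deg daily, {years} years: {daily_01_gb:.0f} GB")
```

Under these assumptions the totals come out at roughly 5 GB, 30 GB and a little under 1000 GB, the same order as the estimates quoted above. The real numbers would depend on how well the fields compress at each resolution.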

aekiss commented 4 years ago

I've amended my previous post - this will save ~900GB for 60 yrs of daily 0.1deg data so is worthwhile.

Note that there's no namelist option to remove TLON, TLAT, ULON, ULAT so these would be retained... although in practice (e.g. https://github.com/COSIMA/ACCESS-OM2-1-025-010deg-report/blob/master/figures/ice_timeseries/ice_timeseries.ipynb) we usually use xt_ocean, yt_ocean etc from MOM output as it doesn't have pesky NaNs on land.

We also use cell area data, but typically use MOM's area_t rather than CICE's tarea, again because of NaNs in tarea. Nevertheless others may find it helpful if we keep tarea (and probably uarea also).
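As an illustration of switching the remaining fields off through the namelist, the sketch below rewrites logical flags in a block of namelist text. The flag names (`f_tmask`, `f_dxt`, etc.) and the `&icefields_nml` group name are assumptions based on the usual CICE `f_<field>` convention and should be checked against the actual ice_in. The helper `disable_flags` is hypothetical, and tarea and uarea are deliberately left on, as suggested above.

```python
import re

# Assumed flag names for the static fields to drop; verify against ice_in.
STATIC_FLAGS = [
    "f_tmask", "f_blkmask", "f_dxt", "f_dyt", "f_dxu", "f_dyu",
    "f_HTN", "f_HTE", "f_ANGLE", "f_ANGLET",
]


def disable_flags(namelist_text, flags=STATIC_FLAGS):
    """Return namelist_text with each listed logical flag set to .false.

    Only the .true./.false. spellings are handled; other lines are untouched.
    """
    for flag in flags:
        pattern = re.compile(
            rf"^(\s*{re.escape(flag)}\s*=\s*)\.(?:true|false)\.",
            flags=re.IGNORECASE | re.MULTILINE,
        )
        namelist_text = pattern.sub(r"\g<1>.false.", namelist_text)
    return namelist_text


if __name__ == "__main__":
    # Hypothetical namelist fragment for demonstration only.
    sample = (
        "&icefields_nml\n"
        "    f_tmask = .true.\n"
        "    f_tarea = .true.\n"
        "    f_dxt = .true.\n"
        "/\n"
    )
    print(disable_flags(sample))
```

Matching is case-insensitive because Fortran namelist names are, and the `=` in the pattern keeps a flag such as `f_dxt` from matching a longer name that merely starts with it.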

ofa001 commented 4 years ago

Hi Andrew,

They are such a small fraction of the overall data that it's worth keeping them, as they can be used by different plotting software, as I said before.

Siobhan

Sent by email in reply to Andrew Kiss's comment of Thursday, 4 June 2020, 2:00 PM.
