ESMValGroup / ESMValTool

ESMValTool: A community diagnostic and performance metrics tool for routine evaluation of Earth system models in CMIP
https://www.esmvaltool.org
Apache License 2.0
210 stars 122 forks source link

NorESM2 model output: Dimensions mismatch between areacello and sea ice fields #3578

Open TomasTorsvik opened 2 months ago

TomasTorsvik commented 2 months ago

Describe the bug There is a mismatch between between dimensions for the ocean and sea ice components in NorESM2-LM and NorESM2-MM. Ocean fields are defined on a 360x385 grid, whereas sea ice use a 360x384 grid. The background for this difference is explained in the NorESM2 documentation: https://noresm-docs.readthedocs.io/en/noresm2/faq/postp_plotting_faq.html#different-sea-ice-and-ocean-grid

As a result, the areacello variable is not consistent with any of the sea ice fields. To make it consistent, it is necessary to remove the final j-row of areacello. When loading area data with xarray, I get a consistent area definition by doing the following:

# Ocean Area
Aocn = xarray.open_dataset('<path-to-areacello>')
# Sea ice Area: select coordinate values j = 1..384
Aice = Aocn.sel(j=range(1,385))

Would it be possible to implement a fix in ESMValTool to correct Ofx datasets when used in combination with sea ice output for NorESM2? Or could there be better ways to handle this issue?

valeriupredoi commented 1 month ago

cheers for pointing this out @TomasTorsvik :beer: We should get one such fix in esmvalcore's fixes - can you help by creating one perhaps? Or perhaps a more general approach to de-haloing (this is, in fact, a halo removal type of issue), what say you @ESMValGroup/technical-lead-development-team

TomasTorsvik commented 1 month ago

@valeriupredoi - Thanks for responding. I'm happy to help in any way I can, but I'm not sure how to implement such a fix. I suppose the fixes should go in the files ESMValCore/esmvalcore/cmor/_fixes/cmip6/noresm2_lm.py and ESMValCore/esmvalcore/cmor/_fixes/cmip6/noresm2_mm.py? As I see it, Ofx datasets should be adjusted to remove the halo, but only when used in combination with SIday or SImon datasets. I see the siage dataset (to take an example file) contains global attributes

    :external_variables = "areacello" ;
    :realm = "seaIce" ;

So I'm thinking the fix could be made conditional on these attributes in some way, either by replacing the external_variables with one where the halo has been removed. Would it be possible to define a new areacello file where the halo has been removed, to be used only for the seaIce realm, or apply a fix on the areacello dataset before it is applied to the seaIce field?

bouweandela commented 1 month ago

@TomasTorsvik Do you have any connection with the NorESM2 developers? The simplest solution would be if they just uploaded the sea ice areacello file to ESGF as well, e.g. under the SImon table id instead of Ofx. Would that be an option?

TomasTorsvik commented 1 month ago

@bouweandela - Thanks for your suggestion. I'm involved in NorESM2 development, although not directly with production of CMIP6 data. I will make a request to upload the sea ice areacello file to ESGF.

TomasTorsvik commented 1 month ago

 @bouweandela , @valeriupredoi - I contacted the people working with CMIP6 for NorESM2, but the response was not very positive regarding the suggestion to include areacello for sea ice under SImon. There is concern that having two different areacello files for the same experiment may cause conflicts and confusion for data users. So it seems that having a second areacello file is not an option.

valeriupredoi commented 1 month ago

Well worth a try and many thanks for doing that @TomasTorsvik 🍺 I kind of expected that tbf (multiple reasons). Let's work on a fix that removes the halo (and prob make that available to other such cases) 👍

bouweandela commented 1 month ago

There is concern that having two different areacello files for the same experiment may cause conflicts and confusion for data users.

Great to hear that they are so considerate about not confusing the users of their data.

I suppose the fixes should go in the files ESMValCore/esmvalcore/cmor/_fixes/cmip6/noresm2_lm.py and ESMValCore/esmvalcore/cmor/_fixes/cmip6/noresm2_mm.py?

yes

As I see it, Ofx datasets should be adjusted to remove the halo, but only when used in combination with SIday or SImon datasets.

Unfortunately, such a conditional removal is not possible in the current framework. Would it be an option to remove the halo for all variables that have it? Or would that cause other problems?

Would it be possible to define a new areacello file where the halo has been removed, to be used only for the seaIce realm, or apply a fix on the areacello dataset before it is applied to the seaIce field?

Yes, but only if such a new file would be uploaded to ESGF.

TomasTorsvik commented 1 month ago

As I see it, Ofx datasets should be adjusted to remove the halo, but only when used in combination with SIday or SImon datasets.

Unfortunately, such a conditional removal is not possible in the current framework. Would it be an option to remove the halo for all variables that have it? Or would that cause other problems?

All 2D/3D output from the ocean model component includes the halo. It might be possible technically to remove the halo from all ocean variables, but it seems like a massive undertaking to do this for all CMIP6 experiments and upload new files to ESGF (especially since the current files are working just fine for the ocean part).

bouweandela commented 1 month ago

I meant: write a fix in ESMValCore that removes the halo (unconditionally). That should be fairly straightforward.

TomasTorsvik commented 3 weeks ago

@valeriupredoi , @bouweandela - maybe you can say if I'm on the right track on this? I have done some testing for removing the "halo" column from ocean fields. At the moment I'm using the following function, which works with the 2D/3D/4D variables that I have looked at (specifically areacello, tos, thetao).

def remove_halo(cube):
    num_dim_coords = len(cube.dim_coords)
    # if present, "cell index along second dimension" should be second to last dimension
    if cube.dim_coords[num_dim_coords-2].long_name == 'cell index along second dimension':
        print("found second dimension")
        if cube.dim_coords[num_dim_coords-2].shape[0] == 385:
            print("removing halo cells")
            if num_dim_coords==2:
                cube_no_halo = cube[0:-1, :]
                cube = cube_no_halo
            if num_dim_coords==3:
                cube_no_halo = cube[:, 0:-1, :]
                cube = cube_no_halo
            if num_dim_coords==4:
                cube_no_halo = cube[:, :, 0:-1, :]
                cube = cube_no_halo
    return cube

This solution makes the assumptions that the halo-column is always second to last, has the shape (385, ), and is only 2D/3D/4D dimensions. Would be nice to make something more general, but I haven't figured out how to do this.

bouweandela commented 3 weeks ago

You could probably do something like the following to make it a bit more widely applicable:

if cube.coords('latitude'):
    lat = cube.coord('latitude')
    if lat.shape == (385, 360):
        halo_dim = cube.coord_dims(lat)[0]
        slices = tuple(slice(0, -1) if i == halo_dim else slice(None) for i in range(cube.ndim))
        cube = cube[slices]