gotm-model / code

Source code for the General Ocean Turbulence Model
https://gotm.net
GNU General Public License v2.0

z and zi depend on time, lat, lon in netcdf output #45

Open bderembl opened 4 months ago

bderembl commented 4 months ago

Hello,

this is a quick note to say that in the output file, z and zi are functions of time, lat and lon. I don't know if this is the intended behavior. It seems to me that it should not be, since I read in gotm.F90


   call fm%register_dimension('lon',1,id=id_dim_lon)
   call fm%register_dimension('lat',1,id=id_dim_lat)
   call fm%register_dimension('z',nlev,id=id_dim_z)
   call fm%register_dimension('zi',nlev+1,id=id_dim_zi)
   call fm%register_dimension('time',id=id_dim_time)

and when I do ncdump -h entrainment.nc

I see

        float z(time, z, lat, lon) ;
                z:units = "m" ;
                z:long_name = "depth (center)" ;
                z:standard_name = "??" ;
                z:path = "/column_structure" ;
                z:positive = "up" ;
                z:axis = "Z" ;
        float zi(time, zi, lat, lon) ;
                zi:units = "m" ;
                zi:long_name = "depth (interface)" ;
                zi:standard_name = "??" ;
                zi:path = "/column_structure" ;
                zi:positive = "up" ;
                zi:axis = "Z" ;

This is really a minor issue but in ncview my pointer does not show the correct depth: it shows a time instead.

Is there any chance this could be corrected? It seems that this is handled by flexout so it might not be the best place to submit this issue?

Thank you

bolding commented 4 months ago

Hello @bderembl

The first 4 lines define dimensions - giving each a name, size and id.

The next code section defines coordinates. In simple cases - and in early NetCDF days - there was a convention that variables with the same name as a dimension would be a coordinate variable and they would have a rank of 1. This is likely still valid in many cases - but it is not mandatory - and not very flexible. In the case of GOTM where you can have time varying layer heights - the coordinate for e.g. the center of the grid boxes must be time varying.

A NetCDF variable in GOTM looks like - float tke(time, zi, lat, lon). It is correct that - if the vertical coordinate for a variable is time varying - ncview will not be able to display it correctly.
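The distinction described above can be sketched in a few lines of plain Python. This is only an illustration: the function name `classify_variable` and its return strings are made up for this example and are not part of GOTM, flexout, ncview or any NetCDF library. The variable names and dimension orders are the ones shown in the ncdump output and in this comment.

```python
def classify_variable(name, dims):
    """Classify a NetCDF variable by comparing its name with its dimensions.

    Hypothetical helper for illustration only.
    """
    dims = tuple(dims)
    if dims == (name,):
        # Classic convention: rank 1 and named after its only dimension.
        return "coordinate variable"
    if name in dims:
        # Same name as one of its dimensions, but rank greater than 1.
        return "multidimensional variable sharing a dimension name"
    return "ordinary variable"


# Names and dimensions as they appear in this thread.
variables = {
    "time": ("time",),
    "z": ("time", "z", "lat", "lon"),
    "zi": ("time", "zi", "lat", "lon"),
    "tke": ("time", "zi", "lat", "lon"),
}

for name, dims in variables.items():
    print(f"{name}{dims}: {classify_variable(name, dims)}")
```

Only `time` fits the classic rank-1 pattern here; `z` and `zi` fall into the second category, which is the case simple viewers tend not to handle.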

Jorn has developed pyncview - and it knows the inner workings of GOTM.

[Image: Temperature from the Liverpool Bay standard case]

You can install pyncview via: pip install pyncview.

Karsten

bderembl commented 4 months ago

makes sense! thank you

legaya commented 3 weeks ago

Hello @bolding, I understand your answer, but I think one related consequence is that it's impossible to directly open a netcdf file created by GOTM with the widely used Python library xarray. The xarray command "xr.open_dataset('my_profile.nc')" gives me the following error:

MissingDimensionsError: 'z' has more than 1-dimension and the same name as one of its dimensions ('time', 'z', 'lat', 'lon'). xarray disallows such variables because they conflict with the coordinates used to label dimensions.

Do you see a way of correcting that, to allow users to use xarray directly? Thanks in advance, Alexandre

TobiasKAndersen commented 3 weeks ago

Hi @legaya

I use GOTM coupled to WET via FABM and process netcdf model results via xarray in Python. Due to xarray's decision to follow netcdf conventions, you need to pass the drop_variables argument when you want to open GOTM netcdf output. For instance: xr.open_dataset(model_output_folder+"output.nc", drop_variables=['z','zi'])

Cheers, Tobias

bolding commented 3 weeks ago

Hi Alexandre

Not to blame xarray - but :-)

I know Jorn once looked more into this and found some information on xarray acknowledging that there is an issue that should be dealt with by - xarray.

I can't find the information right now - but maybe when Jorn is back from a meeting (in some days) he can shed more light on the problem.

Karsten

bderembl commented 3 weeks ago

Stepping back into this issue: an easy way to follow netcdf convention would be to use a different name for z (dimension) and z (data)? Same for zi?

bolding commented 3 weeks ago

but are we breaking any NetCDF conventions?

bolding commented 3 weeks ago

from https://cfconventions.org/cf-conventions/cf-conventions.html#coordinate-types section 5

This applies to the COARDS convention

Any of a variable’s dimensions that is an independently varying latitude, longitude, vertical, or time dimension (see Section 1.3, "Terminology") and that has a size greater than one must have a corresponding coordinate variable, i.e., a one-dimensional variable with the same name as the dimension (see examples in Chapter 4, Coordinate Types). This is the only method of associating dimensions with coordinates that is supported by [COARDS].

and this to the CF-convention

Any longitude, latitude, vertical or time coordinate which depends on more than one spatiotemporal dimension must be identified by the coordinates attribute of the data variable.

So it is allowed but requires an appropriate coordinates attribute

Karsten
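As a rough illustration of the CF rule quoted above, the sketch below builds the value of a `coordinates` attribute for a data variable from the multidimensional coordinates whose dimensions it shares. The helper name `coordinates_attribute` is hypothetical, and whether GOTM's output already carries such an attribute is not stated in this thread. The variable names and dimensions are taken from the ncdump output earlier in the issue.

```python
def coordinates_attribute(data_dims, aux_coords):
    """Build a blank-separated CF 'coordinates' attribute value.

    Hypothetical helper: selects coordinates that depend on more than one
    dimension and whose dimensions all belong to the data variable.
    """
    names = [
        name
        for name, dims in aux_coords.items()
        if len(dims) > 1 and set(dims) <= set(data_dims)
    ]
    return " ".join(names)


# Time-varying vertical coordinates as they appear in the GOTM output.
aux_coords = {
    "z": ("time", "z", "lat", "lon"),
    "zi": ("time", "zi", "lat", "lon"),
}

# tke lives on layer interfaces, so it would point at zi.
print(coordinates_attribute(("time", "zi", "lat", "lon"), aux_coords))
```

A variable defined on layer centers, with dimensions (time, z, lat, lon), would get `z` by the same logic.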

bolding commented 3 weeks ago

further information - where the issue is discussed for GOTM https://mail.google.com/mail/u/0/#search/label%3Agotm-users+xarray/FMfcgzGlkjZKBSnXKrBBJCQfZwrWpTlV

and in the xarray community https://github.com/pydata/xarray/issues/2368

TobiasKAndersen commented 3 weeks ago

Yeah, no it was not to say that GOTM breaks any CF conventions. Sorry about that!

It also does not help that xarray terminology differs from CF: https://docs.xarray.dev/en/latest/user-guide/data-structures.html#coordinates

I guess the challenge lies with xarray, as many other models with netcdf output have similar problems...

legaya commented 3 weeks ago

Thanks for this interesting discussion and your solution @TobiasKAndersen !