eurec4a / how_to_eurec4a

Code examples to get you started with EUREC⁴A data.
https://howto.eurec4a.eu
MIT License
6 stars 20 forks source link

Details about EUREC4A-MIP boundary conditions #94

Open observingClouds opened 1 year ago

observingClouds commented 1 year ago

This issue addresses a remaining discussion of #84

The section on boundary conditions and pseudo-global warming data is currently missing some details that need to be added to make the available output useful for new users.

observingClouds commented 1 year ago

@torresalavez could you try to add some details? I'd be happy to help.

torresalavez commented 1 year ago

@observingClouds @fjansson I uploaded COSMO hourly outputs (at 2 km of horizontal resolution) for BC. You can find the containers for CTRL and PGW simulations. For each experiment, I created 2 directories , containing 3D fields (U,V, W, QV, QI, QC) and 2D fields (TS and PS). Please let me know if you have any question.

observingClouds commented 1 year ago

Thanks @torresalavez for updating us on the progress. Could you add this information to the EUREC4A-MIP page here: https://github.com/eurec4a/how_to_eurec4a/blob/master/how_to_eurec4a/eurec4a_mip.md That would be great. The information is then available to everyone who is interested.

observingClouds commented 1 year ago

@torresalavez sorry to bother you. It would be great to have this information on howto.eurec4a.eu before the CF-MIP conference stars on July 9th. Intensive discussions about the EUREC4A-MIP will happen there and it would be great to have the access and documentation up-to-date by than.

observingClouds commented 1 year ago

@torresalavez examples where some additional explanation is needed are, e.g.:

observingClouds commented 1 year ago

@torresalavez also, the coordinates of the boundary conditions are incorrect (at least of COSMO 2D): image Maybe rlon(rlon=1451):units = "degrees" should be degrees_east? Even if I assume this domain is at the right location, then I wonder if the variable name is very fitting as there are also temperatures on the ocean. Please clarify the meaning of this variable also in the documentation.

torresalavez commented 1 year ago

In ncview the location is right. I do not know what is the problem in panoply. Could I share a python file to open it?

torresalavez commented 1 year ago

Where I can write the comments on the name of the variable? and the specifications on half and lowest layer in ERA5?

observingClouds commented 1 year ago

Where I can write the comments on the name of the variable? and the specifications on half and lowest layer in ERA5?

I suggest adding them somewhere in the boundary condition section. Just suggest an edit, make your additions, commit and create a Pull Request. I will take a look at it and merge it.

observingClouds commented 1 year ago

In ncview the location is right. I do not know what is the problem in panoply. Could I share a python file to open it?

Ncview is does not seem to interpret the coordinates (lat, lon) and the rotated grid information (rotated_grid), but rather shows just an image with the dimensions rlat and rlon. Panoply is smarter and actually goes the extra mile and looks up the correct coordinates for the dimension. I think what happened here is that the rotated_grid is incorrect. I cannot find too much information about the attribute grid_north_pole_longitude, but it is defined in degrees_east and probably sets the reference point for the rotated grid. In the current files grid_north_pole_longitude=-180 and therefore sets the reference at the date line. With ncatted -a 'grid_north_pole_longitude,rotated_pole,o,f,0' lffd20200101000000.nc lffd20200101000000_modified.nc I get the correct result: Screenshot 2023-07-12 at 22 29 02

To be honest, I would even suggest to get rid of the rotated_grid all together, because the data is on a regular grid, i.e. all lat and lon values are identical independent of their longitude and latitude. Only some minor rounding errors are present:

import numpy as np
import xarray as xr
ds = xr.open_dataset("lffd20200101000000.nc")
for row in ds.lon.values.T:
    print(np.unique(row)
# [-64.]
# [-63.98]
# [-63.96]
# [-63.94]
# [-63.92]
# [-63.9]
# [-63.88]
# [-63.86]
# [-63.84]
# [-63.82]
# [-63.8]
# [-63.78]
# [-63.76]
# [-63.739998]
# [-63.72]
# ....
for row in ds.lat.values:
     print(np.unique(row))
# [6.]
# [6.02]
# [6.04]
# [6.06]
# [6.08]
# [6.1]
# [6.12]
# [6.14]
# [6.16]
# [6.18      6.1800003]
# [6.2       6.2000003]
# [6.2200003]

This would drastically reduce the complexity and make it easier to use and understand.

I would suggest to aim to create files with the following metadata:

dimensions:
    time = UNLIMITED ; // (1 currently)
    bnds = 2 ;
-   rlon = 1451 ;
-   rlat = 901 ;
+   lon = 1451 ;
+   lat = 901 ;
variables:
    double time(time) ;
        time:standard_name = "time" ;
        time:long_name = "time" ;
        time:bounds = "time_bnds" ;
        time:units = "seconds since 2020-01-01T00:00:00Z" ;
        time:calendar = "proleptic_gregorian" ;
        time:axis = "T" ;
    double time_bnds(time, bnds) ;
-   float lon(rlat, rlon) ;
+   float lon(lon) ;
        lon:standard_name = "longitude" ;
        lon:long_name = "longitude" ;
        lon:units = "degrees_east" ;
        lon:_CoordinateAxisType = "Lon" ;
-   float lat(rlat, rlon) ;
+   float lat(lon) ;
        lat:standard_name = "latitude" ;
        lat:long_name = "latitude" ;
        lat:units = "degrees_north" ;
        lat:_CoordinateAxisType = "Lat" ;
-   double rlon(rlon) ;
-       rlon:standard_name = "projection_x_coordinate" ;
-       rlon:long_name = "rotated longitude" ;
-       rlon:units = "degrees" ;
-       rlon:axis = "X" ;
-   double rlat(rlat) ;
-       rlat:standard_name = "projection_y_coordinate" ;
-       rlat:long_name = "rotated latitude" ;
-       rlat:units = "degrees" ;
-       rlat:axis = "Y" ;
-   int rotated_pole ;
-       rotated_pole:long_name = "coordinates of the rotated North Pole" ;
-       rotated_pole:grid_mapping_name = "rotated_latitude_longitude" ;
-       rotated_pole:grid_north_pole_latitude = 90.f ;
-       rotated_pole:grid_north_pole_longitude = -180.f ;
-   float PS(time, rlat, rlon) ;
+   float PS(time, lat, lon) ;
        PS:standard_name = "surface_air_pressure" ;
        PS:long_name = "surface pressure" ;
        PS:units = "Pa" ;
-       PS:grid_mapping = "rotated_pole" ;
        PS:coordinates = "lat lon" ;
        PS:cell_methods = "time: point" ;
-   float T_S(time, rlat, rlon) ;
+   float T_S(time, lat, lon) ;
        T_S:long_name = "soil surface temperature" ;
        T_S:units = "K" ;
-       T_S:grid_mapping = "rotated_pole" ;
        T_S:coordinates = "lat lon" ;
        T_S:cell_methods = "time: point" ;

Does this make sense to you @torresalavez ?

torresalavez commented 1 year ago

I agree, maybe we have to keep both data sets (the original and rotated) and the users decide. We should explain the curvilinear grid of the model and the changes/errors in the second dataset.

observingClouds commented 1 year ago

As far as I'm concerned the data structure looks like it is on a curvilinear grid, but it actually isn't. It is already on a regular grid. rlat and rlon are, despite some numerical rounding artefacts, identical to the locations saved in lon and lat. And lon and lat could be 1D. There is no additional information in the 2D. lat does not change with lon and vice versa.

torresalavez commented 1 year ago

You are right this data is extracted from original (I processed it with cdo).