verification script of spanish team

konradmayer commented 1 year ago

related #6

konradmayer commented 1 year ago

To be able to run with the DeepR code, our output needs to be modified:

datasets merged (as xarray has problems to merge the files) with cdo mergetime
rename of variable for baseline and model dataset to prediction using cdo chname
removing of the grid_mapping attribute in the samos data and selection of the prediction variable (i.e. removing the crs variable - if grid_mapping is existing cdo and nco operators won't remove the variable) with ncatted -a grid_mapping,prediction,d,, samos.nc followed by cdo select
rename dimensions in baseline and model data to match the cerra dimension names: for the baseline e.g. ncrename -d x,longitude era5_regridded.nc, ncrename -d y,latitude era5_regridded.nc, then cdo chname,x,longitude,y,latitude era5_regridded.nc out.nc
the validation script cannot handle na values - there are some present on the edges due to the cdo bilinear interpolation in the baseline as well as samos output. clip all data on the edges with ncks -d latitude,5,159 -d longitude,2,237 era5_regridded.nc -O out.nc

modified data currently at /scratch/klifol/kmayer/tmp/code4earth/validation

konradmayer commented 1 year ago

config file at resources/configuration_validation_netcdf.yml

# Define the anchor for the 'locations' list
locations_anchor: &locations
  [
    { "name": "Ibiza-Baleares", "lat": 38.9067, "lon": 1.4206 },
    { "name": "Mallorca-Baleares", "lat": 39.6953, "lon": 3.0176 },
    { "name": "Pyrenees-Spain", "lat": 42.5751, "lon": 1.6536 },
    { "name": "Madrid-Spain", "lat": 40.4168, "lon": -3.7038 },
    { "name": "Barcelona-Spain", "lat": 41.3851, "lon": 2.1734 },
    { "name": "Picos_de_Europa-Spain", "lat": 43.1963, "lon": -4.7461 },
    { "name": "Alicante-Spain", "lat": 38.3452, "lon": -0.4810 },
    { "name": "Valencia-Spain", "lat": 39.4699, "lon": -0.3763 },
    { "name": "Malaga-Spain", "lat": 36.7213, "lon": -4.4213 },
    { "name": "Almeria-Spain", "lat": 36.8381, "lon": -2.4597 },
    { "name": "Alboran_Sea-Spain", "lat": 35.9393, "lon": -3.2231 },
    { "name": "Balearic_Sea-Spain", "lat": 39.8223, "lon": 2.6480 }
  ]

validation:
  model_name: "samos"
  model_predictions_location: /scratch/klifol/kmayer/tmp/code4earth/datasets/model/ #/ssea/SSEA/C4E/DATA/TESTING/SAMOS/postprocessed
  baseline_name: "bicubic"
  baseline_predictions_location: /scratch/klifol/kmayer/tmp/code4earth/datasets/baseline/ #/ssea/SSEA/C4E/DATA/TESTING/PREPROCESSED/ERA5_regridded
  observations_name: "cerra"
  observations_location: /scratch/klifol/kmayer/tmp/code4earth/datasets/cerra/ #/ssea/SSEA/C4E/DATA/TESTING/PREPROCESSED/CERRA
  visualization_types:
    metrics_global_map:
      show_baseline: True
      color_palette: None
    sample_observation_vs_prediction:
      number_of_samples: 10
    time_series_for_a_single_site:
      locations: *locations
      temporal_subset: None
      aggregate_by: ["1D", "7D", "15D", "1M"]
      color_palette: None
    error_time_series_for_a_single_site:
      locations: *locations
      temporal_subset: None
      aggregate_by: [ "1D", "7D", "15D", "1M" ]
      color_palette: None
    error_distribution_for_a_single_site:
      locations: *locations
      temporal_subset: None
      color_palette: None
    boxplot_for_a_single_site:
      locations: *locations
      group_by: ["hour", "month", "season"]
      color_palette: None
  validation_dir: /scratch/klifol/kmayer/tmp/code4earth/validation/

konradmayer commented 1 year ago

job gets killed after running out of memory on vsci-dev (64GB RAM)

konradmayer commented 1 year ago

currently try to make it run on our SLURM cluster, but for some reason the scheduler does not give me more than 16GB RAM. IT is already investigating

konradmayer commented 1 year ago

high RAM demand obviously came from inconsistent dimension naming among datasets (and therefore higher dimensional output). there is another error now which i'll investigate further tomorrow

konradmayer commented 1 year ago

works now after all changes to out output as outlined in https://github.com/ECMWFCode4Earth/tesserugged/issues/11#issuecomment-1713829586 (still need to put this into a script)

ECMWFCode4Earth / tesserugged

verification script of spanish team #11