PCMDI / cmor

Climate Model Output Rewriter
BSD 3-Clause "New" or "Revised" License

Check consistency between frequency and table name. #691

Closed taylor13 closed 1 month ago

taylor13 commented 1 year ago

In the CMIP6 archive, there are cases where the CMOR table and the "frequency" are inconsistent. For example, search on table "day", and you will find files contributed with frequency labeled "mon" (and vice versa). Can files written with CMOR contain this kind of error? Can files checked with PrePARE miss this error?

mauzey1 commented 1 year ago

@taylor13 Could you point to examples of this happening in the CMIP6 archive?

taylor13 commented 1 year ago

For a listing, go to https://esgf-node.llnl.gov/search/cmip6/ and search on frequency="mon" and table="day"; it should find 1984 datasets. They're all from one model in this case. There are, I think, a few more instances, but I'm guessing this is not a CMOR/PrePARE problem but operator error (in publishing). I was getting many more mismatches when I used the https://aims2.llnl.gov/search interface, so the new search tool might have a bug.

durack1 commented 1 year ago

@taylor13 this search https://esgf-node.llnl.gov/search/cmip6/?frequency=mon&table_id=day gives 1984 results.

Looks like this might be a DWD issue:

id = CMIP6.DCPP.DWD.MPI-ESM1-2-LR.dcppA-hindcast.s1996-r6i1p1f1.day.tasmax.gn.v20220126|esgf.dwd.de
version = 20220126

durack1 commented 3 months ago

@mauzey1 this looks like another "trivial" 3.9.0 candidate - noting the milestone

taylor13 commented 2 months ago

Does anyone know whether Deutscher Wetterdienst (DWD) used CMOR? If not, then nothing need be done.

durack1 commented 2 months ago

@taylor13

:cmor_version = "3.5.0" ;

// global attributes:
        :Conventions = "CF-1.7 CMIP-6.2" ;
        :activity_id = "ScenarioMIP" ;
        :branch_method = "standard" ;
        :branch_time_in_child = 60265. ;
        :branch_time_in_parent = 60265. ;
        :contact = "cmip6-mpi-esm@dkrz.****" ;
        :creation_date = "2019-12-06T08:17:57Z" ;
        :data_specs_version = "01.00.30" ;
        :experiment = "update of RCP8.5 based on SSP5" ;
        :experiment_id = "ssp585" ;
        :external_variables = "areacella" ;
        :forcing_index = 1 ;
        :frequency = "mon" ;
        :further_info_url = "https://furtherinfo.es-doc.org/CMIP6.DWD.MPI-ESM1-2-HR.ssp585.none.r2i1p1f1" ;
        :grid = "gn" ;
        :grid_label = "gn" ;
        :history = "2019-12-06T08:17:57Z ; CMOR rewrote data to be consistent with CMIP6, CF-1.7 CMIP-6.2 and CF standards." ;
        :initialization_index = 1 ;
        :institution = "Deutscher Wetterdienst, Offenbach am Main 63067, Germany" ;
        :institution_id = "DWD" ;
        :mip_era = "CMIP6" ;
        :nominal_resolution = "100 km" ;
        :parent_activity_id = "CMIP" ;
        :parent_experiment_id = "historical" ;
        :parent_mip_era = "CMIP6" ;
        :parent_source_id = "MPI-ESM1-2-HR" ;
        :parent_time_units = "days since 1850-1-1 00:00:00" ;
        :parent_variant_label = "r2i1p1f1" ;
        :physics_index = 1 ;
        :product = "model-output" ;
        :project_id = "CMIP6" ;
        :realization_index = 2 ;
        :realm = "atmos" ;
        :references = "MPI-ESM: Mauritsen, T. et al. (2019), Developments in the MPI‐M Earth System Model version 1.2 (MPI‐ESM1.2) and Its Response to Increasing CO2, J. Adv. Model. Earth Syst.,11, 998-1038, doi:10.1029/2018MS001400,\n",
            "Mueller, W.A. et al. (2018): A high‐resolution version of the Max Planck Institute Earth System Model MPI‐ESM1.2‐HR. J. Adv. Model. EarthSyst.,10,1383–1413, doi:10.1029/2017MS001217" ;
        :source = "MPI-ESM1.2-HR (2017): \n",
            "aerosol: none, prescribed MACv2-SP\n",
            "atmos: ECHAM6.3 (spectral T127; 384 x 192 longitude/latitude; 95 levels; top level 0.01 hPa)\n",
            "atmosChem: none\n",
            "land: JSBACH3.20\n",
            "landIce: none/prescribed\n",
            "ocean: MPIOM1.63 (tripolar TP04, approximately 0.4deg; 802 x 404 longitude/latitude; 40 levels; top grid cell 0-12 m)\n",
            "ocnBgchem: HAMOCC6\n",
            "seaIce: unnamed (thermodynamic (Semtner zero-layer) dynamic (Hibler 79) sea ice model)" ;
        :source_id = "MPI-ESM1-2-HR" ;
        :source_type = "AOGCM" ;
        :sub_experiment = "none" ;
        :sub_experiment_id = "none" ;
        :table_id = "Amon" ;
        :table_info = "Creation Date:(09 May 2019) MD5:e6ef8ececc8f338646ebfb3aeed36bfc" ;
        :title = "MPI-ESM1-2-HR output prepared for CMIP6" ;
        :variable_id = "tas" ;
        :variant_label = "r2i1p1f1" ;
        :license = "CMIP6 model data produced by DWD is licensed under a Creative Commons Attribution ShareAlike 4.0 International License (https://creativecommons.org/licenses). Consult https://pcmdi.llnl.gov/CMIP6/TermsOfUse for terms of use governing CMIP6 output, including citation requirements and proper acknowledgment. Further information about this data, including some limitations, can be found via the further_info_url (recorded as a global attribute in this file) and. The data producers and data providers make no warranty, either express or implied, including, but not limited to, warranties of merchantability and fitness for a particular purpose. All liabilities arising from the supply of the information (including any liability arising in negligence) are excluded to the fullest extent permitted by law." ;
        :cmor_version = "3.5.0" ;
        :_NCProperties = "version=2,netcdf=4.6.2,hdf5=1.10.5" ;
        :tracking_id = "hdl:21.14100/b6a1c830-564b-4eb5-be87-33163502e300" ;
}
(base) -bash-4.2$ ncdump -h /p/css03/esgf_publish/CMIP6/ScenarioMIP/DWD/MPI-ESM1-2-HR/ssp585/r2i1p1f1/Amon/tas/gn/v20190710/tas_Amon_MPI-ESM1-2-HR_ssp585_r2i1p1f1_gn_201501-201912.nc

taylor13 commented 2 months ago

I note two things in the ncdump header listing that suggest this might not have been written with CMOR:

  1. There is a non-standard global attribute included: "_NCProperties".
  2. The "source" global attribute hosts a vector of text strings, but I think CMOR invariably writes a single text string.

The second item, once we verify that it is true, would definitively tell us that this dataset was not written with CMOR. If that is the case, then the inconsistency referred to above might not be a "CMOR" problem, although PrePARE should trap the error.

I note that the above file is not an example of the case that the frequency and table name are inconsistent (because they both indicate this is monthly data).

durack1 commented 2 months ago

@taylor13 I think this is a legitimate CMOR-written file. An example from E3SM-1-1 (which definitely uses CMOR, wrapped) is below: it has the same details and formatting, plus extra ncclimo attributes that they add or tweak after the fact.

// global attributes:
        :Conventions = "CF-1.7 CMIP-6.2" ;
        :activity_id = "CMIP" ;
        :branch_method = "standard" ;
        :branch_time_in_child = 0. ;
        :branch_time_in_parent = 0. ;
        :contact = "Dave Bader (bader2@****)" ;
        :creation_date = "2019-12-11T17:36:13Z" ;
        :data_specs_version = "01.00.31" ;
        :experiment = "all-forcing simulation of the recent past" ;
        :experiment_id = "historical" ;
        :external_variables = "areacella" ;
        :forcing_index = 1 ;
        :frequency = "mon" ;
        :further_info_url = "https://furtherinfo.es-doc.org/CMIP6.E3SM-Project.E3SM-1-1.historical.none.r1i1p1f1" ;
        :grid = "data regridded to a CMIP6 standard 1x1 degree lonxlat grid from the native grid using an area-average preserving method." ;
        :grid_label = "gr" ;
        :history = "2019-12-11T17:36:13Z ;rewrote data to be consistent with CMIP for variable tas found in table Amon.;\n",
            "Output from 20181217.BDRD_CNPCTC20TR_OIBGC.ne30_oECv3.edison" ;
        :initialization_index = 1 ;
        :institution = "LLNL (Lawrence Livermore National Laboratory, Livermore, CA 94550, USA); ANL (Argonne National Laboratory, Argonne, IL 60439, USA); BNL (Brookhaven National Laboratory, Upton, NY 11973, USA); LANL (Los Alamos National Laboratory, Los Alamos, NM 87545, USA); LBNL (Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA); ORNL (Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA); PNNL (Pacific Northwest National Laboratory, Richland, WA 99352, USA); SNL (Sandia National Laboratories, Albuquerque, NM 87185, USA). Mailing address: LLNL Climate Program, c/o David C. Bader, Principal Investigator, L-103, 7000 East Avenue, Livermore, CA 94550, USA" ;
        :institution_id = "E3SM-Project" ;
        :mip_era = "CMIP6" ;
        :nominal_resolution = "100 km" ;
        :parent_activity_id = "CMIP" ;
        :parent_experiment_id = "piControl" ;
        :parent_mip_era = "CMIP6" ;
        :parent_source_id = "E3SM-1-1" ;
        :parent_time_units = "days since 0001-01-01" ;
        :parent_variant_label = "r1i1p1f1" ;
        :physics_index = 1 ;
        :product = "model-output" ;
        :realization_index = 1 ;
        :realm = "atmos" ;
        :references = "Burrows S.M., M.E. Maltrud, X. Yang, Q. Zhu, N. Jeffery, X. Shi, and D.M. Ricciuto, et al. 2019. \'The DOE E3SM coupled model v1.1 biogeochemistry configuration: overview and evaluation of coupled carbon-climate experiments.\' Journal of Advances in Modeling Earth Systems. In review; http://e3sm.org" ;
        :source = "E3SM 1.1 (2019): \n",
            "aerosol: MAM4 with resuspension, marine organics, and secondary organics (same grid as atmos)\n",
            "atmos: EAM (v1.1, cubed sphere spectral-element grid; 5400 elements with p=3; 1 deg average grid spacing; 90 x 90 x 6 longitude/latitude/cubeface; 72 levels; top level 0.1 hPa)\n",
            "atmosChem: Troposphere specified oxidants for aerosols. Stratosphere linearized interactive ozone (LINOZ v2) (same grid as atmos)\n",
            "land: ELM (v1.1, same grid as atmos; active biogeochemistry using the Converging Trophic Cascade plant and soil carbon and nutrient mechanisms to represent carbon, nitrogen and phosphorus cycles), MOSART (v1.1, 0.5 degree latitude/longitude grid)\n",
            "landIce: none\n",
            "ocean: MPAS-Ocean (v6.0, oEC60to30 unstructured SVTs mesh with 235160 cells and 714274 edges, variable resolution 60 km to 30 km; 60 levels; top grid cell 0-10 m)\n",
            "ocnBgchem: BEC (Biogeochemical Elemental Cycling model, NPZD-type with C/N/P/Fe/Si/O; same grid as ocean)\n",
            "seaIce: MPAS-Seaice (v6.0; same grid as ocean)" ;
        :source_id = "E3SM-1-1" ;
        :source_type = "AOGCM BGC AER" ;
        :sub_experiment = "none" ;
        :sub_experiment_id = "none" ;
        :table_id = "Amon" ;
        :table_info = "Creation Date:(24 July 2019) MD5:c93735846d66458966fc81f390b2d714" ;
        :title = "E3SM-1-1 output prepared for CMIP6" ;
        :variable_id = "tas" ;
        :variant_label = "r1i1p1f1" ;
        :license = "CMIP6 model data produced by E3SM is licensed under a Creative Commons Attribution ShareAlike 4.0 International License (https://creativecommons.org/licenses). Consult https://pcmdi.llnl.gov/CMIP6/TermsOfUse for terms of use governing CMIP6 output, including citation requirements and proper acknowledgment. Further information about this data, including some limitations, can be found via the further_info_url (recorded as a global attribute in this file) and at https:///pcmdi.llnl.gov/. The data producers and data providers make no warranty, either express or implied, including, but not limited to, warranties of merchantability and fitness for a particular purpose. All liabilities arising from the supply of the information (including any liability arising in negligence) are excluded to the fullest extent permitted by law." ;
        :cmor_version = "3.5.0" ;
        :_NCProperties = "version=2,netcdf=4.6.2,hdf5=1.10.5" ;
        :tracking_id = "hdl:21.14100/a360be6a-895f-4631-8db4-d07b50bd21b4" ;
        :e3sm_source_code_doi = "10.11578/E3SM/dc.20180418.36" ;
        :e3sm_paper_reference = "https://doi.org/10.1029/2018MS001603" ;
        :e3sm_source_code_reference = "https://github.com/E3SM-Project/E3SM/releases/tag/v1.0.0" ;
        :doe_acknowledgement = "This research was supported as part of the Energy Exascale Earth System Model (E3SM) project, funded by the U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research." ;
        :computational_acknowledgement = "The data were produced using resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231." ;
        :ncclimo_generation_command = "ncclimo --var=${var} -7 --dfl_lvl=1 --no_cll_msr --no_frm_trm --no_stg_grd --yr_srt=1 --yr_end=500 --ypf=25 --map=map_ne30np4_to_cmip6_180x360_aave.20181001.nc " ;
        :ncclimo_version = "4.8.1-alpha04" ;
}
(base) -bash-4.2$ ncdump -h /p/css03/esgf_publish/CMIP6/CMIP/E3SM-Project/E3SM-1-1/historical/r1i1p1f1/Amon/tas/gr/v20191211/tas_Amon_E3SM-1-1_historical_r1i1p1f1_gr_185001-185912.nc

mauzey1 commented 2 months ago

I noticed an inconsistency between the Metagrid and NetCDF metadata. I looked at the dataset CMIP6.DCPP.DWD.MPI-ESM1-2-LR.dcppA-hindcast.s1975-r2i1p1f1.day.tasmax.gn from the search https://aims2.llnl.gov/search/cmip6/?frequency=mon&table_id=day. The info in the "Metadata" tag for this dataset lists frequency: mon and table_id: day, but the file header contains frequency = "mon" and table_id = "Amon". The frequency and table name in the file suggest that it's monthly data, but the file name and the table name in Metagrid suggest that it's daily.

Looking at the time dimension of the file, I can see that it is in increments of days not months. Does that make it daily data?

taylor13 commented 2 months ago

Is the source attribute passed via python to CMOR a single character string:

"E3SM 1.1 (2019): \n aerosol: MAM4 with resuspension, marine organics, and 
secondary organics (same grid as atmos)\n atmos: EAM (v1.1, cubed sphere 
spectral-element grid; 5400 elements with p=3; 1 deg average grid spacing; 
90 x 90 x 6 longitude/latitude/cubeface; 72 levels; top level 0.1 hPa)\n atmosChem: 
Troposphere specified oxidants for aerosols. Stratosphere linearized interactive 
ozone (LINOZ v2) (same grid as atmos)\n ..."

or an array of character strings:

"E3SM 1.1 (2019): \n",
"aerosol: MAM4 with resuspension, marine organics, and secondary organics (same grid as atmos)\n",
"atmos: EAM (v1.1, cubed sphere spectral-element grid; 5400 elements with p=3; 1 deg average grid spacing; 90 x 90 x 6 longitude/latitude/cubeface; 72 levels; top level 0.1 hPa)\n",
"atmosChem: Troposphere specified oxidants for aerosols. Stratosphere linearized interactive ozone (LINOZ v2) (same grid as atmos)\n",
          .
          .
          .

Which is it? I'm trying to figure out whether a single character string with embedded linefeeds gets rendered as several separate character strings by ncdump (using the "\n" to say where to break up the single string).

If the single string is only broken up for display, then the above files could be consistent with having been written by CMOR.
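As far as I know, ncdump does display a single text attribute containing linefeeds as several quoted pieces, one per line, so the listings above are compatible with a single stored string. The sketch below only imitates that display convention; `render_like_ncdump` and the sample value are hypothetical, it is not ncdump's own code, and it does not escape embedded quotes.

```python
def render_like_ncdump(name, value, indent="            "):
    """Roughly imitate how `ncdump -h` shows a text attribute with embedded linefeeds."""
    pieces = value.split("\n")
    # Every piece except the last gets its linefeed back as a literal backslash-n,
    # so the displayed pieces together still describe one single string.
    quoted = [f'"{p}\\n"' for p in pieces[:-1]] + [f'"{pieces[-1]}"']
    return f":{name} = " + (",\n" + indent).join(quoted) + " ;"


# Hypothetical single string, not the real attribute from either file.
source = "Model X (2019): \naerosol: none\natmos: none"
print(render_like_ncdump("source", source))
```

The printed result looks like an "array" of strings even though only one string went in, which is the point in question.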

I did learn that "_NCProperties" was not stored in the netCDF files until netcdf-4, which I didn't know about until now, since that version didn't exist when I was actively developing CMOR.

taylor13 commented 2 months ago

@mauzey1 said

Looking at the time dimension of the file, I can see that it is in increments of days not months. Does that make it daily data?

Thanks for looking into this. Yes, if you read the coordinate values for time and they increment by 1, then the data is considered "day". (Note the units should be "days since ....".)
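The rule of thumb above can be sketched as a small check on the time coordinate values. This is only an illustration: `infer_frequency`, its tolerances, and the sample values are assumptions, and it expects time values expressed in days ("days since ...").

```python
from statistics import median


def infer_frequency(time_values):
    """Guess 'day' or 'mon' from time coordinate values given in days."""
    if len(time_values) < 2:
        raise ValueError("need at least two time values")
    steps = [b - a for a, b in zip(time_values, time_values[1:])]
    step = median(steps)  # the median is robust to a single odd step
    if abs(step - 1.0) < 0.01:
        return "day"
    if 28.0 <= step <= 31.0:
        return "mon"
    return "unknown"


# Hypothetical values: consecutive daily midpoints, with a declared frequency of "mon".
declared = "mon"
inferred = infer_frequency([45960.5, 45961.5, 45962.5, 45963.5])
if inferred != declared:
    print(f"frequency attribute says {declared!r} but the time axis looks like {inferred!r}")
```

Comparing the inferred value with the "frequency" global attribute would flag exactly the kind of file discussed here.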

It seems likely (but we need to look at ncdump to check) that the frequency stored as a global attribute in the file is also incorrect (i.e., "mon" rather than "day"). If the global attribute in the file is "mon", then CMOR probably read a modified "day" CMOR table, which in its header must have had an entry for the variable "tas" that incorrectly read:

            "frequency": "mon", 

It should have been "day". CMOR doesn't check whether the frequency in the table is consistent with the table's name.
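A check of this kind could look roughly like the sketch below. It assumes the table is a JSON object with a "variable_entry" section whose entries each carry a "frequency", and it uses a hypothetical naming rule (table names ending in "day" or "mon"); `frequency_mismatches` and the trimmed table are illustrations, not part of CMOR.

```python
def frequency_mismatches(table_name, table):
    """Return {variable: frequency} for entries that disagree with the table name."""
    expected = None
    for suffix in ("day", "mon"):  # hypothetical rule; other tables are not judged
        if table_name.endswith(suffix):
            expected = suffix
    if expected is None:
        return {}
    return {
        var: entry.get("frequency")
        for var, entry in table.get("variable_entry", {}).items()
        # startswith() lets variants such as "monC" pass for monthly tables
        if not str(entry.get("frequency", "")).startswith(expected)
    }


# Hypothetical, heavily trimmed stand-in for a modified "day" table.
table = {
    "Header": {"table_id": "Table day", "approx_interval": "1.00000"},
    "variable_entry": {
        "tas": {"frequency": "day"},
        "tasmax": {"frequency": "mon"},
    },
}
print(frequency_mismatches("day", table))  # prints {'tasmax': 'mon'}
```

In practice the dictionary would come from `json.load` on the table file actually used, and the rule for tables with mixed frequencies would need more care.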

In the latest CMOR tables, the frequency is specified; see CMIP6_day.json.

In the unlikely event that the global attribute for "frequency" was in fact reported correctly, then something must have gone wrong during publication of the dataset to change the frequency from "day" to "mon".

mauzey1 commented 1 month ago

I've tested the file that has daily data but has a monthly frequency and table by running it through PrePARE. Full output below.

PrePARE /Users/mauzey1/Downloads/tasmax_day_MPI-ESM1-2-LR_dcppA-hindcast_s1975-r2i1p1f1_gn_19751101-19851231.nc

C Traceback:
! In function: _CV_ValidateAttribute
! called from: _CV_checkGblAttributes
! 

!!!!!!!!!!!!!!!!!!!!!!!!!
!
! Error: The attribute "license" could not be validated. 
! The current input value is "CMIP6 model data produced by DWD is licensed under a Creative Commons Attribution ShareAlike 4.0 International License (CC-BY-SA 4.0 https://creativecommons.org/licenses). Consult https://pcmdi.llnl.gov/CMIP6/TermsOfUse for terms of use governing CMIP6 output, including citation requirements and proper acknowledgment. Further information about this data, including some limitations, can be found via the further_info_url (recorded as a global attribute in this file) and. The data producers and data providers make no warranty, either express or implied, including, but not limited to, warranties of merchantability and fitness for a particular purpose. All liabilities arising from the supply of the information (including any liability arising in negligence) are excluded to the fullest extent permitted by law.", which is not valid. 
! 
! Valid values must match those found in the "license" section
! of your Controlled Vocabular
!
!!!!!!!!!!!!!!!!!!!!!!!!!

C Traceback:
! In function: _CV_ValidateAttribute
! called from: _CV_checkGblAttributes
! 

!!!!!!!!!!!!!!!!!!!!!!!!!
!
! Error: The attribute "nominal_resolution" could not be validated. 
! The current input value is "200 km", which is not valid. 
! 
! Valid values must match those found in the "nominal_resolution" section
! of your Controlled Vocabulary (CV) file "Tables/CMIP6_CV.json".
! 
!
!!!!!!!!!!!!!!!!!!!!!!!!!

C Traceback:
! In function: _CV_checkGblAttributes
! 

!!!!!!!!!!!!!!!!!!!!!!!!!
!
! Error: Your Control Vocabulary file specifies one or more
! required attributes.  The following
! attribute was not properly set.
! 
! Please set attribute: "source" in your input file.
!
!!!!!!!!!!!!!!!!!!!!!!!!!

C Traceback:
! In function: _CV_checkGblAttributes
! 

!!!!!!!!!!!!!!!!!!!!!!!!!
!
! Error: Please fix required attributes mentioned in
! the warnings/error above and rerun. (aborting!)
! 
!
!!!!!!!!!!!!!!!!!!!!!!!!!

C Traceback:
In function: _CV_setInstitution
! 

!!!!!!!!!!!!!!!!!!!!!!!!!
!
! Warning: Your input attribute institution "Deutscher Wetterdienst, Offenbach 63067, Germany" will be replaced with 
! "Deutscher Wetterdienst, Offenbach am Main 63067, Germany" as defined in your Control Vocabulary file.
! 
!
!!!!!!!!!!!!!!!!!!!!!!!!!

C Traceback:
! In function: _CV_checkGrids
! 

!!!!!!!!!!!!!!!!!!!!!!!!!
!
! Error: Your attribute grid_resolution is set to "200 km" which is invalid.
! 
! Check your Control Vocabulary file "".
! 
!
!!!!!!!!!!!!!!!!!!!!!!!!!

C Traceback:
! In function: 

!!!!!!!!!!!!!!!!!!!!!!!!!
!
! Error: Your global attribute "creation_date" set to "01-28-22TJan:58:1643385537Z" is not a valid date.
! ISO 8601 date format "YYYY-MM-DDTHH:MM:SSZ" is required.
! 
!
!!!!!!!!!!!!!!!!!!!!!!!!!

C Traceback:
In function: _CV_checkParentExpID
! 

!!!!!!!!!!!!!!!!!!!!!!!!!
!
! Warning: Your input attribute "parent_experiment_id" defined as "" will be replaced with 
! "no parent" as defined in your Control Vocabulary file.
! 
!
!!!!!!!!!!!!!!!!!!!!!!!!!

C Traceback:
In function: _CV_CompareNoParent
! called from: _CV_checkParentExpID
! 

!!!!!!!!!!!!!!!!!!!!!!!!!
!
! Warning: Your input attribute branch_method with value "Ensemble Kalman Filter initialization" 
! will be replaced with value "no parent".
! 
!
!!!!!!!!!!!!!!!!!!!!!!!!!

C Traceback:
! In function: 

!!!!!!!!!!!!!!!!!!!!!!!!!
!
! Error: Your filename 
! "tasmax_day_MPI-ESM1-2-LR_dcppA-hindcast_s1975-r2i1p1f1_gn_19751101-19851231.nc" 
! does not match the CMIP6 requirement.
! 
! Your output filename should be: 
! "tasmax_day_MPI-ESM1-2-LR_dcppA-hindcast_s1975-r2i1p1f1_gn_197511-198512.nc"
! 
! and should follow this template: 
!"<variable_id><table><source_id><experiment_id><member_id><grid_label>"
! 
! See your Control Vocabulary file.(Tables/CMIP6_CV.json)
! 
!
!!!!!!!!!!!!!!!!!!!!!!!!!

=====================================================================================
realization_index is not an integer: <class 'str'>
=====================================================================================

=====================================================================================
initialization_index is not an integer: <class 'str'>
=====================================================================================

=====================================================================================
physics_index is not an integer: <class 'str'>
=====================================================================================

=====================================================================================
forcing_index is not an integer: <class 'str'>
=====================================================================================

=====================================================================================
table_id attribute is not consistent: Amon
=====================================================================================

=====================================================================================
Your file contains "cell_methods":"area: time: maximum" and
CMIP6 tables requires "cell_methods":"area: mean time: maximum".
=====================================================================================

=====================================================================================
Your file contains "cell_measures":"areacella" and
CMIP6 tables requires "cell_measures":"area: areacella".
=====================================================================================

└──> :: CV FAIL    :: /Users/mauzey1/Downloads/tasmax_day_MPI-ESM1-2-LR_dcppA-hindcast_s1975-r2i1p1f1_gn_19751101-19851231.nc

Number of files scanned: 1
Number of file with error(s): 1

It does detect that the table name in the global attributes is inconsistent with the one in the file name, although it only shows the global attribute value. It does not detect the inconsistency of the frequency.

The filename test reports an interesting error: it suggests that the output file name should use monthly start and end dates instead of daily ones, but still uses the day table in the name.
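One possible reading of the odd filename suggestion is that the date precision follows the "frequency" attribute ("mon") while the table part of the name comes from elsewhere; that is a guess, not something confirmed here. The sketch below shows the two comparisons side by side. `check_filename` is hypothetical, the assumed date precision (yyyymm for "mon", yyyymmdd for "day") covers only these two frequencies, and files without a time range are not handled.

```python
def check_filename(filename, attrs):
    """List inconsistencies between a CMIP6-style file name and global attributes."""
    problems = []
    stem = filename[:-3] if filename.endswith(".nc") else filename
    parts = stem.split("_")
    table_in_name, time_range = parts[1], parts[-1]
    if table_in_name != attrs["table_id"]:
        problems.append(
            f"table in name is {table_in_name!r} but table_id is {attrs['table_id']!r}"
        )
    # Assumed number of digits per date for each frequency.
    digits = {"mon": 6, "day": 8}.get(attrs["frequency"])
    start = time_range.split("-")[0]
    if digits is not None and len(start) != digits:
        problems.append(
            f"time range {time_range!r} does not look like frequency {attrs['frequency']!r}"
        )
    return problems


name = "tasmax_day_MPI-ESM1-2-LR_dcppA-hindcast_s1975-r2i1p1f1_gn_19751101-19851231.nc"
for problem in check_filename(name, {"table_id": "Amon", "frequency": "mon"}):
    print(problem)
```

With the attributes of the problem file, both comparisons fail, so the frequency mismatch becomes visible alongside the table mismatch.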

taylor13 commented 1 month ago

This would be clearer if we had an ncdump of the global attributes and of the time coordinate values for the tasmax day file giving us problems. @durack1 provided a tas monthly file.

If DWD used CMOR to write the files or PrePARE to check the files it wrote, then it seems to me the most likely explanation for the problem files is that in the CMIP6_day.json file relied on by CMOR/PrePARE, the incorrect frequency was specified (as "mon" rather than "day"), but in the header of that file the correct "approx_interval": "1.00000", (units=day) was specified. I think the CMOR/PrePARE checks would miss the mismatch of the data with the frequency global attribute. Maybe you could rerun PrePARE as in the above example using a "day.json" file in which the frequency for tasmax has incorrectly been set to "mon".

mauzey1 commented 1 month ago

Here's an ncdump of the file header

ncdump -h /Users/mauzey1/Downloads/tasmax_day_MPI-ESM1-2-LR_dcppA-hindcast_s1975-r2i1p1f1_gn_19751101-19851231.nc
netcdf tasmax_day_MPI-ESM1-2-LR_dcppA-hindcast_s1975-r2i1p1f1_gn_19751101-19851231 {
dimensions:
        bnds = 2 ;
        time = UNLIMITED ; // (3714 currently)
        lat = 96 ;
        lon = 192 ;
variables:
        double time_bnds(time, bnds) ;
        double lat_bnds(lat, bnds) ;
        double height ;
                height:long_name = "height" ;
                height:standard_name = "height" ;
                height:units = "m" ;
                height:axis = "Z" ;
                height:positive = "Up" ;
        double lon_bnds(lon, bnds) ;
        double time(time) ;
                time:standard_name = "time" ;
                time:long_name = "time" ;
                time:units = "days since 1850-1-1 00:00:00" ;
                time:calendar = "standard" ;
                time:axis = "T" ;
                time:bounds = "time_bnds" ;
        double lon(lon) ;
                lon:standard_name = "longitude" ;
                lon:long_name = "longitude" ;
                lon:units = "degrees_east" ;
                lon:axis = "X" ;
                lon:bounds = "lon_bnds" ;
        double lat(lat) ;
                lat:standard_name = "latitude" ;
                lat:long_name = "latitude" ;
                lat:units = "degrees_north" ;
                lat:axis = "Y" ;
                lat:bounds = "lat_bnds" ;
        float tasmax(time, lat, lon) ;
                tasmax:_FillValue = 1.e+20f ;
                tasmax:missing_value = 1.e+20f ;
                tasmax:standard_name = "air_temperature" ;
                tasmax:long_name = "Daily Maximum Near-Surface Air Temperature" ;
                tasmax:units = "K" ;
                tasmax:coordinates = "height" ;
                tasmax:comment = "maximum near-surface (usually, 2 meter) air temperature (add cell_method attribute time: max)" ;
                tasmax:cell_methods = "area: time: maximum" ;
                tasmax:cell_measures = "areacella" ;

// global attributes:
                :CDI = "Climate Data Interface version 2.0.3 (https://mpimet.mpg.de/cdi)" ;
                :cdo_openmp_thread_number = 10 ;
                :NCO = "netCDF Operators version 4.9.2 (Homepage = http://nco.sf.net, Code = http://github.com/nco/nco)" ;
                :institute_id = "DWD" ;
                :model_id = "MPI-ESM-LR" ;
                :modeling_realm = "atmos" ;
                :Conventions = "CF-1.7 CMIP-6.2" ;
                :institution = "Deutscher Wetterdienst, Offenbach 63067, Germany" ;
                :frequency = "mon" ;
                :experiment_id = "dcppA-hindcast" ;
                :product = "model-output" ;
                :activity_id = "DCPP" ;
                :branch_method = "Ensemble Kalman Filter initialization" ;
                :contact = "klaus.pankatz@dwd.de" ;
                :creation_date = "01-28-22TJan:58:1643385537Z" ;
                :data_specs_version = "01.00.30" ;
                :experiment = "hindcast initialized based on observations and using historical forcing" ;
                :external_variables = "areacella" ;
                :forcing_index = "1" ;
                :further_info_url = "https://furtherinfo.es-doc.org/CMIP6.DWD.MPI-ESM1-2-LR.dcppA-hindcast.s1975.r2i1p1f1" ;
                :grid = "spectral T63; 192 x 96 longitude/latitude" ;
                :grid_label = "gn" ;
                :initialization_index = "1" ;
                :institution_id = "DWD" ;
                :mip_era = "CMIP6" ;
                :nominal_resolution = "200 km" ;
                :physics_index = "1" ;
                :project_id = "CMIP6" ;
                :realization_index = "2" ;
                :realm = "atmos" ;
                :references = "MPI-ESM: Mauritsen, T. et al. (2019), Developments in the MPI‐M Earth System Model version 1.2 (MPI‐ESM1.2) and Its Response to Increasing CO2, J. Adv. Model. Earth Syst.,11, 998-1038, doi:10.1029/2018MS001400" ;
                :source_id = "MPI-ESM1-2-LR" ;
                :source_type = "AOGCM" ;
                :sub_experiment = "initialized near end of year 1975" ;
                :sub_experiment_id = "s1975" ;
                :table_id = "Amon" ;
                :table_info = "Creation Date:(09 May 2019) MD5:390645ec184a69a5914a3b461e97af48" ;
                :title = "MPI-ESM1-2-LR output prepared for CMIP6" ;
                :tracking_id = "hdl:21.14100/370cb0c0-e2dc-4820-ade2-d9f00a203c7d" ;
                :variable_id = "tasmax" ;
                :variant_label = "r2i1p1f1" ;
                :license = "CMIP6 model data produced by DWD is licensed under a Creative Commons Attribution ShareAlike 4.0 International License (CC-BY-SA 4.0 https://creativecommons.org/licenses). Consult https://pcmdi.llnl.gov/CMIP6/TermsOfUse for terms of use governing CMIP6 output, including citation requirements and proper acknowledgment. Further information about this data, including some limitations, can be found via the further_info_url (recorded as a global attribute in this file) and. The data producers and data providers make no warranty, either express or implied, including, but not limited to, warranties of merchantability and fitness for a particular purpose. All liabilities arising from the supply of the information (including any liability arising in negligence) are excluded to the fullest extent permitted by law." ;
                :cmor_version = "3.5.0" ;
}

mauzey1 commented 1 month ago

Maybe you could rerun PrePARE as in the above example using a "day.json" file in which the frequency for tasmax has incorrectly been set to "mon".

I tried replacing "day" with "mon" for the frequency value of "tasmax" in CMIP6_day.json. However, the output from PrePARE showed no difference from that using the correct CMIP6_day.json.

durack1 commented 1 month ago

@wachsylon can you answer the question whether DWD used CMOR (or a CDO2CMOR wrapped version) to write these data?

durack1 commented 1 month ago

I suggest we ignore this issue. Numerous QC issues suggest that CMOR was not used directly: the file reports cmor_version 3.5.0 although 3.6.0 was released in May 2020 and this file was generated in 2022; the creation_date format is not the one written by CMOR; and there are some other inconsistencies.

// global attributes:
                :CDI = "Climate Data Interface version 2.0.3 (https://mpimet.mpg.de/cdi)" ;
                :cdo_openmp_thread_number = 10 ;
                :NCO = "netCDF Operators version 4.9.2 (Homepage = http://nco.sf.net, Code = http://github.com/nco/nco)" ;
...
                :frequency = "mon" ;
...
                :branch_method = "Ensemble Kalman Filter initialization" ;
                :contact = "klaus.pankatz@dwd.de" ;
                :creation_date = "01-28-22TJan:58:1643385537Z" ;
...
                :cmor_version = "3.5.0" ;
}
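The creation_date point can be checked mechanically against the "YYYY-MM-DDTHH:MM:SSZ" form quoted in the PrePARE output. A minimal sketch, assuming that form is the only accepted one; `is_valid_creation_date` is a hypothetical helper, and `strptime` is somewhat lenient (it also accepts fields without zero padding).

```python
from datetime import datetime


def is_valid_creation_date(value):
    """Return True if value parses as YYYY-MM-DDTHH:MM:SSZ."""
    try:
        datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        return False
    return True


# The two creation_date values that appear in the listings in this thread.
for value in ("2019-12-06T08:17:57Z", "01-28-22TJan:58:1643385537Z"):
    print(value, is_valid_creation_date(value))
```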

I will close this issue. @wachsylon, please reopen if my assumptions are incorrect.