FPS-URB-RCC / STAGE-0_Analysis

Share code and exchange on the joint analysis for the STAGE-0 simulations

Inconsistent time splitting into files #3

Open jesusff opened 3 months ago

jesusff commented 3 months ago

Collected from Michal's presentation on the Stage-0 evaluation:

Files are split inconsistently in time: some models provide monthly files, others a single file for the entire simulation period.
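A quick way to spot which datasets are split is to group the file names by everything except the trailing time range. The snippet below is only a sketch: `split_datasets` is a hypothetical helper, and it assumes the time range is the last underscore-separated element of the file name, as in CORDEX-style names.

```python
from collections import defaultdict


def split_datasets(filenames):
    """Return {dataset: [time ranges]} for datasets spread over several files.

    Assumes names like <dataset>_<StartTime>-<EndTime>.nc, where the time
    range is the last underscore-separated element.
    """
    groups = defaultdict(list)
    for name in filenames:
        stem = name.strip()
        if stem.endswith(".nc"):
            stem = stem[:-3]
        # Everything before the last underscore identifies the dataset.
        dataset, _, span = stem.rpartition("_")
        groups[dataset].append(span)
    # Keep only datasets that come in more than one piece.
    return {d: sorted(s) for d, s in groups.items() if len(s) > 1}
```

Run over a server file listing, this would return the datasets delivered in several pieces, together with the time range of each piece.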

jesusff commented 3 weeks ago

I'd say this should go in a single piece. In CORDEX, the minimum split of files is yearly (for 1hr and 6hr files), so here (4 months) we can have all data in a single file, regardless of frequency. In the stage-0 protocol, there is no explicit mention of the file granularity, but the examples show that data should be in one piece:

tasmax_EUR-12_ERA5_evaluation_r1i1p1f1_GERICS_REMO2020-iMOVE_v1-fpsurbrcc-s0r1_day_20200501-20200831.nc
pr_PARIS-3_ERA5_evaluation_r1i1p1f1_UCAN_WRF451R-CTRL_v1-fpsurbrcc-s0r1_1hr_202005010030-202008312330.nc 

For an instantaneous variable such as tas, we could include the last time step to have the full period covered:

tas_PARIS-3_ERA5_evaluation_r1i1p1f1_UCAN_WRF451R-CTRL_v1-fpsurbrcc-s0r1_1hr_202005010000-202009010000.nc 
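To make the convention explicit, here is a minimal sketch that builds the time-range element for the three cases above. `time_range` is a hypothetical helper, not part of any protocol tooling. It assumes that averaged hourly data are stamped at the centre of each interval (as in the pr example) and that instantaneous data include both end points of the period.

```python
from datetime import datetime, timedelta


def time_range(start, end, freq, instantaneous=False):
    """Build the StartTime-EndTime element of a file name.

    start, end: boundaries of the simulated period (end is the instant at
    which the period finishes, e.g. 2020-09-01 00:00).
    """
    if freq == "day":
        # Daily files: the last record belongs to the day before `end`.
        last = end - timedelta(days=1)
        return f"{start:%Y%m%d}-{last:%Y%m%d}"
    if freq == "1hr":
        step = timedelta(hours=1)
        if instantaneous:
            # Assumption: include the final time step to cover the full period.
            first, last = start, end
        else:
            # Assumption: averages are stamped at the centre of each interval.
            first, last = start + step / 2, end - step / 2
        return f"{first:%Y%m%d%H%M}-{last:%Y%m%d%H%M}"
    raise ValueError(f"frequency not handled in this sketch: {freq}")


start, end = datetime(2020, 5, 1), datetime(2020, 9, 1)
print(time_range(start, end, "day"))                      # 20200501-20200831
print(time_range(start, end, "1hr"))                      # 202005010030-202008312330
print(time_range(start, end, "1hr", instantaneous=True))  # 202005010000-202009010000
```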

jesusff commented 3 weeks ago

Currently there are no simulations following the pattern above for tas and there is just one (SMHI) using the choice 202005010000-202008312300: https://github.com/FPS-URB-RCC/STAGE-0_Analysis/blob/0265a89106bc1852d9a805bbc8564bafc184ba27/list_server_files__tas_1hr.txt#L76

jesusff commented 3 weeks ago

Related to the time splitting is the specification of the start and end dates. There is no ambiguity here: each frequency has its own date precision.

StartTime and EndTime (structured form) indicate the time span of the file content. The format is YYYY[MM[DD[hhmm]]], i.e. the year is represented by 4 digits, while the month, day, hour, and minutes are represented by exactly 2 digits, if they are present at all (monthly output - YYYYMM, daily – YYYYMMDD, sub-daily - YYYYMMDDhhmm). The StartTime and EndTime of sub-daily instantaneous and average data are based on the time values of the first and last record in the file. The two dates are separated by a dash. All time stamps refer to UTC. Constant fields (frequency=fx) do not have the StartTime-EndTime element in their file names.

In particular, no YYYYMMDDhh is allowed. Sub-daily files must include the minutes. Daily files must not include the hours.
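A rough check of this rule could look like the sketch below. `time_range_is_valid` and `expected_digits` are hypothetical helpers. The sketch assumes the frequency is the second-to-last underscore-separated element of the file name, and it assumes the frequency labels `mon`, `day`, `1hr`, `3hr`, `6hr` and `fx`. It only checks the number of digits, not whether the dates themselves are valid.

```python
import re

# Assumed sub-daily frequency labels; extend as needed.
SUBDAILY = {"1hr", "3hr", "6hr"}


def expected_digits(freq):
    """Number of digits in StartTime/EndTime for a given frequency."""
    if freq == "mon":
        return 6   # YYYYMM
    if freq == "day":
        return 8   # YYYYMMDD
    if freq in SUBDAILY:
        return 12  # YYYYMMDDhhmm
    return None


def time_range_is_valid(filename):
    """Check that the time range precision matches the file's frequency."""
    name = filename.strip()
    if name.endswith(".nc"):
        name = name[:-3]
    parts = name.split("_")
    if parts[-1] == "fx":
        # Constant fields carry no StartTime-EndTime element.
        return True
    if len(parts) < 2:
        return False
    freq, span = parts[-2], parts[-1]
    n = expected_digits(freq)
    if n is None:
        return False
    return re.fullmatch(rf"\d{{{n}}}-\d{{{n}}}", span) is not None
```

With these assumptions, both example file names from the protocol pass, while a sub-daily file ending in `_1hr_2020050100-2020090100.nc` is rejected.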