Closed ThibHlln closed 3 years ago
Thank you @rich-HJ
I wonder whether we need to implement tighter time checks for 'climatologic' kind of data? At the moment, it checks for the right number of values available along the time dimension, but nothing else.
For example, when 'frequency': 'seasonal', it will check for 4 values available, but it does not check if it is MAM-JJA-SON-DJF, or another order. Likewise, when 'frequency': 'monthly', it will check for 12 values, but it does not check if a calendar year, a meteorological year, or a hydrological year is considered.
The alternative to tighter checks would be to choose a standard (i.e. default order/start of the year) for seasonal/monthly for HJ, and document it as a requirement somewhere. Maybe this is a non-problem and datasets out there are always following the same order for the seasons, and the same start for a year of climatology? As far as I could see in the CF-conventions, nothing is enforced in that regard, but since 'time' and 'time_bounds' are required for the climatology data, they do not need a standard.
I think it is fine to assume seasonal follows the meteorological definition: DJF, MAM, JJA and SON. If people want a different definition, they should have the expertise to implement it.
As for the start of the year: do we need to allow meteorological and hydrological years? I would stay with the calendar year to begin with and add options if there is great demand.
This sounds reasonable to me (i.e. expecting meteorological seasons and calendar year).
I am going to document that in the docstring of Component for the argument dataset.
Another question.
To support other frequencies (e.g. the MODIS 10-day LAI), I've added support for a datetime.timedelta in frequency. This infers the length of the time dimension by using the floor division of 366 days by this timedelta (giving the number of full sub-periods of length timedelta), and then by adding one if the remainder of the division is not 0 (to cover the last sub-period of length less than timedelta).
But this whole process assumes a 'gregorian' calendar (because this is what datetime is based on). But the TimeDomain of the component could be in another calendar, which is not very consistent.
So maybe asking for an integer in place of a timedelta is better? For example, the expected length for timedelta(days=10) should be 37 in a gregorian calendar, but only 36 in a 360-day calendar.
Not sure what is best here, and what we should support.
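For reference, the arithmetic described above can be sketched as follows. This is only an illustration: the function name n_subperiods and the days_in_year parameter are hypothetical and not part of the framework.

```python
from datetime import timedelta


def n_subperiods(step, days_in_year=366):
    """Number of sub-periods of length `step` needed to cover one year.

    `days_in_year` is an assumption made for illustration: 366 mirrors
    the gregorian case discussed above, 360 a 360-day calendar.
    """
    # divmod on two timedeltas returns (int quotient, timedelta remainder)
    full, remainder = divmod(timedelta(days=days_in_year), step)
    # a non-zero remainder means one extra, shorter sub-period at the end
    return full + (1 if remainder else 0)


print(n_subperiods(timedelta(days=10)))                    # 37
print(n_subperiods(timedelta(days=10), days_in_year=360))  # 36
```

The same timedelta gives a different expected length depending on the calendar, which is the inconsistency raised here.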
I dropped the support for timedelta for now. I replaced it with support for an integer value if 'seasonal', 'monthly', or 'day_of_year' are not enough. The framework will check that the length of the time dimension in the dataset corresponds to this integer value.
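A minimal sketch of how the expected length could be resolved from 'frequency', assuming a hypothetical helper expected_time_length. The values 4 and 12 come from the discussion above; 366 for 'day_of_year' is an assumption made here.

```python
def expected_time_length(frequency):
    """Return the expected length of the time dimension for a frequency.

    Hypothetical helper, not the framework's API.
    """
    named = {
        'seasonal': 4,
        'monthly': 12,
        'day_of_year': 366,  # assumption: leap day included
    }
    if isinstance(frequency, str):
        if frequency not in named:
            raise ValueError(f"unsupported frequency: {frequency!r}")
        return named[frequency]
    # bool is a subclass of int, so exclude it explicitly
    if isinstance(frequency, int) and not isinstance(frequency, bool):
        if frequency < 1:
            raise ValueError("integer frequency must be positive")
        return frequency
    raise TypeError("frequency must be a str or an int")


print(expected_time_length('seasonal'))  # 4
print(expected_time_length(37))          # 37
```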
resolve #7
The definition of a Component used to distinguish between 'driving_data' and 'ancillary_data', but this distinction was rather ambiguous (where would climatology data fit in?), community-specific (mostly UM world?), and limited (no time dimension allowed for ancillary).
A component is now defined by just one item 'inputs' for the data given to it. Each input must be given a 'units' metadata (as was already the case) and a 'kind' metadata (newly added). The 'kind' can be:
- 'dynamic': data given over the component's SpaceDomain, and for every time step of the component's TimeDomain (i.e. both time and space dimensions are expected for the data array)
- 'static': data given over the component's SpaceDomain (i.e. only space dimensions are expected for the data array)
- 'climatologic': data given over the component's SpaceDomain, and for a given number of sub-periods in a year period (i.e. both time and space dimensions are expected for the data array, but the length of the time dimension is not equal to the number of time steps); the sub-periods are defined in an additional 'frequency' metadata, e.g. 'seasonal', 'monthly', 'day_of_year', timedelta(days=7), etc.
The definition of a component's inputs would look like this:
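As a rough illustration only: the variable name inputs_info, the input names, the units, and the helper validate_inputs_info below are all hypothetical, and the exact layout used by the framework may differ. Only the keys 'units', 'kind' and 'frequency', and the three kind names, come from the description above.

```python
KINDS = ('dynamic', 'static', 'climatologic')

# hypothetical inputs for an imaginary component
inputs_info = {
    'rainfall': {'units': 'kg m-2 s-1', 'kind': 'dynamic'},
    'soil_depth': {'units': 'm', 'kind': 'static'},
    'leaf_area_index': {
        'units': '1',
        'kind': 'climatologic',
        'frequency': 'monthly',
    },
}


def validate_inputs_info(info):
    """Check that each input carries the metadata described above."""
    for name, meta in info.items():
        if 'units' not in meta:
            raise ValueError(f"'units' missing for input {name!r}")
        if meta.get('kind') not in KINDS:
            raise ValueError(f"invalid 'kind' for input {name!r}")
        # only 'climatologic' inputs need the additional 'frequency'
        if meta['kind'] == 'climatologic' and 'frequency' not in meta:
            raise ValueError(f"'frequency' missing for input {name!r}")
    return True


validate_inputs_info(inputs_info)
```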
The distinction of 'inputs' into kinds allows for some checks on the compatibility between the data given and what the component needs. For 'dynamic' a full space and time check can be done, for 'static' a space check can be done (a time dimension may or may not exist, but if it does, it must be of size one), and for 'climatologic' a space check can be done alongside a check on the length of the time dimension compared to the expectation.
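The time-related part of these checks could be sketched as below. This is a simplified illustration under stated assumptions: the function check_time_axis and its parameters are hypothetical, a missing time dimension is represented by None, and the space check is left out.

```python
def check_time_axis(kind, time_length, n_timesteps=None, expected_length=None):
    """Return True if the time dimension is compatible with the input kind.

    time_length: length of the time dimension in the data, or None if
        the data array has no time dimension.
    n_timesteps: number of time steps in the component's time domain
        (only used for 'dynamic').
    expected_length: expected number of sub-periods in a year
        (only used for 'climatologic').
    """
    if kind == 'dynamic':
        # one value per time step is needed
        return time_length is not None and time_length == n_timesteps
    if kind == 'static':
        # time dimension is optional, but must be of size one if present
        return time_length is None or time_length == 1
    if kind == 'climatologic':
        # length must match the expectation derived from 'frequency'
        return time_length is not None and time_length == expected_length
    raise ValueError(f"unknown kind: {kind!r}")


print(check_time_axis('static', None))                          # True
print(check_time_axis('climatologic', 12, expected_length=12))  # True
print(check_time_axis('dynamic', 12, n_timesteps=365))          # False
```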