Open leifdenby opened 10 months ago
As long as the behaviour is the same, no problem.
It would change how dates, datetimes and durations are provided, so it depends on what you mean by "behaviour".
From the README on specifying date ranges it currently says:
The following are equivalent way of describing start or end:
2020 and "2020"
202306, "202306" and "2023-06"
20200301, "20200301" and "2020-03-01"
I don't think it is a good idea to support so many variations on how to format dates. For example, if someone set `start=201009` it would be ambiguous from just reading that input whether this refers to September 2010 or Oct 9th 2020. This is the kind of ambiguity ISO8601 was created to avoid.
Also, having so many variations means that another application needs to implement a lot of logic to be able to parse the config file. I can for example imagine wanting to use these values in javascript (a frontend that visualises the date ranges used for a dataset). ISO8601 is supported in many different javascript libraries (for example momentjs).
Finally, I think the current way `start` and `end` can be provided, for example by only giving "2020" (I assume this is the year), implies the start of the year when used for `start`, but the end of the year when used for `end`. Maybe this isn't the current behaviour, but I think we should add it to the README if that is how it's intended to work.
So what I suggest is that we deprecate the current format and only support strings formatted according to ISO8601. That would mean the README would instead read:
The `start` and `end` arguments for specifying time-spans should be provided as ISO8601 formatted strings, e.g.
YYYY e.g. "2020" (will imply start of year when used for `start` and end of year when used for `end`)
YYYY-MM e.g. "2023-06" (will imply start of month when used for `start` and end of month when used for `end`)
YYYY-MM-DD e.g. "2020-03-01" (will imply start of day when used for `start` and end of day when used for `end`)
YYYY-MM-DDTHH:MM:SSZ e.g. "2024-01-26T10:32:57Z"
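The start/end semantics proposed above could be sketched roughly as follows. This is only an illustration using the Python standard library, not the package's actual code: the function name `resolve_bound` is hypothetical, and it assumes that partial dates are interpreted in UTC and that "end of" means 23:59:59 on the last day.

```python
import calendar
import re
from datetime import datetime, timezone

# Matches "YYYY", "YYYY-MM" and "YYYY-MM-DD"
_PARTIAL = re.compile(r"^(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?$")


def resolve_bound(value: str, is_end: bool) -> datetime:
    """Expand an ISO 8601 string into a concrete datetime (hypothetical helper)."""
    match = _PARTIAL.match(value)
    if match is None:
        # Assume a full datetime such as "2024-01-26T10:32:57Z";
        # older Python versions do not accept the "Z" suffix directly.
        return datetime.fromisoformat(value.replace("Z", "+00:00"))

    year = int(match.group(1))
    month_s, day_s = match.group(2), match.group(3)

    if is_end:
        # Missing parts default to the last month / last day / last second
        month = int(month_s) if month_s else 12
        day = int(day_s) if day_s else calendar.monthrange(year, month)[1]
        return datetime(year, month, day, 23, 59, 59, tzinfo=timezone.utc)

    # Missing parts default to the first month / first day / midnight
    month = int(month_s) if month_s else 1
    day = int(day_s) if day_s else 1
    return datetime(year, month, day, tzinfo=timezone.utc)
```

With this sketch, `resolve_bound("2023-06", is_end=True)` gives 30 June 2023 at 23:59:59 UTC, while the same string used as a start gives 1 June 2023 at midnight.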
This will also change how frequencies should be given. In the ISO 8601 standard these are written with a `P` prefix, e.g. `PT10H` is 10 hours, `PT7M` is 7 minutes, `P7D` is 7 days. https://en.wikipedia.org/wiki/ISO_8601#Durations
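To make the duration notation concrete, here is a minimal standard-library sketch that converts a subset of ISO 8601 durations into a `timedelta`. The function name `parse_duration` is an assumption for illustration, and years and months are deliberately left out because they do not have a fixed length; a dedicated library such as isodate would cover the full standard.

```python
import re
from datetime import timedelta

# Subset of ISO 8601 durations: weeks, days, hours, minutes, seconds
_DURATION = re.compile(
    r"^P(?:(?P<weeks>\d+)W)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text: str) -> timedelta:
    """Convert e.g. "PT10H" or "P7D" to a timedelta (hypothetical helper)."""
    match = _DURATION.match(text)
    # Reject non-matching input and a dangling "T" with no time components
    if match is None or text.endswith("T"):
        raise ValueError(f"unsupported ISO 8601 duration: {text!r}")
    parts = {name: int(v) for name, v in match.groupdict().items() if v is not None}
    if not parts:
        # A bare "P" carries no components at all
        raise ValueError(f"empty ISO 8601 duration: {text!r}")
    return timedelta(**parts)
```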
You are welcome to implement the changes. Just make sure that they are backwards compatible for now, so that our current software stack still works. It is just a matter of accepting both types of input.
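One way of accepting both types of input would be to normalise the legacy forms listed in the README into ISO8601 strings before parsing. The sketch below is only a guess at how that could look: `normalise_date` is a hypothetical name, and it assumes that an all-digit value of length 4, 6 or 8 always means YYYY, YYYYMM or YYYYMMDD, as the current README describes.

```python
def normalise_date(value) -> str:
    """Map legacy inputs (2020, 202306, "20200301") to ISO 8601 strings."""
    text = str(value)
    if text.isascii() and text.isdigit():
        if len(text) == 4:  # YYYY is already valid ISO 8601
            return text
        if len(text) == 6:  # YYYYMM -> YYYY-MM
            return f"{text[:4]}-{text[4:]}"
        if len(text) == 8:  # YYYYMMDD -> YYYY-MM-DD
            return f"{text[:4]}-{text[4:6]}-{text[6:]}"
        raise ValueError(f"cannot interpret date: {value!r}")
    # Anything else is assumed to be ISO 8601 already and is passed through
    return text
```

The legacy branch could later emit a deprecation warning, so existing configs keep working while users are nudged towards the ISO8601 form.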
I would like to suggest that, rather than using a customised date-related object serialisation, `ecml-tools` adopts the ISO8601 standard. There is a really nice Python package, isodate, for parsing these, and adopting the standard makes it much easier to build other tools that can work with the same configuration files etc. Also, it avoids users having to try to understand a new format just to use the package.