AI4S2S / s2spy

A high-level python package integrating expert knowledge and artificial intelligence to boost (sub) seasonal forecasting
https://ai4s2s.readthedocs.io/
Apache License 2.0
20 stars 7 forks source link

Refactoring of the calendar system using the CustomCalendar #113

Closed BSchilperoort closed 1 year ago

BSchilperoort commented 2 years ago

When the CustomCalendar refactoring is finished this branch can be merged into main.


Tasks:


closes #118, #114, #112, #100

adds functionality to be able to solve #47, #78

Before merging into main, the calendar_development branch should be removed from the CI in: build.yml, documentation.yml, and sonarcloud.yml.


Notes for the reviewer:

review-notebook-app[bot] commented 2 years ago

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

BSchilperoort commented 1 year ago

Note: before merging into main we should remove the calendar_development branch from the CI tests, in: build.yml, documentation.yml, and sonarcloud.yml.

BSchilperoort commented 1 year ago

Hi Peter, I've tackled some of your comments, but have some replies to some others:


Calendar classes

  • [ ] Why not make CustomCalendar --> Calendar?

We could rename it, and advertise it as the main calendar, while showing the Advent/Weekly/Monthly calendars as quick shorthands.

  • [ ] Could Weekly and Monthly calendar inherit from (Custom)Calendar?

Only if we would implement add_interval for them. Which is not my preference (see later answer).

  • [ ] Are there special methods on weekly/monthly calendar etc, or could they just be different constructors for a single calendar class.

Yes, they differ in how they are represented (e.g. print out month names or week numbers in the dataframe when calling .show()), and require a different implementation of _get_nintervals. I.e. they differ more than just in the constructor.

  • [ ] It might be nice to be able to start with a weekly calendar and then add/remove periods using the same api as the (custom) calendar

While it could be possible to implement add_interval for the Advent/Weekly/Monthly calendars, it would make the code more complex and messy, while being of little added value to the users (IMO). Currently these calendars try to fill an entire year, which means that you first have to change that (using calendar.set_max_lag), before you can append where you want. CustomCalendar supports appending weeks and months as well, e.g.:

calendar = CustomCalendar("W40-4")
calendar.add_interval("target", "1W")
[calendar.add_interval("precursor", "1W") for _ in range(10)]

is equivalent to;

calendar = WeeklyCalendar("W40-4")
calendar.set_max_lag(10)

__repr__s

  • [ ] n_targets not updated in customcalendar when adding a target period. Make it a @property?

n_targets could be a (derived) property for the CustomCalendar, but not for the other calendars. In the other calendars it is set by the user, and then used to determine how many target intervals should be generated.

  • [ ] repr of WeeklyCalendar should have it's freq arg in quotes: freq='1W' instead of freq=1W
  • [ ] The repr of the Calendar (in general) doesn't include the anchor.
  • [ ] The repr doesn't inlcude the mapped years, so you cannot reconstuct it

I will fix these.

  • [ ] Could it be possible to reconstruct a calendar by passing in a list of blocks directly?

Not now I changed add_interval. It could be possible to generate multiple intervals, but this is currently not supported. For example:

calendar.add_interval("precursor", length=["10d"]*10)

Misc

That would be a nice idea for the future (when the calendar is more set in stone), although I feel like saving the code to generate the calendar is easier and more reproducible.

Peter9192 commented 1 year ago

Thanks for your response Bart. I think I can agree with most of your points. Just wanted to further explain that the reason I'm so keen on serializing or reconstructing a calendar with a list of blocks, is that it is similar in pandas. There are some really nice shorthand constructors like pd.date_range, but you can always reproduce the resulting Index object from its repr:

In [1]: import pandas as pd

In [2]: pd.date_range('20220101', '20220131')
Out[2]:
DatetimeIndex(['2022-01-01', '2022-01-02', '2022-01-03', '2022-01-04',
               '2022-01-05', '2022-01-06', '2022-01-07', '2022-01-08',
               '2022-01-09', '2022-01-10', '2022-01-11', '2022-01-12',
               '2022-01-13', '2022-01-14', '2022-01-15', '2022-01-16',
               '2022-01-17', '2022-01-18', '2022-01-19', '2022-01-20',
               '2022-01-21', '2022-01-22', '2022-01-23', '2022-01-24',
               '2022-01-25', '2022-01-26', '2022-01-27', '2022-01-28',
               '2022-01-29', '2022-01-30', '2022-01-31'],
              dtype='datetime64[ns]', freq='D')

In [3]: pd.DatetimeIndex(['2022-01-01', '2022-01-02', '2022-01-03', '2022-01-04',
   ...:                '2022-01-05', '2022-01-06', '2022-01-07', '2022-01-08',
   ...:                '2022-01-09', '2022-01-10', '2022-01-11', '2022-01-12',
   ...:                '2022-01-13', '2022-01-14', '2022-01-15', '2022-01-16',
   ...:                '2022-01-17', '2022-01-18', '2022-01-19', '2022-01-20',
   ...:                '2022-01-21', '2022-01-22', '2022-01-23', '2022-01-24',
   ...:                '2022-01-25', '2022-01-26', '2022-01-27', '2022-01-28',
   ...:                '2022-01-29', '2022-01-30', '2022-01-31'],
   ...:               dtype='datetime64[ns]', freq='D')
Out[3]:
DatetimeIndex(['2022-01-01', '2022-01-02', '2022-01-03', '2022-01-04',
               '2022-01-05', '2022-01-06', '2022-01-07', '2022-01-08',
               '2022-01-09', '2022-01-10', '2022-01-11', '2022-01-12',
               '2022-01-13', '2022-01-14', '2022-01-15', '2022-01-16',
               '2022-01-17', '2022-01-18', '2022-01-19', '2022-01-20',
               '2022-01-21', '2022-01-22', '2022-01-23', '2022-01-24',
               '2022-01-25', '2022-01-26', '2022-01-27', '2022-01-28',
               '2022-01-29', '2022-01-30', '2022-01-31'],
              dtype='datetime64[ns]', freq='D')
sonarcloud[bot] commented 1 year ago

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

93.6% 93.6% Coverage
0.0% 0.0% Duplication