AI4S2S / s2spy

A high-level python package integrating expert knowledge and artificial intelligence to boost (sub) seasonal forecasting
https://ai4s2s.readthedocs.io/
Apache License 2.0
20 stars 7 forks source link

Adding a `n` kwarg to `Calendar.add_interval` #147

Closed BSchilperoort closed 1 year ago

BSchilperoort commented 1 year ago

While writing the demo notebook, and when trying the new calendar, I often found myself writing a for loop to add multiple intervals.

As writing for loops for this seems a bit silly, we could just wrap the looping into a method. E.g.:

cal = Calendar(anchor="Jan")
cal.add_intervals("precursor", "1W", n=10)  # Add 10 of these intervals

I see two options:

  1. A new method add_intervals, which internally will just run a for _ in range(n): loop and call add_interval.
  2. Modifying add_interval to accept n as a kwarg, with the default value of 1

I am leaning towards option 2 myself.

Peter9192 commented 1 year ago

I think I prefer option 1. An additional option could then be to fill one full year by default unless n is given.

jannesvaningen commented 1 year ago

I prefer option 2. It feels intuitive for me to have a function add_intervals which by default adds only one interval but adds more if you want to. If we add it the way Bart suggested, the calendar shifter (see my issue mention) doesn't need a function with lots of arguments to create calendars with 8 precursor periods for example. I think setting the default to a full year is not very common in s2s forecasting and you would also need to distinguish between precursor periods and target periods.

BSchilperoort commented 1 year ago

I do feel like it would be better to not fill in an entire year by default. This will force the user to think more about which periods are interesting, and how many would be needed. Needlessly adding many intervals could lead to behavior such as:

cal = Calendar(anchor="12-31")
cal.add_intervals("precursor", "5d")  # Defaults to filling entire year. I.e. 60 intervals
data_resampled = resample(cal, data)
...
data_resampled.sel(i_interval=slice(0,10))
semvijverberg commented 1 year ago

I'm also in favor of options 2, feel more intuitive and is computationally more efficient.

Peter9192 commented 1 year ago

Would it make sense to go with option 2 but renaming it to "add_intervals" then (with the s at the end, defaulting to n=1)?