ecmwf / anemoi-training

Apache License 2.0

feat: Allow noncontinuous date ranges in dataloader #129

Open HCookie opened 1 week ago

HCookie commented 1 week ago

Requires https://github.com/ecmwf/anemoi-datasets/pull/118

Allows noncontinuous date ranges in the dataloader config, for example:

training:
  dataset: ${dataloader.dataset}
  ranges:
    - [1970, 1980]
    - [1990, 2020]
  frequency: ${data.frequency}
  drop: []

validation:
  dataset: ${dataloader.dataset}
  ranges:
    - [1981, 1989]
    - [2021, 2021]
  frequency: ${data.frequency}
  drop: []

Closes #128

floriankrb commented 5 days ago

Let's have a look at https://anemoi-datasets.readthedocs.io/en/latest/using/missing.html and have a chat.

How will you handle the last date of the first interval?
You do not want to train on a pair with x_i = 1980.12.31 18:00 and x_{i+1} = 1990.01.01 00:00, since those two states are nine years apart rather than one time step apart.
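
To make the boundary problem concrete, here is a minimal sketch in plain Python. It does not use the anemoi-datasets API; `build_dates` and `valid_start_indices` are hypothetical helpers, and the 6-hour step is an assumption inferred from the 18:00 and 00:00 timestamps above. It concatenates the two training ranges and keeps only the start indices whose window is evenly spaced.

```python
from datetime import datetime, timedelta


def build_dates(ranges, step):
    """Concatenate inclusive year ranges into one flat list of timestamps."""
    dates = []
    for first_year, last_year in ranges:
        current = datetime(first_year, 1, 1)
        stop = datetime(last_year + 1, 1, 1)  # exclusive upper bound
        while current < stop:
            dates.append(current)
            current += step
    return dates


def valid_start_indices(dates, step, n_steps=2):
    """Return indices i where dates[i : i + n_steps] are exactly `step` apart."""
    valid = []
    for i in range(len(dates) - n_steps + 1):
        window = dates[i : i + n_steps]
        if all(b - a == step for a, b in zip(window, window[1:])):
            valid.append(i)
    return valid


step = timedelta(hours=6)  # assumed frequency, not taken from the PR
dates = build_dates([(1970, 1980), (1990, 2020)], step)
valid = set(valid_start_indices(dates, step))

# Position of the last date of the first interval in the concatenated list.
boundary = dates.index(datetime(1980, 12, 31, 18))
print(dates[boundary + 1])  # next entry in the list jumps to the second range
print(boundary in valid)    # the pair that spans the gap is excluded
```

With `n_steps=2` only the single pair that straddles the gap is dropped; a longer rollout window would exclude proportionally more start indices before the boundary.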

In the config, it might turn into this:

training:
  concat:
    - dataset: ${dataloader.dataset}
      start: 1970
      end: 1980
    - dataset: ${dataloader.dataset}
      start: 1990
      end: 2020
  frequency: ${data.frequency}
  how_to_handle_the_date_1980_12_31: 'raise'  # or 'skip', ...
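
As a rough illustration of what the 'raise' and 'skip' options could mean in practice, the sketch below pairs consecutive dates and applies a gap policy. The function name `consecutive_pairs` and the `on_gap` parameter are hypothetical and are not part of anemoi-training or anemoi-datasets; the 6-hour step is likewise an assumption.

```python
from datetime import datetime, timedelta


def consecutive_pairs(dates, step, on_gap="raise"):
    """Pair each date with its successor, applying `on_gap` at discontinuities.

    on_gap="raise": fail as soon as two neighbours are not `step` apart.
    on_gap="skip":  silently drop any pair that spans a gap.
    """
    if on_gap not in ("raise", "skip"):
        raise ValueError(f"unknown gap policy: {on_gap!r}")
    pairs = []
    for a, b in zip(dates, dates[1:]):
        if b - a == step:
            pairs.append((a, b))
        elif on_gap == "raise":
            raise ValueError(f"gap between {a} and {b}")
        # on_gap == "skip": drop the pair and carry on
    return pairs


step = timedelta(hours=6)  # assumed frequency
dates = [
    datetime(1980, 12, 31, 12),
    datetime(1980, 12, 31, 18),  # last date of the first interval
    datetime(1990, 1, 1, 0),     # first date of the second interval
    datetime(1990, 1, 1, 6),
]

pairs = consecutive_pairs(dates, step, on_gap="skip")
print(len(pairs))  # the pair across the 1980/1990 boundary is not included
```

Calling the same function with `on_gap="raise"` on this list raises a `ValueError` at the 1980/1990 boundary, which would surface the problem at configuration time instead of dropping samples quietly.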