AI4S2S / s2spy

A high-level python package integrating expert knowledge and artificial intelligence to boost (sub) seasonal forecasting
https://ai4s2s.readthedocs.io/
Apache License 2.0

s2spy `Pipeline` design #62

Open BSchilperoort opened 2 years ago

BSchilperoort commented 2 years ago

To support an sklearn-style Pipeline (or sklearn's Pipeline itself) in which all steps can be integrated, every fit or transformation applied to the input data needs to be implemented as a class with `fit` and `transform` methods.
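As a minimal sketch of what such a class could look like, the example below follows the scikit-learn estimator convention. `RemoveMean` is a hypothetical, deliberately trivial step invented for illustration and is not part of s2spy; the arrays are made-up toy data.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class RemoveMean(BaseEstimator, TransformerMixin):
    """Hypothetical, deliberately trivial step; not part of s2spy."""

    def fit(self, X, y=None):
        # Learn the state (here: column means) from the training data only.
        self.mean_ = np.asarray(X, dtype=float).mean(axis=0)
        return self  # sklearn convention: fit returns the estimator itself

    def transform(self, X):
        # Apply the stored state to any data, including unseen data.
        return np.asarray(X, dtype=float) - self.mean_


# Toy data, for illustration only.
train = np.array([[1.0, 10.0], [3.0, 30.0]])
test = np.array([[2.0, 20.0]])

step = RemoveMean()
train_out = step.fit_transform(train)  # from TransformerMixin: fit, then transform
test_out = step.transform(test)        # reuses the mean learned from train
```

Under this convention `fit` returns the estimator rather than the data, so the call that returns transformed data would be `fit_transform` (or `fit` followed by `transform`).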

All individual steps:

resampler = Resample(calendar)
data = resampler.fit(data) # or .transform()?

detrender = Detrend(method='linear')
data = detrender.fit(data)

reducer = RGDR(series=timeseries, lag=10)
data = reducer.fit(data)

model = LinearSVC()
fit_model = model.fit(data)

These steps in a pipeline:

pipe = Pipeline([
    ('resample', resampler),
    ('detrend', detrender),
    ('apply rgdr', reducer),
    ('fit svc', model)
])
fit_model = pipe.fit(data)

Peter9192 commented 2 years ago

It might be good to already think about the corresponding transform methods as well.
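To illustrate why the transform methods matter, here is a rough sketch of how scikit-learn's own `Pipeline` uses them. `StandardScaler` and `PCA` are only stand-ins for the preprocessing and dimensionality reduction steps discussed above (they are not the proposed s2spy classes), and the data is synthetic.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic data and labels, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
y = (X[:, 0] > 0).astype(int)
X_train, X_test = X[:40], X[40:]
y_train = y[:40]

pipe = Pipeline([
    ("scale", StandardScaler()),      # stand-in for a step such as detrending
    ("reduce", PCA(n_components=2)),  # stand-in for a step such as RGDR
    ("svc", LinearSVC()),
])

pipe.fit(X_train, y_train)   # intermediate steps: fit_transform; final step: fit
pred = pipe.predict(X_test)  # intermediate steps: transform only, with fitted state

# The fitted transformers can also be applied without the final model.
reduced = pipe[:-1].transform(X_test)
```

At prediction time only `transform` is called on the intermediate steps, so each step has to store whatever it learned during `fit` in order to apply it to unseen data.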