microsoft / FLAML

A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.
https://microsoft.github.io/FLAML/
MIT License

Time Series cross-validator #350

Closed. kurucan closed this issue 2 years ago

kurucan commented 2 years ago

Is there support for sklearn.model_selection.TimeSeriesSplit?

https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.TimeSeriesSplit.html

sonichi commented 2 years ago

Yes, you can set split_type="time". https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML#data-split-method

kurucan commented 2 years ago

Thanks, but sklearn's TimeSeriesSplit has additional parameters such as "gap". I do not see these parameters in the FLAML implementation.

sklearn time series splitter: class sklearn.model_selection.TimeSeriesSplit(n_splits=5, *, max_train_size=None, test_size=None, gap=0)
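For reference, a minimal sketch of what "gap" does in scikit-learn. The toy array of 12 samples and the values n_splits=3, test_size=2, gap=1 are chosen only for illustration; with gap=1, the one sample just before each test window is excluded from training.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(12).reshape(-1, 1)  # toy data: 12 time-ordered samples

# gap=1 leaves one sample out between each training window and its test window
tscv = TimeSeriesSplit(n_splits=3, test_size=2, gap=1)
splits = [(train.tolist(), test.tolist()) for train, test in tscv.split(X)]

for train, test in splits:
    print("train:", train, "test:", test)
```

Each training window ends two positions before its test window starts, so index 5 is skipped in the first fold, index 7 in the second, and index 9 in the third.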

From https://github.com/microsoft/FLAML/blob/main/flaml/automl.py:

    elif self._split_type == "time":
        logger.info("Using TimeSeriesSplit")
        if self._state.task == TS_FORECAST:
            period = self._state.fit_kwargs["period"]
            if period * (n_splits + 1) > y_train_all.size:
                n_splits = int(y_train_all.size / period - 1)
                assert n_splits >= 2, (
                    f"cross validation for forecasting period={period}"
                    f" requires input data with at least {3 * period} examples."
                )
                logger.info(f"Using nsplits={n_splits} due to data size limit.")
            self._state.kf = TimeSeriesSplit(n_splits=n_splits, test_size=period)
        else:
            self._state.kf = TimeSeriesSplit(n_splits=n_splits)  # <-- only n_splits is passed
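
The data-size check in the forecasting branch above can be tried in isolation. This is only a sketch: adjust_n_splits is a hypothetical helper name, not part of FLAML, and the sample counts are made up for illustration.

```python
def adjust_n_splits(n_samples: int, period: int, n_splits: int) -> int:
    """Mirror the quoted check: shrink n_splits when the series is too short."""
    if period * (n_splits + 1) > n_samples:
        n_splits = int(n_samples / period - 1)
        assert n_splits >= 2, (
            f"cross validation for forecasting period={period}"
            f" requires input data with at least {3 * period} examples."
        )
    return n_splits


# 12 * (5 + 1) = 72 > 50, so n_splits becomes int(50 / 12 - 1) = 3
print(adjust_n_splits(n_samples=50, period=12, n_splits=5))
# 72 <= 100, so n_splits stays 5
print(adjust_n_splits(n_samples=100, period=12, n_splits=5))
```

With fewer than 3 * period samples the assertion fails, which matches the error message in the quoted code.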

sonichi commented 2 years ago

@kurucan You can use a custom splitter with your desired "gap" and other parameters; see https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML#data-split-method. You are welcome to add a test case like the one in https://github.com/microsoft/FLAML/blob/2f5d6169d3b5cc025eb2516cbd003fced924a88e/test/automl/test_split.py#L158
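
A rough sketch of what that could look like, assuming (as the linked docs section describes) that split_type accepts a scikit-learn splitter object when eval_method="cv". The helper name fit_with_gap, the regression task, and the values for n_splits, gap and time_budget are illustrative assumptions, not taken from this thread.

```python
from sklearn.model_selection import TimeSeriesSplit

# Splitter with the desired gap; n_splits and gap values are illustrative.
gap_splitter = TimeSeriesSplit(n_splits=5, gap=2)


def fit_with_gap(X_train, y_train, time_budget=10):
    """Hypothetical helper: run FLAML with the custom splitter above."""
    # Imported lazily so the splitter can be inspected without FLAML installed.
    from flaml import AutoML

    automl = AutoML()
    automl.fit(
        X_train=X_train,
        y_train=y_train,
        task="regression",        # assumption: a regression task
        eval_method="cv",         # assumption: custom splitters are used with cross-validation
        split_type=gap_splitter,  # splitter object instead of the string "time"
        time_budget=time_budget,  # seconds; illustrative value
    )
    return automl
```

The rows of X_train and y_train need to be in time order already, since TimeSeriesSplit splits by position.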

@slhuang

kurucan commented 2 years ago

Many thanks! It works.