unit8co / darts

A python library for user-friendly forecasting and anomaly detection on time series.
https://unit8co.github.io/darts/
Apache License 2.0

[QUESTION] Static covariants have no effect on prediction with XGBoost model #2587

Closed skull3r7 closed 2 weeks ago

skull3r7 commented 3 weeks ago

Describe the issue linked to the documentation: I recreated the static covariates example from https://unit8co.github.io/darts/examples/15-static-covariates.html with the XGBoost model. There, the static covariates influence the prediction, depending on which values are chosen. I then tried to apply static covariates to the WeatherDataset. In that case the forecast does not change, regardless of which values are chosen for the static covariates. I expected the forecast to change at least slightly, but it comes out exactly the same (identical RMSE score).

The code:

    import matplotlib.pyplot as plt
    import pandas as pd

    from darts.datasets import WeatherDataset
    from darts.models import XGBModel
    from darts.dataprocessing.transformers import StaticCovariatesTransformer
    from darts.metrics import rmse

    series = WeatherDataset().load()
    target = series['p (mbar)'][:100]
    past_cov = series['rain (mm)'][:100]
    future_cov = series['T (degC)'][:106]
    transformer = StaticCovariatesTransformer()

    # Model without static covariates
    model = XGBModel(
        lags=12,
        output_chunk_length=6,
    )
    model.fit(target)
    pred = model.predict(20, series=target[:80])
    pred.values()
    target.plot()
    pred.plot()

    # Model with static covariates
    model = XGBModel(
        lags=12,
        output_chunk_length=6,
    )

    df = pd.DataFrame(data={'val1': [1000000], 'val2': [1006], 'val3': [0]})
    target = target.with_static_covariates(df)

    model.fit(target)
    pred2 = model.predict(20, series=target[:80])

    plt.show()

    print(pred.static_covariates)
    print(pred2.static_covariates)
    print(f'RMSE: {rmse(target, pred)}')
    print(f'RMSE: {rmse(target, pred2)}')

Additional context: darts==0.31.0

madtoinou commented 3 weeks ago

Hi @skull3r7,

I think the forecasts are exactly identical because the model does not find any useful information in the static covariates. By adapting the code in this user guide, I easily verified that the XGBModel does leverage the information in these covariates:

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    import darts.utils.timeseries_generation as tg
    from darts.models import XGBModel
    from darts import TimeSeries

    period = 20
    sine_series = tg.sine_timeseries(
        length=4 * period, value_frequency=1 / period, column_name="smooth", freq="h"
    )

    sine_vals = sine_series.values()
    linear_vals = np.expand_dims(np.linspace(1, -1, num=19), -1)

    # replace two stretches of the sine wave with a linear ramp
    sine_vals[21:40] = linear_vals
    sine_vals[61:80] = linear_vals
    irregular_series = TimeSeries.from_times_and_values(
        values=sine_vals, times=sine_series.time_index, columns=["irregular"]
    )
    sine_series.plot()
    irregular_series.plot()

    def test_case(model, train_series, predict_series):
        """helper function which performs model training, prediction and plotting"""
        model.fit(train_series)
        preds = model.predict(n=int(period / 2), num_samples=1, series=predict_series)
        for ts, ps in zip(train_series, preds):
            ts.plot()
            ps.plot()
            plt.show()
        return preds

    def get_model_params():
        """helper function that returns the model parameters"""
        return {
            "lags": int(period / 2),
            "output_chunk_length": int(period / 2),
        }

    train_series = [sine_series, irregular_series]
    for series in train_series:
        assert not series.has_static_covariates

    model = XGBModel(**get_model_params())
    # the model clearly struggles with the irregular series when no covariates are provided
    preds = test_case(
        model,
        train_series,
        predict_series=[series[:60] for series in train_series],
    )

    sine_series_st_bin = sine_series.with_static_covariates(
        pd.DataFrame(data={"curve_type": [1]})
    )
    irregular_series_st_bin = irregular_series.with_static_covariates(
        pd.DataFrame(data={"curve_type": [0]})
    )

    train_series = [sine_series_st_bin, irregular_series_st_bin]
    for series in train_series:
        print(series.static_covariates)

    model = XGBModel(**get_model_params())
    # the model predicts the two series perfectly when static covariates are used
    preds_st_bin = test_case(
        model,
        train_series,
        predict_series=[series[:60] for series in train_series],
    )

dennisbader commented 3 weeks ago

Hi @skull3r7, static covariates are only useful if you have multiple target series with different static cov values. Otherwise, with only one series, your static covs are constant and do not provide any useful information.
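
To illustrate the point above, here is a minimal NumPy-only sketch. It is not darts or XGBoost code: `best_split_gain` and `lagged_rows` are hypothetical helpers, and the single-threshold squared-error gain only stands in for how a tree-based regressor picks splits. With one series, the static column is constant in every training row, so no threshold can split it. Once a second series with a different static value is pooled in, the same column becomes splittable.

```python
import numpy as np


def best_split_gain(feature, y):
    """Largest reduction in squared error from one threshold split on `feature`."""
    base = np.sum((y - y.mean()) ** 2)
    best = 0.0
    values = np.unique(feature)
    # Candidate thresholds lie between consecutive distinct values;
    # a constant feature has none, so the loop body never runs.
    for thr in (values[:-1] + values[1:]) / 2:
        left, right = y[feature <= thr], y[feature > thr]
        sse = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        best = max(best, base - sse)
    return best


def lagged_rows(series, n_lags, static_value):
    """Tabularize one series: n_lags lag columns plus one repeated static column."""
    lags = np.column_stack(
        [series[i : len(series) - n_lags + i] for i in range(n_lags)]
    )
    static = np.full((len(lags), 1), float(static_value))
    return np.hstack([lags, static]), series[n_lags:]


t = np.arange(100)
series_a = np.sin(t / 5.0)
series_b = np.sin(t / 5.0) + 2.0  # assumed second series with a different level

# One series: the static column is identical in every row.
X_one, y_one = lagged_rows(series_a, n_lags=3, static_value=1006)
gain_static_single = best_split_gain(X_one[:, -1], y_one)
gain_lag_single = best_split_gain(X_one[:, -2], y_one)

# Two pooled series with different static values: the column now varies.
X_a, y_a = lagged_rows(series_a, n_lags=3, static_value=1)
X_b, y_b = lagged_rows(series_b, n_lags=3, static_value=0)
X_two = np.vstack([X_a, X_b])
y_two = np.concatenate([y_a, y_b])
gain_static_pooled = best_split_gain(X_two[:, -1], y_two)

print("single series, static column gain:", gain_static_single)
print("single series, last-lag column gain:", gain_lag_single)
print("pooled series, static column gain:", gain_static_pooled)
```

In the single-series case the static column's gain is exactly zero whatever value is chosen, which is consistent with the identical RMSE reported in the question.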