unit8co / darts

A python library for user-friendly forecasting and anomaly detection on time series.
https://unit8co.github.io/darts/
Apache License 2.0

Covariance slicing issue still persists for the ARIMA (ARMA) Model. #1892

Closed yigitcancomlek closed 1 year ago

yigitcancomlek commented 1 year ago

It seems the covariance slicing issue still persists for the ARIMA (ARMA) model. I am still getting the same error with the code below:

from darts.models import ARIMA

# target_ts and covariates_ts are the user's own TimeSeries; train_cutoff is the split point
train_ts, val_ts = target_ts.split_before(train_cutoff)  # 40 samples for training, 150 samples for validation
train_val_cov = covariates_ts  # all training and validation covariates (190x4)

model = ARIMA(p=1, d=0, q=1)

model.fit(series=train_ts,
          future_covariates=covariates_ts)

predictions = model.predict(n=150,
                            series=train_ts,
                            future_covariates=covariates_ts,
                            num_samples=100)

This gives me the error below. Please let me know if I am making a mistake; if not, I believe the issue still persists. I have u8darts-all 0.24.0 installed in my conda environment.

ValueError: Provided exogenous values are not of the appropriate shape. Required (110, 9), got (150, 9).

Originally posted by @yigitcancomlek in https://github.com/unit8co/darts/issues/843#issuecomment-1631496171

dennisbader commented 1 year ago

Hi @yigitcancomlek, and thanks for raising this issue. This does indeed seem to be a bug. I'll add it to our backlog, and it should be fixed and released by the end of July.

Here is a fully reproducible example:

from darts.models import ARIMA
from darts import TimeSeries
from darts.utils import timeseries_generation as tg
import pandas as pd

target_ts = tg.linear_timeseries(start=pd.Timestamp("2000-01-01"), freq="D", length=190)
covariates_ts = target_ts

train_ts, val_ts = target_ts[:40], target_ts[40:]  # 40 samples for training, 150 samples for validation
train_val_cov = covariates_ts  # all training and validation covariates (190x1 in this example)

model = ARIMA(p=1, d=0, q=1)

model.fit(
    series=train_ts,
    future_covariates=covariates_ts
)

predictions = model.predict(
    n=150,
    series=train_ts,
    future_covariates=covariates_ts,
    num_samples=100
)

yigitcancomlek commented 1 year ago

Hello! Thank you very much for addressing the issue! As of right now, the reproducible code above still gives the same error for certain lengths (n > 40), even though future_covariates are available for n > 40. In addition, I believe there could be a scaling issue with the predictions when future_covariates are present: predictions made with future_covariates are much worse than those from vanilla ARMA. Even the very first step (n=1) is way off from the final value of train_ts.
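One way to rule out missing covariates as the cause is to check that every forecast timestamp has a matching covariate timestamp. The sketch below does this with plain Python dates rather than darts objects. The helper name `missing_covariate_steps` and the date lists are hypothetical stand-ins for the 190-step series in the example, not part of the darts API.

```python
from datetime import date, timedelta

def missing_covariate_steps(cov_dates, train_end, n, step=timedelta(days=1)):
    """Return the forecast timestamps that have no matching covariate timestamp."""
    available = set(cov_dates)
    # The n timestamps that immediately follow the end of the training series.
    needed = [train_end + step * k for k in range(1, n + 1)]
    return [t for t in needed if t not in available]

# Assumed setup mirroring the example: 190 daily steps, the first 40 used for training.
start = date(2000, 1, 1)
cov_dates = [start + timedelta(days=i) for i in range(190)]
train_end = cov_dates[39]

gaps = missing_covariate_steps(cov_dates, train_end, n=150)
print(gaps)  # an empty list means the covariates cover the full horizon
```

If this list is empty and the shape error still appears, the covariates themselves are not the problem.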

dennisbader commented 1 year ago

Yes, the code above is only there for developers to reproduce the bug; it is not a fix. The fix has been merged into the current master. #1893 explains what the issue was and how it was fixed.

The issue you're describing is most likely also fixed by #1893. Before the fix, ARIMA's simulate() was anchored at the start of the training series rather than at its end, so it used future covariates from too far in the past.
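The difference between the two anchor points can be shown with a toy example. This is only an illustration of the slicing logic under assumed inputs (190 daily steps, 40 of them for training, horizon 150). It is not the darts or statsmodels code, and both function names are hypothetical.

```python
from datetime import date, timedelta

# Assumed stand-ins for the series in the reproducible example.
start = date(2000, 1, 1)
covariate_dates = [start + timedelta(days=i) for i in range(190)]
train_dates = covariate_dates[:40]
n = 150  # forecast horizon

def slice_anchored_at_start(cov_dates, train, horizon):
    """Illustrative wrong anchoring: count forward from the first training step."""
    first = cov_dates.index(train[0])
    return cov_dates[first:first + horizon]

def slice_anchored_at_end(cov_dates, train, horizon):
    """Illustrative intended anchoring: take the steps right after the last training step."""
    first = cov_dates.index(train[-1]) + 1
    return cov_dates[first:first + horizon]

wrong = slice_anchored_at_start(covariate_dates, train_dates, n)
right = slice_anchored_at_end(covariate_dates, train_dates, n)

# The wrong slice starts inside the training period; the right one starts after it.
print("anchored at start:", wrong[0], "to", wrong[-1])
print("anchored at end:  ", right[0], "to", right[-1])
```

In this toy setup, the slice anchored at the start overlaps the whole training period, so the first forecast step would be paired with a covariate value from 40 days earlier than intended.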