unit8co / darts

A python library for user-friendly forecasting and anomaly detection on time series.
https://unit8co.github.io/darts/
Apache License 2.0

[Question] How to assess if a model supporting (past) covariates understands its implications (e.g., no units in stock translates into no sales). #2099

Open fmerinocasallo opened 12 months ago

fmerinocasallo commented 12 months ago

Hi there! :)

First, thank you very much for developing Darts and making it open-source. I am currently trying to assess whether several predictive models from Darts supporting past/future covariates (e.g., BlockRNNModel, NHiTS, TransformerModel, TFTModel, TiDEModel) understand the implications of such additional data. In particular, the target I want to predict is units sold, and I am using availability data as covariates. In this context, I expect predictive models to realize that we cannot sell a single unit as long as none are available in stock.

For predictive models supporting future covariates, I should be able to easily verify that these models have understood this "law" simply by checking if their predictions during the validation period do not include any sale while a given product is out of stock.

How could I assess if predictive models supporting only past covariates comprehend this "law"? Is it possible for the model to make predictions over the training period?

Thank you in advance for considering my issue.

Kind regards, Fran

dennisbader commented 12 months ago

Hi @fmerinocasallo, not 100% sure I understand your question properly :)

Is it possible for the model to make predictions over the training period?

fmerinocasallo commented 12 months ago

Hey @dennisbader, sorry for not being clearer in my previous comment :sweat_smile: Let me try again :pray: .

I have the following data:

I aim to predict units sold in the future. Ideally, predictive models supporting past/future covariates should realize that whenever no units are available, none can be sold. Still, we may have units in stock and not sell any anyway. I will refer to this "law" as the "availability law" from now on.

First, let's see if my assumptions are valid:

  1. Predictive models that only support past covariates would use daily availability only during training. However, they cannot use this additional info to predict the future. In this case, would these predictive models assume a future scenario where units are always available to predict the future?
  2. Predictive models supporting future covariates would use daily availability during training and predicting into the future.

I want to check whether predictive models supporting past or future covariates do realize the availability law. For predictive models supporting future covariates, this assessment should be straightforward:

  1. Create a new instance of the model (e.g., TFTModel).
  2. Train the new instance using the training data split and the corresponding covariates.
  3. Predict units sold into the future (e.g., n = 30 [days]) considering the corresponding covariates. We could use the corresponding covariates from the validation split. We may also assume continuous availability when predicting the unknown future. Something similar to:
model = TFTModel(
    input_chunk_length=6,
    output_chunk_length=6,
)
model.fit(target, future_covariates=future_cov)
# Is `y_pred = model.predict(n=30)` equivalent to the following statement?
y_pred = model.predict(n=30, future_covariates=future_cov)

# Compare values over the forecast window; TimeSeries do not support
# boolean masking directly, so work on the underlying arrays.
cov_vals = future_cov.slice_intersect(y_pred).values()
if ((cov_vals == 0) & (y_pred.values() != 0)).any():
    understanding = False
else:
    understanding = True
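Stripped of the Darts plumbing, the check itself reduces to simple array logic. A self-contained NumPy sketch, where the arrays are made-up stand-ins for `y_pred.values()` and the aligned availability covariate slice:

```python
import numpy as np

# Hypothetical stand-ins for a 5-day forecast and the aligned
# availability covariate (0 = out of stock).
y_pred_values = np.array([5, 0, 0, 3, 2])
availability = np.array([1, 0, 0, 1, 1])

# Violations: days where nothing was in stock but sales were predicted.
violations = (availability == 0) & (y_pred_values != 0)
understands_law = not violations.any()
print(understands_law)  # True: no sales predicted while out of stock
```

The same mask would flag `True` entries wherever the model predicts sales on out-of-stock days, so `violations.sum()` also gives a count of how often the law is broken.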

My question is associated with predictive models supporting only past covariates. In this case, how could we assess if the predictive model is realizing the availability law? I would start as before:

  1. Create a new instance of the model (e.g., NHiTS, TransformerModel).
  2. Train the new instance using the training data split and the corresponding covariates.
  3. :warning: Here comes my issue. As these models do not consider covariates when predicting the future, I cannot use the predict method for this assessment. Could I ask these predictive models to make predictions into the past (i.e., during the training period, when they did consider the availability info)? This way, I could check whether they predicted no units sold on days when no units were available in stock.

Sorry for the long explanation :sweat_smile: Hopefully, this time I made myself clear :crossed_fingers:

PS: Although the README.md has a table where NHiTS and TransformerModel seem to only support past covariates, their predict methods do have a future_covariates parameter. Am I missing something here?

dennisbader commented 12 months ago

Okay thanks, I understand now :) First for the PS: this is just to have a unified API for the TorchForecastingModel (our neural network-based models). It will raise an error when trying to pass future covariates to a model that doesn't support them.

Regarding covariates usage for the TorchForecastingModel: This user guide describes everything about how your input data is used for model training and prediction.

The models train and predict on fixed-length input (past/history) and output (future) chunks. At prediction time you can pass any input series and covariates to predict(). The model will then extract the input & output chunks from the end of your target series and align the covariates accordingly.

So when you predict with a model that only supports past covariates, it will still receive the input_chunk_length last values of your past_covariates series as historic input. If the last past covariates value is 0 (not in stock) then you can assess whether the model respects your "availability law" by checking if the prediction is 0 units sold.

If you know beforehand that your unit is out of stock the next day, then you could shift the in-stock values by one day. That way, you include a future input into your covariates, and the model will receive it as a "historic" input.
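The shift can be sketched with pandas alone (the availability values and daily frequency here are made up for illustration; the result could then be wrapped with darts' `TimeSeries.from_series`):

```python
import pandas as pd

# Hypothetical daily availability: 1 = in stock, 0 = out of stock.
idx = pd.date_range("2023-01-01", periods=5, freq="D")
availability = pd.Series([1, 1, 0, 0, 1], index=idx)

# Shift values back by one day: the value for 2023-01-03 now sits at
# 2023-01-02, so "tomorrow is out of stock" becomes a historic input.
shifted = availability.shift(-1)

print(shifted.loc["2023-01-02"])  # 0.0: the out-of-stock day ahead, seen one day earlier
```

Note that the last timestamp ends up without a value (NaN) after the shift, so it would need to be dropped or filled before being handed to a model.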

I hope this cleared things up. Let me know if not.

fmerinocasallo commented 11 months ago

Let's see if I understand your suggestion to test my "availability law" for predictive models supporting only past covariates:

  1. Create a new instance of the model (e.g., NHiTS, TransformerModel).
  2. Train the new instance using the training data split and the corresponding past covariates.
  3. Predict units sold into the future (e.g., n = 30 [days]) considering the corresponding past covariates. We could pass a different past_covariates series to predict() with at least the last input_chunk_length values set to 0 (a fully zeroed TimeSeries should also serve our purposes, shouldn't it?). Something similar to:
    
    model = NHiTSModel(
        input_chunk_length=6,
        output_chunk_length=6,
    )
    model.fit(target, past_covariates=past_cov)

    idx = pd.date_range(past_cov.start_time(), periods=past_cov.n_timesteps, freq="D")
    ts = pd.Series(np.zeros_like(idx, dtype=int), index=idx)
    past_cov_zeroed = TimeSeries.from_series(ts)
    y_pred = model.predict(n=30, series=target, past_covariates=past_cov_zeroed)

    # Since past_cov_zeroed is all zeros, the availability law holds
    # iff no sales at all are predicted.
    if (y_pred.values() != 0).any():
        understanding = False
    else:
        understanding = True

fmerinocasallo commented 11 months ago

I did some rereading and rethinking about this issue. Currently, my *_covariates TimeSeries stores the availability for the current date. Thus, *_cov['2023-01-01'] == 0 means that there were no units in stock on the 2023-01-01 (at 0:00).

However, when reading this user guide about covariates usage for the TorchForecastingModel (thanks @dennisbader for the link! :ok_hand:), I realized that I may have expected the impossible here :sweat_smile:.

Please, correct me if I missunderstood any of the following:

  1. Predictive models based on TorchForecastingModel only look at the input_chunk data points from the current i-th sample of both my target and the corresponding *_covariates series to predict the output_chunk of the target's next (i+1)-th sequence.
  2. Still, the only availability data points that are relevant when predicting the output_chunk of the target's next (i+1)-th sequence are stored in the output_chunk of the *_covariates's next (i+1)-th sequence :grimacing:
     2.1. Therefore, predictive models based on TorchForecastingModel (and probably any other predictive model available in Darts, for that matter) never look at the only data points from the *_covariates series that are relevant to my availability law, as those do not fall into the input_chunk time span.
  3. Thus, I cannot expect any predictive model based on TorchForecastingModel to realize my availability law :exploding_head:

To resolve this issue:

  1. I could shift my availability values by one day (as suggested by @dennisbader).
     1.1. Then, *_cov['2023-01-01'] == 0 would mean that there were no units in stock on 2023-01-01 at 23:59 instead of 0:00, or on 2023-01-02 at 0:00. This way, the predictive model would receive it as a "historic" input.

:warning: Note, however, that this solution would only be valid if my output_chunk_length == 1, unless I define a new pair of classes based on darts.utils.data.TrainingDataset and darts.utils.data.InferenceDataset to specify how to slice the data (target and *_covariates series) into training and inference samples.

  2. A more general alternative would be to shift my availability values by output_chunk_length days.
     2.1. This would result in all the data points from the *_covariates series that are relevant to my availability law falling into the input_chunk time span, as long as input_chunk_length >= output_chunk_length.
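The alignment condition in point 2.1 can be checked with a little index arithmetic. A minimal sketch (the helper `forecast_period_visible` and the chunk lengths are hypothetical, not Darts API): timesteps are taken relative to the forecast point t = 0, with the input chunk covering [-input_chunk_length, 0) and the forecast chunk covering [0, output_chunk_length).

```python
def forecast_period_visible(icl: int, ocl: int, shift: int) -> bool:
    """True if shifting covariate values back by `shift` steps puts every
    forecast-period timestep inside the input chunk [-icl, 0)."""
    shifted_positions = [k - shift for k in range(ocl)]
    return all(-icl <= p < 0 for p in shifted_positions)

print(forecast_period_visible(icl=6, ocl=6, shift=6))  # True: icl >= ocl
print(forecast_period_visible(icl=6, ocl=6, shift=1))  # False: only one future step is visible
print(forecast_period_visible(icl=3, ocl=6, shift=6))  # False: icl < ocl, early steps fall off
```

This reproduces the condition above: a shift of output_chunk_length exposes the whole forecast period precisely when input_chunk_length >= output_chunk_length.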

Am I still missing something? :thinking:


Update: I am currently considering not using output_chunk_length > 1, as defining a new pair of classes based on darts.utils.data.TrainingDataset and darts.utils.data.InferenceDataset to correctly integrate availability data into predictive models supporting only past_covariates for this specific use case seems a little bit hacky :sweat_smile: Still, any input on this issue is more than welcome :relieved:

madtoinou commented 2 months ago

Hi @fmerinocasallo,

Sorry for the delay, I am going to try to answer you point by point, as it might also help other users:

Statements:

  1. This is correct, input_chunk_length values of target and past covariates, output_chunk_length for the future covariates.
  2. Because of the distinction between past/future, you can actually encode the information of the availability of any given product for the forecasted period in the future covariates. Your statement is correct for the models supporting only past covariates.
  3. Since "availability" is something bound to the future, the TorchForecastingModels supporting only past covariates won't be able to take into account the availability law out of the box, indeed.

Mitigations:

  1. Shifting the information by 1 unit of frequency would effectively allow models supporting past covariates to have insight into the availability of the product. This approach is valid regardless of output_chunk_length value, since TorchForecastingModels look at input_chunk_length values in the past covariates. Hence, you should not have to modify the TrainingDataset class.
  2. I am not sure that shifting availability by output_chunk_length is necessary; this would mean zeroing a lot of values, whereas shifting by just 1 step already allows the model to know that availability is compromised just before the forecasted period. And if you want to trick the model into having access to more information (the whole forecasted period, for example), keep in mind that it will only look at input_chunk_length values, so that is the shift you should apply to the values (as it might be smaller than output_chunk_length).
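One way to see the cost mentioned in point 2: shifting covariate values back by s steps leaves the last s timestamps without a value, so a large shift discards more usable covariate history. A pandas sketch (the series is a made-up stand-in for an availability covariate):

```python
import pandas as pd

# Hypothetical 10-day covariate series.
idx = pd.date_range("2023-01-01", periods=10, freq="D")
availability = pd.Series(range(10), index=idx)

# A backward shift of s steps leaves the last s entries NaN.
for s in (1, 6):
    lost = availability.shift(-s).isna().sum()
    print(s, lost)  # prints "1 1" then "6 6"
```

With a shift of 1, only one trailing value is lost; with a shift of output_chunk_length (6 here), six values are lost, which is the trade-off being weighed against giving the model a deeper look into the forecast period.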

Since a lot of time has gone by, do you have any results or feedback to share about the implementation and the capabilities of each model to understand this availability law? Sorry we took so long to get back to you; I hope it did not block you too much.