unit8co / darts

A python library for user-friendly forecasting and anomaly detection on time series.
Apache License 2.0

Using only covariates for PyTorch (Lightning)-based Models #2151

Open FelixSaretzky opened 8 months ago

FelixSaretzky commented 8 months ago

For regression models that rely purely on covariate values, one can use lags=None. Is there a similar option for PyTorch-based models like BlockRNN?

Given is a dataframe:

Timestamp | Temperature | Humidity | Torque | Failure
2023-08-17 16:48:00 | 24 | 54 | 23 |  1

This dataframe was divided into run-to-failure time series, for instance series_1 = 200 rows of data until Failure == 1, series_2 = 1000 rows, and so on.
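For readers who want to reproduce the split, here is a minimal sketch of one way to cut a single dataframe into run-to-failure segments with pandas. It is not the code used in this issue: the helper name `split_run_to_failure`, the `series_N` keys, and the toy data are all hypothetical, and it assumes the `Failure` column is 0/1 with the failure row belonging to the run it ends.

```python
import pandas as pd


def split_run_to_failure(df: pd.DataFrame, failure_col: str = "Failure") -> dict:
    """Split df into consecutive runs, each ending at a row where failure_col == 1.

    Hypothetical helper for illustration only.
    """
    # The cumulative failure count increases on each failure row; shifting it by
    # one row keeps the failure row inside the run that it terminates.
    run_id = df[failure_col].cumsum().shift(1, fill_value=0)
    return {
        f"series_{i + 1}": group.reset_index(drop=True)
        for i, (_, group) in enumerate(df.groupby(run_id))
    }


# Toy data (made up): two complete runs and one trailing run without a failure.
demo = pd.DataFrame(
    {
        "Timestamp": pd.date_range("2023-08-17 16:48:00", periods=6, freq="min"),
        "Temperature": [24, 25, 26, 24, 27, 23],
        "Failure": [0, 0, 1, 0, 1, 0],
    }
)
runs = split_run_to_failure(demo)
run_lengths = {name: len(run) for name, run in runs.items()}
```

Rows after the last failure end up in a final, incomplete run; whether to keep or drop that run depends on how the target is defined.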


from darts import TimeSeries
from darts.models import LightGBMModel

time_series_pairs = {}

for key, df in processed_series_dict.items():
    # Create TimeSeries for the target
    target_series = TimeSeries.from_dataframe(df, value_cols=['rul'])

    # Create TimeSeries for the covariates
    covariate_series = TimeSeries.from_dataframe(df, value_cols=['Temperature', 'sensor1', 'sensor3'])

    # Store the pair in the new dictionary
    time_series_pairs[key] = (target_series, covariate_series)

# Extracting targets and covariates into separate lists
targets = [pair[0] for pair in time_series_pairs.values()]
covariates = [pair[1] for pair in time_series_pairs.values()]

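# Train a LightGBMModel on past covariates only (lags=None means no target lags).
# `targets` and `covariates` are the lists built above; substitute a training
# split here if you have one.
Baseline_LightGBMModel = LightGBMModel(lags=None, lags_past_covariates=20, output_chunk_length=1)
Baseline_LightGBMModel.fit(series=targets, past_covariates=covariates)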

Is this approach basically correct and can it be applied to PyTorch models such as BlockRNN?

madtoinou commented 8 months ago

Hi @FelixSaretzky,

This is not yet supported for torch models because of the way their data is tabularized, which is currently less flexible than for regression models. Depending on the model, though, it should not be too difficult to change the architecture (and batch handling) so that it does not try to access the lags of the target.

I'm adding this to the roadmap; it could probably be implemented together with #1406.
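To make the covariates-only setup concrete, the following is a rough conceptual sketch of what a regression model with lags=None and lags_past_covariates=n sees as training data: each row holds the previous n covariate values, and the label is the target at the current step. This is not the darts implementation; the function name `covariate_only_table` and the column ordering of the features are assumptions made for illustration.

```python
import numpy as np


def covariate_only_table(target, covariates, n_lags):
    """Build (X, y) where X holds only lagged covariates, never lagged targets.

    Hypothetical helper for illustration; not how darts tabularizes internally.
    """
    target = np.asarray(target, dtype=float)
    covariates = np.asarray(covariates, dtype=float)
    if covariates.ndim == 1:
        covariates = covariates[:, None]  # treat a 1-D input as a single component

    X, y = [], []
    for t in range(n_lags, len(target)):
        # Features: covariate values at steps t - n_lags, ..., t - 1 (flattened).
        X.append(covariates[t - n_lags:t].ravel())
        # Label: the target at step t (one-step output chunk).
        y.append(target[t])
    return np.array(X), np.array(y)


# Toy data (made up): 10 time steps, 2 covariate components, 3 lags.
toy_target = np.arange(10)
toy_covariates = np.arange(20).reshape(10, 2)
X, y = covariate_only_table(toy_target, toy_covariates, n_lags=3)
```

Because no column of X comes from the target, such a model can predict the target from covariate history alone, which is the behaviour being requested here for torch models.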

FelixSaretzky commented 8 months ago

Hi @madtoinou,

Thanks for your fast response! Is my approach for the multiple time series above correct?

madtoinou commented 8 months ago

Yes, your approach looks correct for multiple series training.