unit8co / darts

A python library for user-friendly forecasting and anomaly detection on time series.
https://unit8co.github.io/darts/
Apache License 2.0

multidimensional time series forecast #275

Closed szncoin closed 3 years ago

szncoin commented 3 years ago

Does darts support multi-dimensional time series forecast? That is, assume we have a time series of [feature1, feature2, feature3, target_variable], how to build forecast model to predict the 'target_variable' according to the three features? This is a bit different from traditional multivariate time series forecasting

LeoTafti commented 3 years ago

Hi @szncoin,

Darts does indeed support multi-dimensional (or "multivariate") time series with its neural-network-based forecasting models. This support was even improved recently, with v6.1!

The situation you describe would translate to having target_variable be your target time series, and [feature1, feature2, feature3] be so-called covariates. Assuming target_variable and covariates are both TimeSeries objects, you could use Darts in the following way:

model.fit(target_variable, covariates)
forecast = model.predict(20, target_variable, covariates)    # forecast 20 time-steps into the future

You can have a look at the section on covariates in this recent article here or in this example notebook here for more details.
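As a rough illustration of the target/covariates split described above, the sketch below separates rows of [feature1, feature2, feature3, target_variable] into a target sequence and a covariate sequence using plain Python lists. The row values and the helper name `split_target_and_covariates` are made up for illustration and are not part of Darts; with Darts you would then wrap each part in a TimeSeries object, which is not shown here.

```python
def split_target_and_covariates(rows, target_index=-1):
    """Split rows of [feature..., target] into (target, covariates).

    Hypothetical helper for illustration only; not part of Darts.
    """
    # Turn a negative index such as -1 into the equivalent positive position
    idx = target_index % len(rows[0])
    target = [row[idx] for row in rows]
    covariates = [[v for i, v in enumerate(row) if i != idx] for row in rows]
    return target, covariates


# Made-up data: each row is [feature1, feature2, feature3, target_variable]
rows = [
    [0.1, 1.0, 10.0, 5.0],
    [0.2, 1.1, 11.0, 5.5],
    [0.3, 1.2, 12.0, 6.0],
]

target, covariates = split_target_and_covariates(rows)
print(target)      # [5.0, 5.5, 6.0]
print(covariates)  # [[0.1, 1.0, 10.0], [0.2, 1.1, 11.0], [0.3, 1.2, 12.0]]
```

The target stays a sequence of scalars while each covariate entry is a vector of the three features at the same time step.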

szncoin commented 3 years ago

Hi @LeoTafti, thanks for your quick reply. It is clear that the "multidimensional time series" case can be implemented with covariates. I have another question about forecasting with covariates: Darts currently supports "past" covariates, but in my case I actually have the covariates "in advance", and my future target is predicted from them. Specifically, say a deep learning model is built using data (target and covariates TimeSeries objects) from t1 to t100, and I now want to predict the target time series from t120 to t150 (the covariates from t120 to t150 are already available). In this case, how do I pass the covariates to the model.predict() function? Previously it was model.predict(nstep, target_{t1..t100}, covariates_{t1..t100}); does it now become model.predict(nstep, target_{t1..t100}, covariates_{t120..t150})? The article you mentioned in your last post considers this situation, but how to shift the covariate time series into the past is not quite clear. Kindly advise on this issue.

LeoTafti commented 3 years ago

Hi @szncoin,

It's a good question. You are indeed correct that Darts currently only supports "past" covariates, and that any future covariates have to be shifted.

To do so, you can use the TimeSeries.shift(n:int) method (with n<0 to shift into the past), which lets you specify the number of time steps n you want to shift your time series by.
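To make the index arithmetic concrete, here is a minimal sketch in plain Python. It is not Darts' implementation: it models a series as a dict from integer time step to value, and `shift_steps` is a hypothetical stand-in that only relabels the time steps by n while keeping the values unchanged. The covariate values are made up, and the 20-step shift is an assumption chosen to match the gap between t100 and t120 in the question.

```python
def shift_steps(series, n):
    """Relabel every time step by n (n < 0 moves the series into the past).

    Hypothetical helper for illustration only; not part of Darts.
    """
    return {t + n: value for t, value in series.items()}


# Made-up covariate: the value at step t is 10 * t, known for t = 21..150
covariates = {t: 10 * t for t in range(21, 151)}

# Shift 20 steps into the past, so the value known for t + 20 is labelled t
shifted = shift_steps(covariates, -20)

print(min(shifted), max(shifted))  # 1 130
print(shifted[100])                # 1200, the value originally at t = 120
```

After the shift, the covariate that was known "in advance" for t120 lines up with the target at t100, so it can be fed to the model as if it were a past covariate.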

hrzn commented 3 years ago

To complement @LeoTafti's answer to your question, @szncoin: if you have at your disposal target[t1...t100] and covariates[t1...t150], and you want to predict target[t101...t150], you can try doing something like this:

model.fit(target, covariates=covariates[50:150])
pred = model.predict(n_step, series=target, covariates=covariates[50:150])

szncoin commented 3 years ago

@LeoTafti @hrzn thanks for your answers!

vl2376 commented 3 years ago

Hello,

Related to multidimensional time series forecasting: what if there are multiple targets? E.g., in electricity load forecasting, the targets would be the electricity loads in different locations (one series for each location), possibly sharing the same covariates (e.g. some weather data). I understand it's possible to pass multiple target series in the series argument, but will they interact with each other in the model?

hrzn commented 3 years ago

@vl2376 First, keep in mind that in Darts one time series can contain several dimensions (also called "multivariate"), which simply means that the value at each time stamp is a vector (and not a scalar). So in your case, if I understand correctly, if you want a model that forecasts all the locations "at once" from the covariates, you could opt for one series to represent the targets, with several dimensions in that series. E.g. dimension 0 would correspond to location 0, dimension 1 would correspond to location 1, etc.

Another option could be to train the model on several univariate series (one series of one dimension per location). In this latter case, you'll have more series to train on, and you'll also get one model that maps covariates to a location forecast, but this model will not capture the dependencies between locations.

So I would say it is hard to tell which option is better without looking at the problem in more detail: it depends on how many locations you have, how long the history is, and how independent from one another your locations are.
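The two data layouts discussed above can be sketched with plain Python lists. The location names and load values are made up for illustration, and no Darts objects are involved; the point is only the difference in shape between one multivariate series and several univariate series.

```python
# Made-up loads for three locations over four time steps
loads = {
    "location_0": [10.0, 11.0, 12.0, 13.0],
    "location_1": [20.0, 21.0, 22.0, 23.0],
    "location_2": [30.0, 31.0, 32.0, 33.0],
}

# Option 1: one multivariate series, one vector of length 3 per time step
multivariate = [list(step) for step in zip(*loads.values())]

# Option 2: several univariate series, one list of scalars per location
univariate_list = [list(values) for values in loads.values()]

print(multivariate[0])       # [10.0, 20.0, 30.0]
print(len(multivariate))     # 4 time steps
print(len(univariate_list))  # 3 series
```

In the first layout each training sample contains all locations together, which is what lets a model pick up dependencies between them; in the second, each location is a separate sample.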