winedarksea / AutoTS

Automated Time Series Forecasting
MIT License
1.05k stars 96 forks

[Library Info] Multivariate forecasting #182

Closed: gabrielefantini closed this issue 1 year ago

gabrielefantini commented 1 year ago

Hi, I'm trying to do multivariate forecasting. Is it just a matter of passing multiple time series as input and then specifying "grouping_ids" in the .fit method to put emphasis on the most important one? I'm also trying to process multiple (decorrelated) time series in parallel. What is the best way to do that?

Thanks in advance, Gabriele.

winedarksea commented 1 year ago

It's not grouping_ids you want but the weights argument of .fit(), which accepts a dictionary of column_name: weight.

As for the second question, I don't quite understand. What sort of processing are you talking about?
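A minimal sketch of passing such a dictionary, assuming a wide DataFrame (DatetimeIndex, one column per series). The column names, the helper fit_with_weights, and the small search settings are hypothetical and only there to keep the example short:

```python
# Hypothetical column names; replace them with the columns of your own DataFrame.
# Series that are not listed are expected to keep the default weight of 1.
series_weights = {"product_a": 5, "product_b": 1}


def fit_with_weights(df, weights, forecast_length=14):
    """Fit AutoTS on a wide DataFrame, weighting some series more heavily."""
    from autots import AutoTS  # lazy import; requires `pip install autots`

    model = AutoTS(
        forecast_length=forecast_length,
        frequency="infer",
        max_generations=2,  # assumption: kept small so the sketch runs quickly
        num_validations=1,
    )
    # Wide-format input does not need date_col / value_col / id_col.
    return model.fit(df, weights=weights)
```

Calling fit_with_weights(df, series_weights).predict().forecast should then return a DataFrame of point forecasts with one column per series.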

gabrielefantini commented 1 year ago

Hello, thank you for your response. So, when dealing with multivariate time series, the key is to assign greater weight to the most significant one? Regarding my second question, I was wondering if the library provides a method for accelerating the forecasting process for a large number of decorrelated time series.

winedarksea commented 1 year ago

For both questions, ensemble=['horizontal-max', 'simple'] is a good idea. 'Horizontal' here means that a different model is chosen for each series. This works well on both correlated and uncorrelated collections of time series, and in many ways it bypasses the need for series weighting: with a horizontal ensemble, the weights only guide the overall search, not the final model selection. Also, weights has some aliases. You can pass 'mean' to have it automatically weight each series by its mean, so series with larger values are favored (i.e. your most valuable products, in many cases).
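Roughly, that combination might look like the sketch below. It again assumes a wide DataFrame. The helper name fit_horizontal and the reduced max_generations are illustrative assumptions, not recommended settings:

```python
# Ensemble types suggested above: a per-series ("horizontal") ensemble plus a simple one.
ENSEMBLE_TYPES = ["horizontal-max", "simple"]


def fit_horizontal(df, forecast_length=14):
    """Search with horizontal ensembling, weighting each series by its mean value."""
    from autots import AutoTS  # lazy import; requires `pip install autots`

    model = AutoTS(
        forecast_length=forecast_length,
        frequency="infer",
        ensemble=ENSEMBLE_TYPES,
        max_generations=2,  # assumption: shortened search for illustration only
    )
    # The 'mean' alias replaces an explicit {column_name: weight} dictionary.
    return model.fit(df, weights="mean")
```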

There is also a parameter called subset, which makes the model run the search on only a portion of the series. For example, if you are running 1000 series, you could pass subset=100. Usually, 100 time series will be more than enough to represent the population, and the models chosen on those 100 will work just as well for the larger population while being faster to search.
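As a sketch, subset is set when constructing the model rather than in .fit(). The helper pick_subset_size and its cap of 100 are hypothetical conveniences that follow the rule of thumb above. They are not part of the library:

```python
def pick_subset_size(n_series, cap=100):
    """Return a subset size, or None (no subsetting) when the data is already small."""
    return cap if n_series > cap else None


def fit_on_subset(df, forecast_length=14, cap=100):
    """Run the model search on at most `cap` series of a wide DataFrame."""
    from autots import AutoTS  # lazy import; requires `pip install autots`

    model = AutoTS(
        forecast_length=forecast_length,
        frequency="infer",
        ensemble=["horizontal-max", "simple"],
        subset=pick_subset_size(df.shape[1], cap),  # df.shape[1] = number of series
    )
    return model.fit(df)
```

With 1000 columns this passes subset=100. With 50 columns it passes None, so every series takes part in the search.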

gabrielefantini commented 1 year ago

Thanks a lot!