pratham16cse / AggForecaster

Code for "Coherent Probabilistic Aggregate Queries on Long-horizon Forecasts", IJCAI 2022
https://arxiv.org/pdf/2111.03394v1.pdf

Forecast method #33

Closed HrudayR closed 2 years ago

HrudayR commented 2 years ago

As per my understanding of your paper ('Long Range Probabilistic Forecasting in Time-Series using High Order Statistics'), you take multiple (two) averages of the original series, and the Informer model is then trained independently on these averaged series as well as on the original series. I have a couple of questions here.

  1. How many Informer models are being trained? That is, if we have 3 time series (2 aggregate and 1 original), does a single Informer model get trained on these 3 time series independently, or does each time series get its own Informer model?

  2. Once the model is trained, a new multivariate Gaussian is defined with a mean and a covariance. How exactly is this mean calculated?

pratham16cse commented 2 years ago

Small clarification: the model trained on the time series is not Informer; it is inspired by Informer and by another paper, "Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting". (We also include Informer as a baseline; however, the script currently does not run it by default.)

Onto your questions:

  1. How many Informer models are being trained? That is, if we have 3 time series (2 aggregate and 1 original), does a single Informer model get trained on these 3 time series independently, or does each time series get its own Informer model?

Each time series gets its own model.

  2. Once the model is trained, a new multivariate Gaussian is defined with a mean and a covariance. How exactly is this mean calculated?

The model predicts the mean vector, and we then update that mean vector to establish coherency between the original and aggregate predictions. The model itself does not predict the covariance matrix, because accurately estimating a covariance matrix is data-hungry; instead, we obtain the covariance matrix by solving Eqn. 2.9 in our paper.
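To make the "update the mean vector" idea concrete, here is a minimal hypothetical sketch. It is not the paper's exact procedure (which involves solving Eqn. 2.9); it only illustrates coherency for a single window-mean aggregate by spreading each window's discrepancy equally over its K base steps, which is the minimal L2 adjustment. The helper name `make_coherent` is invented for illustration.

```python
import numpy as np

def make_coherent(base_mean, agg_mean, K):
    """Adjust base-level means so each window of K steps averages to the
    corresponding aggregate forecast (minimal L2 adjustment).

    Hypothetical helper for illustration; not taken from the repo.
    """
    base = np.asarray(base_mean, dtype=float).reshape(-1, K)
    # Discrepancy between the aggregate forecast and each window's mean.
    delta = np.asarray(agg_mean, dtype=float) - base.mean(axis=1)
    # Spread each window's discrepancy equally across its K steps.
    return (base + delta[:, None]).ravel()

adjusted = make_coherent([1.0, 2.0, 3.0, 7.0, 8.0, 9.0], [2.5, 8.5], K=3)
# Each window of `adjusted` now averages to the corresponding aggregate forecast.
```

After this update, averaging the adjusted base forecasts reproduces the aggregate forecasts exactly, which is what coherency requires for a mean aggregate.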

I hope this is clear.

Thank you for your interest. Prathamesh Deshpande

bcjanmay commented 2 years ago

Hello Prathamesh,

First of all, very good work, thank you for releasing the code, greatly appreciated.

Adding on to the first question asked here (we are collectively working on understanding your paper):

Q1) I have a time series for which I want to get the average and trend aggregates. Can you please point me to the section of the code where this happens?

Q2) The goal of your paper lies in the pre-processing of the input dataset, correct?

Q3) From Algorithm 2.1, can you please help me make sense of the number of data points seen after aggregation? For example: if the input has ten data points and the horizon is two, and 2, 4, 6, 8, 10, 12, 14, 16, 18, 20 is the hourly time series for ten hours, how would the average and trend aggregates look in such a case?

Please let me know, thank you.

pratham16cse commented 2 years ago

Q1) I have a time series for which I want to get the average and trend aggregates. Can you please point me to the section of the code where this happens?

Lines 537-543 of utils.py compute the aggregates, depending on the values of self.aggregation_type and self.K. However, this particular snippet is written to generate the aggregates that are later used to create training, validation, and test batches of the aggregated data.

If you only want to compute the aggregates, a better way would be to write a wrapper over the aggregate_window function.
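As a rough illustration of what such a wrapper could compute, here is a standalone sketch. The function name `aggregate_series` is hypothetical, and the actual `aggregate_window` signature in utils.py may differ; the windowing and the averaging-under-`'sum'` behavior are inferred from the worked example later in this thread.

```python
import numpy as np

def aggregate_series(series, aggregation_type, K):
    """Hypothetical standalone wrapper; utils.py's aggregate_window may differ.

    Splits `series` into non-overlapping windows of K points (dropping any
    leftover points from the front) and reduces each window to one value.
    """
    series = np.asarray(series, dtype=float)
    windows = series[len(series) % K:].reshape(-1, K)
    if aggregation_type == 'sum':
        # Despite the name, each window is averaged (per the example below).
        return windows.mean(axis=1)
    if aggregation_type == 'slope':
        # Least-squares slope of each window against t = 0, 1, ..., K-1.
        t = np.arange(K)
        return np.array([np.polyfit(t, w, 1)[0] for w in windows])
    raise ValueError(f"unknown aggregation_type: {aggregation_type}")
```

For instance, `aggregate_series([2, 4, 6, 8, 10, 12, 14, 16, 18, 20], 'sum', 3)` returns `[6., 12., 18.]`.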

Q2) The goal of your paper lies in the pre-processing of the input dataset, correct?

Our goal is a forecasting method whose forecasts are coherent across various levels and types of aggregation; we achieve this coherency through our novel inference method. The pre-processing is standard: we apply z-score normalization before training and un-normalize after forecasting, and we apply similar pre-processing to the aggregated data as well.
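For completeness, a generic sketch of that z-score step (not the repo's exact code; the values here are dummy data):

```python
import numpy as np

# z-score normalization before training, un-normalization after forecasting.
train = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
mu, sigma = train.mean(), train.std()

normalized = (train - mu) / sigma        # what the model sees
forecast_z = np.array([1.5, 2.0])        # dummy model output in z-space
forecast = forecast_z * sigma + mu       # back on the original scale
```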

Q3) From Algorithm 2.1, can you please help me make sense of the number of data points seen after aggregation? For example: if the input has ten data points and the horizon is two, and 2, 4, 6, 8, 10, 12, 14, 16, 18, 20 is the hourly time series for ten hours, how would the average and trend aggregates look in such a case?

The aggregated series with aggregation_type='sum' and self.K=3 would look something like this:

# With aggregation_type='sum' and self.K=3
# (despite the name, each window of K points is averaged)
6, 12, 18
# 6 is the average of (4, 6, 8), 12 is the average of (10, 12, 14), and so on.
# The leftover first point (2) is dropped so the length is a multiple of K.

# With aggregation_type='slope' and self.K=3 (slope within each window of K points)
2, 2, 2
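These numbers can be reproduced in a few lines of NumPy (assuming, as the windows above suggest, that leftover points are dropped from the front so the length is a multiple of K, and that the trend is a least-squares slope per window):

```python
import numpy as np

series = np.array([2, 4, 6, 8, 10, 12, 14, 16, 18, 20], dtype=float)
K = 3

# Non-overlapping windows: (4,6,8), (10,12,14), (16,18,20)
windows = series[len(series) % K:].reshape(-1, K)

mean_agg = windows.mean(axis=1)     # average aggregate
t = np.arange(K)
slope_agg = np.array([np.polyfit(t, w, 1)[0] for w in windows])  # trend aggregate
```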

One more note: after aggregating the data, the batching procedure at line 686 computes the batches that are then fed to the model.