accord-net / framework

Machine learning, computer vision, statistics and general scientific computing for .NET
http://accord-framework.net
GNU Lesser General Public License v2.1

Time series prediction sample #592

Open ragnar0K opened 7 years ago

ragnar0K commented 7 years ago

Hello, I am just wondering - maybe this is not an issue. When you do this in the samples (Resilient Backpropagation -> Sample data (Time Series))

    for (int i = 0, n = data.Length - windowSize; i < n; i++)
    {
        // put values from current window as network's input
        for (int j = 0; j < windowSize; j++)
        {
            networkInput[j] = (data[i + j] - yMin) * factor - 0.85;
        }
        // ...
    }

You are also using the prediction window as an input. This doesn't make it a prediction (rather, a deviation from the input). Training did not consider the prediction window, yet changing the CSV values within the prediction window changes the result...

Am I missing something? Sorry if that is the case.

penatolia commented 7 years ago

Hi,

I am wondering too. I am trying to create a prediction series using this method, but my network is not making a real prediction. Is there anything you can suggest?

@cesarsouza

cesarsouza commented 7 years ago

Hi @ragnar0K, @penatolia,

Somehow I missed this question when it was posted back in May, sorry about that. It seems there is indeed an issue with the time series sample application when using a prediction window larger than 1. However, for a prediction window of 1 the network is correct and making valid and real predictions. I will try to explain what is happening below.

In the Time Series sample, the network is trained to predict the next value that would come after a fixed window of N values has been seen. This is the simplest way to perform time-series prediction with a model that can only take a finite number of inputs. You can find more details about the method here:

If you are using a prediction window of 1, then the code is correct, the approach is valid, and the network is indeed making real predictions (about the immediate future: the very first next observation).
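
To make the single-step setup concrete, here is a minimal sketch of how the training pairs can be built from a series. `WindowedSamples.Build` is a hypothetical helper written for illustration, not part of the sample application or of Accord.NET, and it leaves out the scaling into the network's activation range that the sample performs.

```csharp
using System;

// Hypothetical helper for illustration; not part of Accord.NET.
public static class WindowedSamples
{
    // Builds (input, output) pairs: each input holds windowSize consecutive
    // values, and the matching output is the single value that follows them.
    public static void Build(double[] data, int windowSize,
                             out double[][] inputs, out double[] outputs)
    {
        int samples = data.Length - windowSize;
        if (windowSize < 1 || samples < 1)
            throw new ArgumentException("Series is too short for this window size.");

        inputs = new double[samples][];
        outputs = new double[samples];

        for (int i = 0; i < samples; i++)
        {
            // copy data[i .. i + windowSize - 1] into the i-th input vector
            inputs[i] = new double[windowSize];
            Array.Copy(data, i, inputs[i], 0, windowSize);

            // the target is the observation right after the window
            outputs[i] = data[i + windowSize];
        }
    }
}
```

For the series `{ 1, 2, 3, 4, 5 }` and a window of 2, this yields the inputs `{1, 2}`, `{2, 3}`, `{3, 4}` with the targets `3`, `4`, `5`. No target is ever part of its own input window, which is why the single-step case is a genuine prediction.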

However, it seems there is indeed a problem in the sample application when we set the prediction window to be anything higher than 1. The TimeSeries prediction examples were ported from the original Time Series sample applications from AForge.NET, and have been left largely unchanged since then. The behavior we see is also present in the original application over there:

There are two alternatives we could consider to support larger prediction windows:

1) The network can still be created with N inputs and a single output, where N is the width of the input window. However, we would need to change the evaluation code to put values from both the current window as well as the latest network outputs in the input until the length of the prediction window is exhausted.

2) The network could be created with N inputs and M outputs, where N is the width of the input window and M is the width of the prediction window. We would need to change the way the output data is prepared from:

    // set input
    for (int j = 0; j < windowSize; j++)
        input[i][j] = (data[i + j] - yMin) * factor - 0.85;
    // set output
    output[i][0] = (data[i + windowSize] - yMin) * factor - 0.85;

to something like:
```csharp
// set input
for (int j = 0; j < windowSize; j++)
    input[i][j] = (data[i + j] - yMin) * factor - 0.85;
// set output
for (int j = 0; j < predictionSize; j++)
    output[i][j] = (data[i + windowSize + j] - yMin) * factor - 0.85;
```
and then change the way the network is evaluated as in item 1.
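
The evaluation change described in item 1 could look roughly like the sketch below. It is only an illustration under stated assumptions: `RecursiveForecast` is a hypothetical helper, and `predictNext` stands in for the trained single-output network (in the sample, something like `w => network.Compute(w)[0]`, with inputs and outputs scaled the same way as during training).

```csharp
using System;

// Hypothetical helper for illustration; not part of Accord.NET.
public static class RecursiveForecast
{
    // Predicts `steps` future values with a single-output model by feeding
    // each prediction back into the input window.
    public static double[] Predict(Func<double[], double> predictNext,
                                   double[] lastWindow, int steps)
    {
        // work on a copy so the caller's window is left untouched
        double[] window = (double[])lastWindow.Clone();
        double[] forecast = new double[steps];

        for (int s = 0; s < steps; s++)
        {
            double next = predictNext(window);
            forecast[s] = next;

            // slide the window left by one position and append the prediction
            Array.Copy(window, 1, window, 0, window.Length - 1);
            window[window.Length - 1] = next;
        }

        return forecast;
    }
}
```

With a toy model such as `w => w[w.Length - 1] + 1` and a last window of `{ 1, 2, 3 }`, three steps give `4`, `5`, `6`. After the first step the model only ever sees its own earlier outputs, never the true values inside the prediction window, so editing those values in the CSV can no longer change the forecast.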

However, ideally, this mechanism should be implemented as a separate class (e.g. let's call it TimeSeriesLearning) that could be used to train any IClassifier<TInput, int> using this time unrolling technique. This would allow training logistic regression, SVMs and others for time series prediction.

If anybody would like to send a pull request, I would be happy to review it (even if it's just to fix the sample application code, instead of adding this new TimeSeriesLearning class).

Regards, Cesar