JuliaDynamics / TimeseriesPrediction.jl

Prediction of timeseries using methods of nonlinear dynamics and timeseries analysis
MIT License
103 stars 8 forks source link

Nearest Trajectory Strategy for Time Series Prediction #73

Open Datseris opened 5 years ago

Datseris commented 5 years ago

image

Currently the scheme for finding nearest neighbors works by not distinguishing neighbors based on where they are with respect to the existing dataset. Most of the time (and this is especially true for densely sampled data), nearest neighbors belong to a single or at best a couple of trajectory segments. This may mean that a specific trajectory segment is over-weighted for the prediction while other nearing segments are disregarded. The figure describes this perfectly.

It can be advantageous in some cases of timeseries predictions to expand the neighbor-finding strategy to "Nearest Trajectory". This process is described in the paper: "A Nearest Trajectory Strategy for Time Series Prediction" by James McNames. You can find the pdf here.

--- Want to back this issue? **[Post a bounty on it!](https://www.bountysource.com/issues/70441129-nearest-trajectory-strategy-for-time-series-prediction?utm_campaign=plugin&utm_content=tracker%2F89129719&utm_medium=issues&utm_source=github)** We accept bounties via [Bountysource](https://www.bountysource.com/?utm_campaign=plugin&utm_content=tracker%2F89129719&utm_medium=issues&utm_source=github).
Datseris commented 5 years ago

I do believe that it is not hard to implement a new neighborhood type that does this. It will be slow of course but it can work by finding some "test" neighbors given a point, and then for each of the neighbors check if they are close to each other in data indices. If yes then add more nearest neighbors incrementally.

Maybe the interface for this in NearestNeighbors is really poor though...

JonasIsensee commented 5 years ago

This has little to do with the interface. Your suggestion disregards the inner workings of the search algorithm.

The only way to incrementally add more nearest neighbors would be to restart the search with a larger k and consider the additional neighbors found.

It is likely much faster to just use a large k from the beginning so that researching becomes very rare.