Open Datseris opened 5 years ago
I do believe that it is not hard to implement a new neighborhood type that does this. It will be slow of course but it can work by finding some "test" neighbors given a point, and then for each of the neighbors check if they are close to each other in data indices. If yes then add more nearest neighbors incrementally.
Maybe the interface for this in NearestNeighbors
is really poor though...
This has little to do with the interface. Your suggestion disregards the inner workings of the search algorithm.
The only way to incrementally
add more nearest neighbors would be to restart
the search with a larger k
and consider the additional neighbors found.
It is likely much faster to just use a large k
from the beginning so that researching becomes very rare.
Currently the scheme for finding nearest neighbors works by not distinguishing neighbors based on where they are with respect to the existing dataset. Most of the time (and this is especially true for densely sampled data), nearest neighbors belong to a single or at best a couple of trajectory segments. This may mean that a specific trajectory segment is over-weighted for the prediction while other nearing segments are disregarded. The figure describes this perfectly.
It can be advantageous in some cases of timeseries predictions to expand the neighbor-finding strategy to "Nearest Trajectory". This process is described in the paper: "A Nearest Trajectory Strategy for Time Series Prediction" by James McNames. You can find the pdf here.