lmcinnes / umap

Uniform Manifold Approximation and Projection
BSD 3-Clause "New" or "Revised" License
7.34k stars 798 forks source link

Add Dynamic Time Warping (DTW) distance #310

Open candalfigomoro opened 4 years ago

candalfigomoro commented 4 years ago

I think Dynamic Time Warping (DTW) distance would be a nice addition when we deal with time series.

While searching for UMAP + DTW on Google I found this implementation: https://gist.github.com/kylemcdonald/76b6f18fb4026e01196282b59bd31e7e

I didn't try it (I don't know if it correctly works), but I think it could be useful to add DTW to built-in distances in UMAP.

lmcinnes commented 4 years ago

I agree that DTW would be a useful metric to have. It is not trivial to implement; Kyle had a good solution that works around the issues with getting a pure numba implementation working. It is defintiely on my TODO list, but admittedly is currently not that high. If you feel like you could write a decent DTW routine that is numba compilable I would be very happy to accept a PR.

candalfigomoro commented 4 years ago

@lmcinnes Do you see any particular issue in kyle's implementation (apart from the comment about the dtw function naming)?

candalfigomoro commented 4 years ago

An "issue" I can see is that DTW could be used with multiple distances (not just euclidean distance), so it's like DTW should be a sort of wrapper for other distances.

lmcinnes commented 4 years ago

I believe this is something you can work around with some numba fanciness. In particular it is possible for distance functions to take (optional) parameters. Since numba supports passing numba jitted functions as parameters you could, in principle, pass in the underlying distance function as an optional parameter.

An added bonus may be looking at using numba stencil operators for some of the cumulative distance computations.

sleighsoft commented 4 years ago

https://github.com/lmcinnes/umap/blob/b17f0ab04e1b455e179ff14087a4a6f584fe1966/umap/layouts.py#L199

This is how you could dynamically wrap functions to be numba compatible.

msseibel commented 3 years ago

Here is a differentiable implementation of DTW soft-DTW. It is implemented in numba.

mblondel commented 3 years ago

Our code implements a differentiable divergence between time series, based on soft-DTW, as described in this paper. Our divergence is defined as D(X, Y) := SDTW(X, Y) - 0.5 * SDTW(X, X) - 0.5 * SDTW(Y, Y). Let me know if you have any question with using the code.