TDAmeritrade / stumpy

STUMPY is a powerful and scalable Python library for modern time series analysis
https://stumpy.readthedocs.io/en/latest/

Add DTW Motif Discovery - SWAMP #466

Open seanlaw opened 3 years ago

seanlaw commented 3 years ago

The SWAMP paper can be found here

The supplementary webpage is here

The code is here

The data is here

Longer form dissertation is here

Additional details on implementing an efficient DTW algorithm can be found here (start at Slide 68)

NimaSarajpoor commented 2 years ago

@seanlaw I think the tslearn package might help those who want to work on SWAMP. For instance, it provides an implementation of LB_Keogh.

That might give us some ideas that could be useful for a Pythonic implementation of SWAMP.
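
For example, here is a minimal sketch (the subsequence length, radius, and best-so-far value are just placeholders, not anything prescribed by the SWAMP paper) of how tslearn's lb_keogh could be used to skip exact DTW calls:

```python
import numpy as np
from tslearn.metrics import lb_keogh, dtw

# Toy subsequences purely for illustration
np.random.seed(0)
query = np.random.rand(64)
candidate = np.random.rand(64)

best_so_far = 2.5  # assumed current best-match distance (placeholder)

# LB_Keogh is a cheap lower bound on the band-constrained DTW distance:
# if it already exceeds the best-so-far distance, the exact DTW
# computation for this candidate can be skipped entirely.
lb = lb_keogh(ts_query=query, ts_candidate=candidate, radius=5)
if lb < best_so_far:
    d = dtw(query, candidate,
            global_constraint="sakoe_chiba", sakoe_chiba_radius=5)
    print(f"LB_Keogh = {lb:.3f}, exact DTW = {d:.3f}")
else:
    print(f"Pruned: LB_Keogh = {lb:.3f} >= best-so-far = {best_so_far}")
```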

ken-maeda commented 1 year ago

I reproduced dtwMP with the DTW calculation (from the paper) and the stump (top-k) MP.

SWAMP itself is a combination of DTW and a lower bound for reducing calculation cost. Could you share your ideas on how SWAMP should be implemented in STUMPY?

To do (SWAMP original):

To do (improvement):

Basically, the DTW calculation speed should be the key factor, so I would focus on improving that speed.

[Paper result] (image attached)

[Python implementation] (image attached)
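
For reference, here is a rough brute-force sketch (not SWAMP itself, and not the exact code behind the figures above) of a DTW-based "matrix profile" that uses LB_Keogh from tslearn to prune candidates; the window length, band radius, and exclusion zone are arbitrary placeholders:

```python
import numpy as np
from tslearn.metrics import dtw, lb_keogh

def naive_dtw_matrix_profile(T, m, radius=5):
    """Brute-force DTW 'matrix profile': for each subsequence, find the
    DTW distance to its nearest non-trivial neighbor, using LB_Keogh to
    skip candidates that cannot beat the best-so-far distance."""
    n = len(T) - m + 1
    excl = m // 2  # simple exclusion zone to avoid trivial matches
    P = np.full(n, np.inf)
    for i in range(n):
        Q = T[i:i + m]
        for j in range(n):
            if abs(i - j) <= excl:
                continue
            C = T[j:j + m]
            if lb_keogh(ts_query=Q, ts_candidate=C, radius=radius) >= P[i]:
                continue  # lower bound already too large; skip exact DTW
            d = dtw(Q, C, global_constraint="sakoe_chiba",
                    sakoe_chiba_radius=radius)
            if d < P[i]:
                P[i] = d
    return P

# Example usage on a short random walk
T = np.cumsum(np.random.randn(200))
P = naive_dtw_matrix_profile(T, m=30)
```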

seanlaw commented 1 year ago

@ken-maeda Firstly, I have no experience with DTW and do not know if it is even a good idea to add SWAMP. Having said that, it is important that we do not add the DTW calculation itself to STUMPY, as that is likely beyond our scope (i.e., STUMPY cannot focus on making DTW fast since we are only focused on the MP). I haven't read the SWAMP paper, but maybe you can explain (in a Jupyter notebook) how DTW integrates with MP? For SWAMP, is it possible to do DTW separately from MP, or do they have to be computed together at runtime?

ken-maeda commented 1 year ago

is it possible to do DTW separately from MP, or do they have to be computed together at runtime?

Yes. The DTW calculation is simple and short, but we can import DTW from another library to replace the Pearson calculation, for example from tslearn.

DTW in tslearn is coded with njit. I also wrote the DTW from the paper in an njit way, but it took about 3 times longer than stump (njit). If we need to reduce the calculation time, we would have to code it ourselves for optimized calculation (Dask, CUDA, ...).
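
For illustration, here is a rough sketch of what an njit-compiled, band-constrained DTW could look like (an assumed implementation, not the exact code that was benchmarked):

```python
import numpy as np
from numba import njit

@njit(fastmath=True)
def dtw_band(Q, C, r):
    """DTW distance with a Sakoe-Chiba band of half-width r,
    using squared point-wise costs and a final square root."""
    m, n = Q.shape[0], C.shape[0]
    D = np.full((m + 1, n + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, m + 1):
        lo = max(1, i - r)
        hi = min(n, i + r)
        for j in range(lo, hi + 1):
            cost = (Q[i - 1] - C[j - 1]) ** 2
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return np.sqrt(D[m, n])
```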

maybe you can explain (in a Jupyter notebook) how DTW integrates with MP?

Yes, I can share a notebook, though it is rough because it was only written to check the algorithm. How should I submit the notebook?

seanlaw commented 1 year ago

The DTW calculation is simple and short, but we can import DTW from another library to replace the Pearson calculation, for example from tslearn.

Great. Let's just import DTW from tslearn, put everything in a Jupyter notebook in a tutorial style as you've done before, and explain how everything works with annotations. You can still submit the notebook as a PR so that we can learn from it, and then we can close the PR afterwards without merging. What do you think @ken-maeda?

seanlaw commented 1 year ago

There is also this new dtwParallel package that seems to perform faster than tslearn. I cannot speak to its accuracy though.