matrix-profile-foundation / matrixprofile

A Python 3 library making time series data mining tasks, utilizing matrix profile algorithms, accessible to everyone.
https://matrixprofile.org
Apache License 2.0

Motif discovery returns only one motif and the same neighbor multiple times #88

Open kantic opened 2 years ago

kantic commented 2 years ago

Describe the bug

When I tried to compute and discover motifs on my time series data, I noticed that the method mp.discover.motifs always returns only one motif and the same neighbor multiple times.

I also tried to compute the motifs on the predefined "ecg-heartbeat-av" dataset by following exactly the steps shown in the example from the documentation. Unfortunately, I hit the same problem, and my results differ from those in the documentation example.

Also, when I set the parameter k=3, I get the same motif three times. Furthermore, no matter which value I set for the parameter max_neighbors, I always get the same single neighbor repeated that many times.

To Reproduce

Follow the steps from the documentation example:

import matrixprofile as mp
ecg = mp.datasets.load('ecg-heartbeat-av')
ts = ecg['data']
profile = mp.compute(ts, windows=150)
profile = mp.discover.motifs(profile, k=1)
mp.visualize(profile)

The motif plot I get looks like this: [image: motifs_neighbors_1]

Note that I only get one neighbor, which is returned 10 times (the default value of max_neighbors). If I set k=3, for example, I get the same motif three times.

Expected behavior

The plot from the documentation example: [image: motifs_neighbors_2]

As the plot from the documentation shows, I would expect to get different neighbors. I would also expect the method to return different motifs when the parameter k is increased.

Screenshots See the plots above.

kantic commented 2 years ago

The problem is that the documentation of the function top_k_motifs states that the exclusion zone defaults to half of the window size. In the mp_top_k_motifs function, however, the value for exclusion_zone is read from the profile dict (under the key 'ez') generated by the mpx function, which sets this value to zero. As a result, no exclusion zone is applied while the top k motifs are computed, so the same location is selected again and again.
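
To illustrate why a zero exclusion zone produces repeated results, here is a minimal toy sketch. It is not the library's code: top_k_minima and the toy_profile values are hypothetical, and the sketch assumes that an exclusion zone of 0 means nothing is masked after a pick. It only shows the general idea of picking the k lowest points of a distance profile while blanking out a zone around each pick.

```python
import math

def top_k_minima(distances, k, exclusion_zone):
    """Pick up to k lowest entries, masking +/- exclusion_zone around each pick.

    Hypothetical helper for illustration only; assumes exclusion_zone == 0
    means that no masking is applied at all.
    """
    d = list(distances)
    picks = []
    for _ in range(k):
        # Index of the smallest remaining value (first one wins on ties).
        idx = min(range(len(d)), key=d.__getitem__)
        if math.isinf(d[idx]):
            break  # everything has been masked out
        picks.append(idx)
        if exclusion_zone > 0:
            lo = max(0, idx - exclusion_zone)
            hi = min(len(d), idx + exclusion_zone + 1)
            for i in range(lo, hi):
                d[i] = math.inf
    return picks

# Made-up profile with two valleys, around index 2 and index 9.
toy_profile = [5.0, 2.0, 1.0, 2.0, 5.0, 6.0, 6.0, 5.0, 3.0, 1.5, 3.0, 5.0]

picks_without_ez = top_k_minima(toy_profile, k=3, exclusion_zone=0)
picks_with_ez = top_k_minima(toy_profile, k=3, exclusion_zone=2)

print(picks_without_ez)  # [2, 2, 2]
print(picks_with_ez)     # [2, 9, 5]
```

With no exclusion zone the global minimum is returned every time, which matches the symptom reported here; with a non-zero zone each pick lands in a different region.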

When I manually set the exclusion zone to half of the window size, I get the expected results:

profile = mp.discover.motifs(profile, k=1, exclusion_zone=int(150/2))
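
The workaround above hard-codes the window size. A small helper could derive it from the profile instead. This is only a sketch: half_window_exclusion_zone is a hypothetical name, and it assumes the profile dict stores the window size under the key 'w', which is not confirmed in this thread.

```python
def half_window_exclusion_zone(profile):
    """Return half of the window size stored in a profile dict.

    Assumption: the window size lives under the key 'w'.
    """
    return int(profile["w"]) // 2

# Stand-in dict that mimics only the keys this sketch relies on.
fake_profile = {"w": 150, "ez": 0}
ez = half_window_exclusion_zone(fake_profile)
print(ez)  # 75
```

If the assumption about the 'w' key holds, the result could be passed along as mp.discover.motifs(profile, k=3, exclusion_zone=half_window_exclusion_zone(profile)).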

Eric-Simon-IA commented 2 years ago

I have the same issue here: matrixprofile version 1.1.10, Python 3.7.13 (Google Colab).

I asked for 3 motifs on my time series data (ts.csv):

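import matrixprofile as mp

# train_df is assumed to be a pandas DataFrame that already holds the series
ts = train_df['redo_writes_per_sec_'].values
window_size = 24
profile = mp.compute(ts, windows=window_size)
profile = mp.discover.motifs(profile, k=3)
figures = mp.visualize(profile)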

And I get the same motif three times! [two images]

The funny thing is that it works well if I pass a windows array with another value, even if the new value is not used:

profile = mp.compute(ts, windows=[24,168])

[two images]