wannesm / dtaidistance

Time series distances: Dynamic Time Warping (fast DTW implementation in C)

How to add (source to target) causality between two 1D time series? (without using numpy tril) #159

Open bartmch opened 2 years ago

bartmch commented 2 years ago

Is there a better (computationally cheaper) way to quantify a directed source → target relationship between two time series than computing the full warping-paths matrix and then setting its lower triangle to inf?

import numpy as np
import matplotlib.pyplot as plt
from dtaidistance import dtw, preprocessing
from dtaidistance import dtw_visualisation as dtwvis

# These preprocessing lines are not really important, but I am adding them in case of any recommendations
s1 = zscore(dummy['ets_flow'])
s2 = zscore(dummy['return_water_temp'])
s1_diff = preprocessing.differencing(s1.values[None, :], smooth=0.15).squeeze()
s2_diff = preprocessing.differencing(s2.values[None, :], smooth=0.15).squeeze()
# Calculate the full matrix
d, paths = dtw.warping_paths(s1_diff, s2_diff, window=30, use_pruning=True)
# Set the lower triangle of the matrix (diagonal included) to inf so that best_path will not consider it
lower = np.tril_indices(paths.shape[0], 0)
paths[lower] = np.inf
best_path = dtw.best_path(paths)
# Distance is inf?
dtwvis.plot_warpingpaths(s1, s2, paths, best_path)
plt.plot([0, len(paths)], [0, len(paths)], color='white')
image
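
One detail in the snippet above that may be worth checking: `np.tril_indices(n, 0)` selects the main diagonal as well as everything below it, so for a square matrix the corner cells are masked too. The toy example below uses an arbitrary stand-in matrix (not a real `paths` result) to show the difference between `k=0` and `k=-1`. Whether this fully explains the inf distance is an assumption; note also that masking after the fact does not recompute the cumulative costs, which may have been accumulated through the now-forbidden cells.

```python
import numpy as np

n = 5
# Toy stand-in for a cumulative cost matrix (values are arbitrary)
acc = np.arange(n * n, dtype=float).reshape(n, n)

# k=0 masks the lower triangle including the main diagonal
with_diag = acc.copy()
with_diag[np.tril_indices(n, 0)] = np.inf

# k=-1 masks only the cells strictly below the diagonal
below_only = acc.copy()
below_only[np.tril_indices(n, -1)] = np.inf

print(with_diag[-1, -1])   # corner cell, masked together with the diagonal
print(below_only[-1, -1])  # corner cell, left untouched
```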

Besides the computational overhead, I cannot seem to compute the best path when the values above the matrix diagonal are set to inf:

image
wannesm commented 2 years ago

There is no argument that does exactly that. It is easy to achieve, though, by using a window and prepending/appending values so that the window becomes one-sided (optionally with psi-relaxation to avoid the extra penalty from the padded values).

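import numpy as np
from dtaidistance import dtw
from dtaidistance import dtw_visualisation as dtwvis

# s1 can never map to earlier timestamps in s2 (s2 lags s1)
s1 = np.asarray(s1, dtype=float)
s2 = np.asarray(s2, dtype=float)
onesidedwindow = 20
window = onesidedwindow // 2
# Prepend to s1 and append to s2 so that the symmetric window becomes one-sided
s1b = np.concatenate((np.full((window,), s1[0]), s1))
s2b = np.concatenate((s2, np.full((window,), s2[-1])))
# psi relaxes the start of s1b and the end of s2b, where the padding sits
d, paths = dtw.warping_paths(s1b, s2b, window=window,
                             psi=(window, 0, 0, window))
best_path = dtw.best_path(paths)
dtwvis.plot_warpingpaths(s1b, s2b, paths, best_path)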

image
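
To make the one-sided constraint itself concrete, here is a minimal pure-NumPy sketch that applies it directly inside the dynamic program instead of through padding. It is not the dtaidistance implementation: the function name `one_sided_dtw` and the parameter `max_lag` are hypothetical, the squared-difference cost with a final square root is an assumption, and there is no psi-relaxation, so the path must start at the first pair and end at the last pair of samples.

```python
import numpy as np

def one_sided_dtw(source, target, max_lag):
    """DTW where source[i] may only align with target[j] for i <= j <= i + max_lag.

    Hypothetical helper for illustration; not part of dtaidistance.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    n, m = len(source), len(target)
    # acc[i + 1, j + 1] holds the cheapest cumulative cost of a path ending at (i, j)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(n):
        # Visit only the allowed band; every other cell stays at inf
        for j in range(i, min(m, i + max_lag + 1)):
            cost = (source[i] - target[j]) ** 2
            acc[i + 1, j + 1] = cost + min(acc[i, j], acc[i, j + 1], acc[i + 1, j])
    if not np.isfinite(acc[n, m]):
        return np.inf, []  # no path satisfies the constraint
    # Backtrack from the last cell to the first along the cheapest predecessors
    i, j = n, m
    path = []
    while (i, j) != (0, 0):
        path.append((i - 1, j - 1))
        steps = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(steps, key=lambda s: acc[s])
    path.reverse()
    return float(np.sqrt(acc[n, m])), path
```

Because the forbidden cells are never visited, the constraint costs no extra work and the returned path is optimal under it. Calling the function with the two series swapped tests the opposite direction, so comparing the two distances gives a rough indication of which series leads.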