wannesm / dtaidistance

Time series distances: Dynamic Time Warping (fast DTW implementation in C)

preprocessing.differencing fails #169

Open Ne-oL opened 2 years ago

Ne-oL commented 2 years ago

I have been trying to apply preprocessing.differencing to my z-score-normalized series, but it keeps failing and I have no idea what the cause is. Here is a minimal example that fails:

from dtaidistance import preprocessing
test = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
test2 = preprocessing.differencing(test, smooth=0.1)

Running it gives me this error:

AxisError: axis 1 is out of bounds for array of dimension 1

If I change the shape of the series to (10,1), I get this error instead:

ValueError: The length of the input vector x must be greater than padlen, which is 9.

Ne-oL commented 2 years ago

I found a similar [issue](https://github.com/wannesm/dtaidistance/issues/154#issue-1121655512), closed in February, asking about the same problem. The solution provided by @wannesm is to put the array within an array, so in my experience this now works: test=[[1,2,3,4,5,6,7,8,9,10]]
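For anyone hitting the same error, here is a minimal NumPy-only sketch of why the nested form helps. It assumes (based only on the AxisError message, not on the library source) that the differencing is taken along axis 1; np.diff stands in for the library call here and is not dtaidistance's actual implementation.

```python
import numpy as np

series = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=float)

# Differencing along axis 1 needs a second dimension. On a 1D array NumPy
# raises AxisError, which is a subclass of ValueError.
try:
    np.diff(series, axis=1)
    failed_on_1d = False
except ValueError:
    failed_on_1d = True

# Workaround from the linked issue: wrap the series so it becomes a batch
# containing one series, i.e. shape (1, 10) instead of (10,).
batch = np.atleast_2d(series)

# Stand-in for the library call (assumption): difference along axis 1.
diffed = np.diff(batch, axis=1)

print(failed_on_1d, batch.shape, diffed.shape)
```

The resulting (1, 10) array has the same layout as the nested list test=[[1,2,3,4,5,6,7,8,9,10]], so it should be usable in the same way as that workaround.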

I would like to keep this issue open until this is fixed, as the method should accept a 1D array. Also, if possible, please correct the example in the documentation that discusses this, as it uses a = np.array([0.1, 0.3, 0.2, 0.1]), which doesn't work.

Another note: differencing returns an array of length n-1, which I wasn't able to assign back into the original array because of the difference in shape. Since I'm using it for clustering, the last element wouldn't make a significant difference (at least in my use case), so my suggestion is based on the workaround I used to make it work. Would it be possible to pass an argument, for example keep_shape=True, that fills the missing last position with the mean of the computed array before returning it?
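A rough sketch of the padding idea, to make the suggestion concrete. The helper name diff_keep_shape is hypothetical and not part of dtaidistance; it uses plain np.diff with no smoothing, so it only illustrates the shape handling, not the library's differencing.

```python
import numpy as np

def diff_keep_shape(series):
    """Hypothetical helper: first-order difference padded back to the input length."""
    series = np.asarray(series, dtype=float)
    # Plain differencing along the last axis; the result has length n-1.
    diffed = np.diff(series, axis=-1)
    # Fill the missing last position with the mean of the differenced values.
    fill = diffed.mean(axis=-1, keepdims=True)
    return np.concatenate([diffed, fill], axis=-1)

test = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
padded = diff_keep_shape(test)
print(padded.shape)
```

Because the padded result has the same shape as the input, it can be assigned back into the original array. The same function also works on a 2D array with one series per row, since everything is done along the last axis.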