The job of Part 1, Project 7 is to write code that calculates distances from vantage points, which you can then use for similarity search, and to store these distances in an unbalanced binary search tree database (along the lines of what we did in lab).
Assume that your time series has periodic boundary conditions with equal spacing on [0,1]. We have provided two example time series:
169975.dat_folded, 51886.dat_folded
Interpolate these from 0.01 to 0.99 with 1024 points to set up a regular sampling.
For this part you can use anything that follows the sized-time-series interface.
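The interpolation step above can be sketched as follows. This is only a minimal illustration with NumPy, not a required implementation: the function name `regularize` and the synthetic data are hypothetical, and the code assumes the folded times already lie in [0,1]. The layout of the provided `.dat_folded` files is not described here, so the example uses made-up samples instead of reading them.

```python
import numpy as np

def regularize(times, values, n_points=1024, lo=0.01, hi=0.99):
    """Resample an irregularly sampled, folded series onto a regular grid.

    Hypothetical helper: assumes `times` are phases in [0, 1].
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    grid = np.linspace(lo, hi, n_points)
    # period=1.0 makes np.interp wrap around the unit interval, which matches
    # the periodic boundary conditions; it also sorts the sample points.
    resampled = np.interp(grid, times, values, period=1.0)
    return grid, resampled

# Synthetic stand-in for a folded light curve (not the provided data files).
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 1.0, size=300)
v = np.sin(2 * np.pi * t)
grid, resampled = regularize(t, v)
```

The resulting pair of arrays can then be wrapped in whatever class you use that follows the sized-time-series interface.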
Standardize the time series (subtract the mean and divide by the standard deviation).
Calculate the cross-correlation of the two standardized time series.
Compute the kernelized cross-correlation as discussed, so that we can get a real distance. The equation for the kernelized cross-correlation is given at http://www.cs.tufts.edu/~roni/PUB/ecml09-tskernels.pdf . Normalize the kernel there by $\sqrt{K(x,x)K(y,y)}$ so that the kernelized correlation of a time series with itself is 1.
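The three steps above (standardize, cross-correlate, kernelize) might be sketched as below. Several things here are assumptions rather than part of the assignment: the function names are hypothetical, the cross-correlation is taken to be circular (consistent with the periodic boundary conditions) and is computed with an FFT, and the kernel is assumed to have the form $K(x,y) = \sum_s \exp(\gamma \, c_{xy}(s))$, a sum over all shifts $s$ of the exponentiated cross-correlation. Check that form and the choice of $\gamma$ against the linked paper before relying on it. The distance uses the standard kernel-induced metric, which for a normalized kernel reduces to $\sqrt{2(1 - K(x,y))}$.

```python
import numpy as np

def standardize(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()

def ccor(x, y):
    """Circular cross-correlation of two equal-length standardized series.

    Returns one value per shift; dividing by n makes the zero-shift
    autocorrelation of a standardized series equal to 1.
    """
    n = len(x)
    spectrum = np.conj(np.fft.fft(x)) * np.fft.fft(y)
    return np.real(np.fft.ifft(spectrum)) / n

def kernel_corr(x, y, gamma=1.0):
    """Normalized kernelized cross-correlation (assumed kernel form)."""
    def raw(a, b):
        # Assumed kernel: sum over all shifts of exp(gamma * cross-correlation).
        return np.sum(np.exp(gamma * ccor(a, b)))
    return raw(x, y) / np.sqrt(raw(x, x) * raw(y, y))

def kernel_dist(x, y, gamma=1.0):
    """Distance induced by the normalized kernel: sqrt(2 * (1 - K(x, y)))."""
    # max(...) guards against tiny negative values from floating-point rounding.
    return np.sqrt(max(0.0, 2.0 * (1.0 - kernel_corr(x, y, gamma))))

# Two synthetic series on the regular grid described earlier.
grid = np.linspace(0.01, 0.99, 1024)
a = standardize(np.sin(2 * np.pi * grid))
b = standardize(np.cos(4 * np.pi * grid))
d_ab = kernel_dist(a, b)
```

Because the kernel sums over every shift, a series and a circularly shifted copy of itself have kernelized correlation 1, and therefore distance 0. Distances like `d_ab`, measured from each vantage point, are what you would then store in the binary search tree.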