slac207 / cs207project

MIT License

P7-Write code to calculate distances from two points #57

Closed cocochrane closed 7 years ago

cocochrane commented 7 years ago

The job of Part 1 of project 7 is to write code that calculates distances from vantage points, which you can then use to do similarity search, and to store these distances in a (non-balanced) Binary Search Tree database (along the lines of what we did in lab).
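As a rough illustration of the storage side, here is a minimal in-memory sketch of an unbalanced BST keyed by distance to one vantage point. It is not the lab's database; `DistanceTree`, `insert` and `in_range` are hypothetical names chosen for this example.

```python
class Node:
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.left = self.right = None


class DistanceTree:
    """Unbalanced BST: key = distance to a vantage point, value = series id."""

    def __init__(self):
        self.root = None

    def insert(self, key, value):
        if self.root is None:
            self.root = Node(key, value)
            return
        node = self.root
        while True:
            # Smaller keys go left; equal or larger keys go right.
            side = "left" if key < node.key else "right"
            child = getattr(node, side)
            if child is None:
                setattr(node, side, Node(key, value))
                return
            node = child

    def in_range(self, lo, hi):
        """Return the values whose keys lie in [lo, hi], in key order."""
        out = []

        def walk(node):
            if node is None:
                return
            if node.key > lo:        # left subtree may still hold keys >= lo
                walk(node.left)
            if lo <= node.key <= hi:
                out.append(node.value)
            if node.key <= hi:       # right subtree may still hold keys <= hi
                walk(node.right)

        walk(self.root)
        return out
```

For similarity search, the triangle inequality means that any series within radius r of a query q must have a vantage-point distance in [d(q, v) - r, d(q, v) + r], so a call like `tree.in_range(d_qv - r, d_qv + r)` narrows the candidates before any exact comparison.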

Assume that your time series has periodic boundary conditions with equal spacing on [0,1]. We have provided two example time series: 169975.dat_folded and 51886.dat_folded.

Interpolate these from 0.01 to 0.99 with 1024 points to set up a regular sampling.
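A minimal sketch of that resampling step, assuming the times and values have already been loaded into arrays (the column layout of the `.dat_folded` files is not specified here, so loading is left out). `regularize` is a hypothetical helper name, and linear interpolation via `numpy.interp` is one possible choice, not a requirement of the assignment.

```python
import numpy as np


def regularize(times, values, n_points=1024, lo=0.01, hi=0.99):
    """Resample a series onto n_points equally spaced times in [lo, hi]."""
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    order = np.argsort(times)          # np.interp needs increasing x values
    grid = np.linspace(lo, hi, n_points)
    return grid, np.interp(grid, times[order], values[order])
```

Calling `grid, vals = regularize(t, v)` returns the 1024-point grid together with the interpolated values on it.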

For this part you can use anything that follows the sized-time-series interface.

  1. Standardize the time series (subtract the mean and divide by the standard deviation)
  2. Calculate the cross-correlation
  3. Compute the kernelized cross-correlation, as discussed, so that we can get a real distance. The equation for the kernelized cross-correlation is given at http://www.cs.tufts.edu/~roni/PUB/ecml09-tskernels.pdf . Normalize the kernel given there by $\sqrt{K(x,x)K(y,y)}$ so that the correlation of a time series with itself is 1.
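The three steps above can be sketched as follows. This is a hedged sketch, not a reference solution: it assumes the kernel is a sum over all circular shifts of exp(gamma * cross-correlation), which is my reading of the linked paper, so check the exact form there. The function names, the division of the cross-correlation by the series length (to keep the exponent bounded), the default `gamma=1.0`, and the final distance formula sqrt(2(1 - K)) are all assumptions made for illustration.

```python
import numpy as np


def standardize(x):
    """Step 1: subtract the mean and divide by the standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()


def ccor(x, y):
    """Step 2: circular cross-correlation at every shift, via FFT.

    Entry k equals sum_n x[(n + k) % N] * y[n] / N (the 1/N is an assumption).
    """
    n = len(x)
    return np.fft.ifft(np.fft.fft(x) * np.conj(np.fft.fft(y))).real / n


def kernel(x, y, gamma=1.0):
    """Unnormalized kernel: sum over shifts of exp(gamma * ccor) (assumed form)."""
    return np.sum(np.exp(gamma * ccor(x, y)))


def kernel_corr(x, y, gamma=1.0):
    """Step 3: kernel normalized by sqrt(K(x,x) K(y,y)), so K(x,x) becomes 1."""
    return kernel(x, y, gamma) / np.sqrt(kernel(x, x, gamma) * kernel(y, y, gamma))


def kernel_dist(x, y, gamma=1.0):
    """Assumed distance induced by the normalized kernel."""
    return np.sqrt(max(0.0, 2.0 * (1.0 - kernel_corr(x, y, gamma))))
```

Inputs are expected to be standardized first, for example `kernel_dist(standardize(a), standardize(b))`. Because the kernel sums over every circular shift, a series and a shifted copy of itself come out at distance (numerically) zero, which fits the periodic boundary conditions assumed above.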

cocochrane commented 7 years ago

Finished