d-chambers / Detex

A Python package for subspace detection and waveform similarity clustering

Failed alignment due to similar end effects #24

Closed d-chambers closed 8 years ago

d-chambers commented 8 years ago

The current function in the construct module for calculating waveform correlation coefficients (_CCX2) works by taking one of the two waveforms (each of length n) and zero-padding it with n elements at the beginning and n elements at the end. Conceptually, the other waveform is then slid over the zero-padded waveform and the CC is calculated at every time step. This can allow similar parts of the end of one waveform and the beginning of another (such as filter effects) to produce the highest correlation coefficient in the CC trace, even though only a few samples are actually similar, as is the case in the plots shown in issue 19. When this happens, it breaks the alignment algorithm in the createSubSpace call.

To remedy this, the waveform to be padded will receive only n/2 zero elements at the beginning and n/2 zero elements at the end.

Edit: The padding had to remain n on both sides or else the normalization gets screwed up, but the correlation coefficient vector is now sliced to the n/2 bounds before its max is determined, which essentially does the same thing.
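
For anyone following along, here is a minimal sketch of the idea. It is not the actual _CCX2 code. The names `sliding_cc` and `best_lag` are made up for illustration, and the normalization (energy of the padded trace inside the sliding window times the energy of the other waveform) is an assumption about how the CC is normalized. The synthetic traces are only similar where the end of one matches the start of the other.

```python
import numpy as np

def sliding_cc(padded, template):
    """Normalized CC of `template` at every position along `padded` (hypothetical helper)."""
    n = len(template)
    # Dot product of the template with each length-n window of the padded trace.
    num = np.correlate(padded, template, mode="valid")
    # Energy of the padded trace inside each window, via a cumulative sum.
    csum = np.concatenate(([0.0], np.cumsum(padded ** 2)))
    win_energy = np.maximum(csum[n:] - csum[:-n], 0.0)
    denom = np.sqrt(win_energy * np.sum(template ** 2))
    cc = np.zeros_like(num)
    ok = denom > 1e-10 * denom.max()  # skip windows that contain only padding
    cc[ok] = num[ok] / denom[ok]
    return cc

def best_lag(src, template, restrict=True):
    """Return (max CC, lag); lag > 0 means the template starts later than src."""
    n = len(src)
    # Pad with n zeros on both sides so the normalization is unchanged.
    padded = np.concatenate((np.zeros(n), src, np.zeros(n)))
    cc = sliding_cc(padded, template)
    lags = np.arange(-n, n + 1)
    if restrict:
        # Only consider lags within n/2 of full overlap before taking the max.
        keep = np.abs(lags) <= n // 2
        cc, lags = cc[keep], lags[keep]
    i = int(np.argmax(cc))
    return float(cc[i]), int(lags[i])

# Synthetic example: low-level noise everywhere, plus a shared "end effect"
# at the end of `a` and at the beginning of `b`.
rng = np.random.default_rng(0)
n, m = 200, 20
p = np.sin(np.linspace(0, 3 * np.pi, m))
a = 0.01 * rng.standard_normal(n)
b = 0.01 * rng.standard_normal(n)
a[-m:] += p
b[:m] += p

cc_all, lag_all = best_lag(a, b, restrict=False)
cc_lim, lag_lim = best_lag(a, b, restrict=True)
print("unrestricted:", cc_all, lag_all)
print("restricted:  ", cc_lim, lag_lim)
```

Without the restriction, the max lands at the lag where only the m end samples overlap. With the restriction, that lag is excluded and the max CC stays small, which is the behavior the slicing is meant to give.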

d-chambers commented 8 years ago

Here is another example, where the top plot is the end of one waveform and the bottom is the beginning of another. The rest of the waveforms were not similar, but a CC of 0.6 was returned when only these two sections overlapped. When the remedy mentioned above was implemented (it will be included in version 1.0.6), the max CC value was less than 0.1 (as expected).

image

image