PyWavelets / pywt

PyWavelets - Wavelet Transforms in Python
http://pywavelets.readthedocs.org
MIT License

exact meaning of 'periodization' extension mode #329

Closed macelee closed 6 years ago

macelee commented 6 years ago

From the documentation I understand that the 'periodization' extension mode "gives the smallest possible number of decomposition coefficients". Can anyone further explain this please? In my experiment I get some interesting results:

import pywt
x = [1, 3, 5, 7, 6, 4, 5, 2]

(a, d) = pywt.dwt(x, 'db5', 'periodic')
a
# array([ 5.35218198, 8.6702098 , 7.21732633, 2.09480567,
#         5.35218198, 8.6702098 , 7.21732633, 2.09480567])

(a, d) = pywt.dwt(x, 'db5', 'periodization')
a
# array([ 7.21732633, 2.09480567, 5.35218198, 8.6702098 ])

(a, d) = pywt.dwt(x, 'db6', 'periodic')
a
# array([ 1.74398886, 6.21015721, 8.44740984, 6.93296787,
#         1.74398886, 6.21015721, 8.44740984, 6.93296787, 1.74398886])

(a, d) = pywt.dwt(x, 'db6', 'periodization')
a
# array([ 7.97311013, 4.14173887, 2.65886058, 8.5608142 ])

As can be seen from my test above, when using the 'db5' filters, the 'periodic' mode gives some redundant coefficients. The duplicated values are removed from the output of 'periodization' mode, which looks perfect (although the order of the values is not the same). However, when I try other filters such as 'db6', the coefficients given by 'periodization' mode are completely different values. Are these numbers some sort of linear combination of the numbers given by 'periodic' mode? Can anyone explain this behaviour please? Thanks!

grlee77 commented 6 years ago

There are two potential sources of difference, both of which are subtle implementation details. The first is that periodization pads odd-length inputs up to the nearest even-length signal by replicating the last value. For example, the following two transforms give the same coefficients:

import pywt
pywt.dwt([1, 2, 3], wavelet='db1', mode='periodization')
# (array([ 2.12132034,  4.24264069]), array([-0.70710678,  0.        ]))
pywt.dwt([1, 2, 3, 3], wavelet='db1', mode='periodization')
# (array([ 2.12132034,  4.24264069]), array([-0.70710678,  0.        ]))

So, for most odd-length signals periodization and periodic should not be expected to give the same coefficients!
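
The padding step can be reproduced by hand for the Haar (db1) case. The following is a minimal pure-Python sketch, not the PyWavelets implementation; the function name `haar_dwt_periodization` is made up for illustration, and it only covers db1, where each coefficient depends on one non-overlapping pair of samples.

```python
import math

def haar_dwt_periodization(signal):
    """Single-level Haar (db1) DWT sketch.

    Odd-length input is padded to even length by repeating the last
    sample, mirroring the behaviour described above (illustrative only).
    """
    x = list(signal)
    if len(x) % 2 == 1:
        x.append(x[-1])  # replicate the last value to reach an even length
    s = 1 / math.sqrt(2)
    # Each output coefficient is built from one pair (x[i], x[i + 1]).
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x), 2)]
    return approx, detail

print(haar_dwt_periodization([1, 2, 3]))
print(haar_dwt_periodization([1, 2, 3, 3]))
```

Both calls print the same pair of lists, because the odd-length input is turned into `[1, 2, 3, 3]` before the transform.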

In your case, your input signal is even-length and the difference comes from a different source. For even-length signals, you are seeing a difference that depends on the length of the digital filters. Recall that the DWT is equivalent to a downsampled convolution. Effectively, the C code that does the convolutions skips every second input sample. For all modes except periodization, the convention in PyWavelets (and MATLAB) is to convolve only the samples of the input occurring at odd indices [1, 3, 5, ...]. However, for periodization, the convolution has a starting index that may be either odd or even depending on the filter length.
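
To see why the starting index matters, here is a rough pure-Python sketch of a downsampled circular convolution. It is not the PyWavelets C code: the filter `h` is a made-up integer filter (so results compare exactly), and `circular_convolve` and `roll` are hypothetical helpers. It shows that the "even" sampling phase of the original signal equals the "odd" sampling phase of the signal circularly shifted by one sample.

```python
def circular_convolve(x, h):
    """Full circular (periodic) convolution of x with filter h."""
    n = len(x)
    return [sum(h[j] * x[(i - j) % n] for j in range(len(h))) for i in range(n)]

def roll(x, shift):
    """Circularly shift a list to the right, like numpy.roll for 1-D input."""
    shift %= len(x)
    return x[-shift:] + x[:-shift]

x = [1, 3, 5, 7, 6, 4, 5, 2]
h = [1, 2, 3, 4]  # hypothetical filter, chosen only for illustration

full = circular_convolve(x, h)
odd_phase = full[1::2]    # keep outputs at indices 1, 3, 5, ...
even_phase = full[0::2]   # keep outputs at indices 0, 2, 4, ...

# Shifting the input by one sample swaps which phase the odd indices pick up.
shifted_odd_phase = circular_convolve(roll(x, 1), h)[1::2]

print(odd_phase)
print(even_phase)
print(shifted_odd_phase == even_phase)
```

The two phases are generally different sets of numbers, which is why a change in the parity of the starting index produces coefficients that do not appear in the other mode's output.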

For periodization, the C code determines a starting index for the convolution based on 1/2 the discrete filter length. i.e.:

db5 = pywt.Wavelet('db5')
starting_index_for_convolution = db5.dec_len // 2

For db5, starting_index_for_convolution is 5, which is an odd sample, so you see a result matching a subset of the coefficients from periodic. For db6, starting_index_for_convolution is 6, which is even, so you do not get exactly the same coefficients.
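
As a quick illustration of the rule above, the parity of the starting index can be tabulated for the Daubechies family. This sketch assumes only that a dbN wavelet has a decomposition filter of length 2 * N and that the starting index is `dec_len // 2` as described; the helper name `start_parity` is made up.

```python
def start_parity(order):
    """Parity of the periodization starting index for dbN (N = order).

    Assumes dec_len == 2 * order for Daubechies wavelets and the
    dec_len // 2 starting-index rule described in this thread.
    """
    dec_len = 2 * order
    start = dec_len // 2
    return "odd" if start % 2 else "even"

for order in range(1, 9):
    print(f"db{order}: dec_len={2 * order}, start={order} ({start_parity(order)})")
```

Under these assumptions, odd-order wavelets (db1, db3, db5, ...) start on an odd sample and even-order ones (db2, db4, db6, ...) start on an even sample.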

If you circularly shift the input signal in your example by 1 for the db6 case, you can see the match you were expecting again:

import pywt
import numpy as np
x=[1,3,5,7,6,4,5,2]

(a, d) = pywt.dwt(x,'db6','periodic')
print(a)
# [ 1.74398886  6.21015721  8.44740984  6.93296787  1.74398886  6.21015721
#   8.44740984  6.93296787  1.74398886]

(a, d) = pywt.dwt(np.roll(x, 1),'db6','periodization')
print(a)
# [ 8.44740984  6.93296787  1.74398886  6.21015721]
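
The "same values, different order" relationship can be checked mechanically. The sketch below uses only the coefficient values quoted in this thread (one period of each 'periodic' result) and a hypothetical helper `is_rotation`; it does not call PyWavelets.

```python
def is_rotation(a, b, tol=1e-6):
    """Return True if b equals a circularly shifted by some offset (within tol)."""
    if len(a) != len(b):
        return False
    n = len(a)
    return any(
        all(abs(a[(i + k) % n] - b[i]) <= tol for i in range(n))
        for k in range(n)
    )

# One period of the 'periodic' approximation coefficients quoted above.
periodic_db5 = [5.35218198, 8.6702098, 7.21732633, 2.09480567]
periodic_db6 = [1.74398886, 6.21015721, 8.44740984, 6.93296787]

# 'periodization' approximation coefficients quoted above.
periodization_db5 = [7.21732633, 2.09480567, 5.35218198, 8.6702098]
periodization_db6 = [7.97311013, 4.14173887, 2.65886058, 8.5608142]
periodization_db6_rolled_input = [8.44740984, 6.93296787, 1.74398886, 6.21015721]

print(is_rotation(periodic_db5, periodization_db5))               # True
print(is_rotation(periodic_db6, periodization_db6))               # False
print(is_rotation(periodic_db6, periodization_db6_rolled_input))  # True
```

So for db5 the two modes agree up to a circular shift, while for db6 they agree only after the input has been rolled by one sample.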