edouardoyallon / scatwave

ScatWave is a Torch implementation of scattering using CUDA
http://www.di.ens.fr/~oyallon/
21 stars 5 forks source link

Normalisation after periodisation? #8

Closed edouardoyallon closed 8 years ago

edouardoyallon commented 9 years ago

Hi,

https://github.com/scatnet/scatnet/blob/master/convolution/conv_sub_1d.m > after periodization is normalized as 2^ds https://github.com/scatnet/scatnet/blob/master/convolution/conv_sub_1d.m > here 2^(ds/2)

To me, this does not make sens, since it does not preserve the downsampling. Indeed, if y(t)=x(2t), then \hat y(\omega)=\frac{\hat x(\omega+\pi)+\hat x(\omega)}{2}. Where am I incorrect? I validated numerically that there should be no square root!(or am I wrong? cf unit_test.la)

Thanks!

edouardoyallon commented 9 years ago

Just to be sure I'm consistant, if you take a 1D signal, like {1 ... 1}\in \mathbb R ^n, it's fourier transform is {n 0... 0}, and if you periodize it by 2^ds, you will get {n 0 ...0}\in \mathbb R^(n/2^ds). Its ifft is then {2^ds ... 2^ds}\in \mathbb R^(n/2^ds), isn't it? Thus the normalisation must be 1/2^ds! Where am I incorrect? Thanks!

lostanlen commented 9 years ago

You are not incorrect. Since

  1. our wavelets are normalized in l1 norm, i.e. they have equal peaks in the Fourier domain,
  2. the FFT does not have a normalization factor ; yet the IFFT divides the result of the RFFT (reverse, i.e. conjugate, Fourier transform) by the number of coefficients, it is necessary to divide by the downsampling factor. No square root.

Naming: I advocate log2_ds instead of ds for the variable you mention. The invariant here is that log2_ds is an integer. "ds" is confusing since the actual downsampling factor is your 2^ds. If I had a dollar for every user of ScatNet who stumbled upon this...

Performance: It would be faster to call rfft (not exposed in MATLAB, but part of the FFTW internal routines) to have one pointwise multiplication instead of two. So instead of

  1. calling IFFT (which calls RFFT and divides by N)
  2. multiplying by 2^(log2_ds) we
  3. call RFFT inplace
  4. multiply by 2^(log2_ds - log2_N). This multiplication corresponds to a bitwise right shift operator of (log2_N - log2_ds) bits, which is blazing fast. Call bit32.rshift(x, log2_N - log2_ds) in Lua >5.2 for the dimension 1.

Then it might not be inplace, so we'll have to monitor memory allocation. You are more knowledgeable in Lua than me to address this last performance issue.

edouardoyallon commented 9 years ago

Thanks. I do not get RFFT, is it removing the phase on its own? Do you have some documentation? By the way, it seems it is faster to use only one 2D FFT instead of two 1D FFT, when initialized with the correct plans; in addition, cuFFT does not handle more than 3D transform on a batch of 3D tensor. (whereas FFTW does) Consequently, I'm going to build one function per type of transform!

lostanlen commented 9 years ago

I have to correct myself. RFFT stands for "real fft". I was talking about BFFT, i.e. "backward FFT".

julia> help(bfft)
Base.bfft(A[, dims])

   Similar to "ifft()", but computes an unnormalized inverse
   (backward) transform, which must be divided by the product of the
   sizes of the transformed dimensions in order to obtain the inverse.
   (This is slightly more efficient than "ifft()" because it omits a
   scaling step, which in some applications can be combined with other
   computational steps elsewhere.)

cuFFT's analogue to FFTW's bfft is

cufftExecC2C(plan, data, data, CUFFT_INVERSE)
edouardoyallon commented 8 years ago

Here again: https://github.com/scatnet/scatnet/blob/master/filters/periodize_filter.m I took this file in scatnetlight also. This is supposed to periodize the filters, which means, if I'm correct, that this slice of code downsample them. How comes there is again no normalization(the mask is set up with 1 values..)? I guess in audio you do not use this part of the code, because here, it clearly affects the results when combining signals at different samplings...