processing / processing-sound

Audio library for Processing built with JSyn
https://processing.org/reference/libraries/sound/
GNU Lesser General Public License v2.1
149 stars 50 forks source link

Make FFT.analyzeSample() correctly analyze samples which aren't a power of 2 #104

Open tristanbay opened 5 months ago

tristanbay commented 5 months ago

I dug into the Processing source code to figure out why this is happening, and it apparently thinks that powers of 2 aren't powers of 2 when passing into the assert line in the numBits function of the FourierMath class in jsyn's softsynth code, even though it gets past the check in this function.

Here's the code I'm trying to run (just copy and paste it into the IDE, version 4.3). Even adapting and printing out the PowerOf2 & (PowerOf2 - 1) expression in my own code gives a 0 on the console:

import processing.sound.*;
SoundFile drums;
int bandcount = 512, vrate = 60, voverlap = 2, ch = 2;
int vframelen, videoframes;
float[] drums_sample, drums_l, drums_r, vtemp_l, vtemp_r;
float[][] drums_bands_l, drums_bands_r;

void setup()
{
  size(1920, 1080, P2D);
  drums = new SoundFile(this, "drums.wav");
  drums_sample = new float[drums.frames() * ch];
  vframelen = 44100 / vrate; // number of audio frames in video frame per channel
  drums_l = new float[(ceil((float)drums.frames() / vrate) * vrate) + (vframelen * voverlap)];
  drums_r = new float[(ceil((float)drums.frames() / vrate) * vrate) + (vframelen * voverlap)];
  vtemp_l = new float[vframelen * (voverlap + 1)];
  vtemp_r = new float[vframelen * (voverlap + 1)];
  videoframes = floor((float)drums.frames() / vframelen); // number of frames of video in sample
  drums_bands_l = new float[videoframes][bandcount];
  drums_bands_r = new float[videoframes][bandcount];
  drums.read(drums_sample);
  for (int i = 0; i < drums.frames(); i++) {
    drums_l[i] = drums_sample[i * 2];
    drums_r[i] = drums_sample[(i * 2) + 1];
  }
  for (int i = 0; i < videoframes; i++) {
    for (int j = 0; j < vtemp_l.length; j++) {
      vtemp_l[j] = drums_l[j + (vframelen * i)];
      vtemp_r[j] = drums_r[j + (vframelen * i)];
    }
    println(drums_bands_l[i].length & (drums_bands_l[i].length - 1));
    FFT.analyzeSample(vtemp_l, drums_bands_l[i]);
    FFT.analyzeSample(vtemp_r, drums_bands_r[i]);
  }
  background(255);
  loadPixels();
  for (int x = 0; x < width; x++) {
    for (int y = 0; y < height; y++) {
      pixels[(width * y) + x] = color(map(drums_bands_l[floor(map(x, 0, width, 0, videoframes))][floor(map(y, 0, height, 0, bandcount))], 0, 1, 255, 0));
    }
  }
  updatePixels();
}

Something seems to be altering the perceived length of those target arrays between checks. Could it have to do with using 2D arrays?

kevinstadler commented 5 months ago

Thanks for reporting! I had a look at the code and I think the issue is not with the length of the target array, but with that of the sample array which, at least based on the current implementation of FFT.analyzeSample(), also needs to be a power of two (but not necessarily the same power of 2 as the target array??). The culprit is the second argument here, which I'm not sure is actually the correct way to call transform() (it should probably be target.length instead of sample.length):

https://github.com/processing/processing-sound/blob/23108c9a201724c6e231b63d1fc631809462568a/src/processing/sound/FFT.java#L138-L139

For now, if you change the length of vtemp_l to the next-lowest power of two of the length you actually want (vframelen * (voverlap + 1)) it should work. (If you get an ArrayIndexOutOfBoundException then the sample actually needs to be the same length as the target array. Which would make sense in the context of an FFT, but in that case the documentation of analyzeSample() needs some serious updating to reflect that!

tristanbay commented 5 months ago

The culprit is the second argument here, which I'm not sure is actually the correct way to call transform() (it should probably be target.length instead of sample.length):

https://github.com/processing/processing-sound/blob/23108c9a201724c6e231b63d1fc631809462568a/src/processing/sound/FFT.java#L138-L139

I think that's the issue. 🤦‍♂️ This should be changed as soon as possible.

For now, if you change the length of vtemp_l to the next-lowest power of two of the length you actually want (vframelen * (voverlap + 1)) it should work. (If you get an ArrayIndexOutOfBoundException then the sample actually needs to be the same length as the target array. Which would make sense in the context of an FFT, but in that case the documentation of analyzeSample() needs some serious updating to reflect that!

Seems to work. I guess one of the devs got those arrays mixed up.

kevinstadler commented 5 months ago

Thanks for the pull request, I'll merge it for now but it introduces another quirk: after the change, analyzeSample() only analyzes the first numBands samples, even if the sample array is actually longer. For example in your sketch, assuming usual framerates of 44khz and 25fps respectively, your vtemp arrays have a length of ~1764 (actually a bit more because of the overlap), but the FFT result you get back is only based on the first 512 samples of that array, which is probably not the expected behaviour. Ideally we should use something like Welch's method and return the average FFT of all numBands-length sliding windows that fit into the sample array.

The analyzeSample() method current also doesn't apply a Hann window before analysis like the continuous-analysis of FFT objects does: https://github.com/processing/processing-sound/blob/23108c9a201724c6e231b63d1fc631809462568a/src/processing/sound/JSynFFT.java#L39-L41 The application of such a window should probably be moved to the FFT.calculateMagnitudesFromSample() method anyway (I've got a hunch that the JSynFFT class should be ditched altogether, actually the only instance variable necessary is the FixedMonoWriter with its buffer, all the arrays used for computations are never concurrent and can therefore be shared between instances).