processing / processing-sound

Audio library for Processing built with JSyn
https://processing.org/reference/libraries/sound/
GNU Lesser General Public License v2.1

Add a function to calculate the fft of a given sample array. #82

Closed damaru-inc closed 1 year ago

damaru-inc commented 1 year ago

This addresses #77. When using Processing to generate images to stitch into an animation, the existing FFT function has timing issues.

This PR addresses them by letting the caller pass in a portion of their AudioSample for FFT processing, so that the correct part of the audio is analyzed for each video frame.

Here is an example of how to use it inside a draw() function, assuming a mono source:

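// Index of the first audio sample that belongs to this video frame
int startSample = frameNo * samplesPerFrame;
// Copy that slice of the sound file into a buffer
float[] sampleForThisFrame = new float[samplesPerFrame];
soundFile.read(startSample, sampleForThisFrame, 0, samplesPerFrame);
// Compute 64 bands of FFT magnitudes for the slice
float[] fftMagnitudes = fft.analyzeSample(sampleForThisFrame, 64);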
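
For reference, here is a rough sketch of how samplesPerFrame and startSample could be derived from the audio sample rate and the video frame rate. FrameSlicer is a hypothetical helper, not part of the Sound library, and the 44100 Hz and 30 fps figures are just assumed values for illustration:

```java
// Hypothetical helper, not part of the Sound library.
final class FrameSlicer {
    /** Number of audio samples that elapse during one video frame (rounded down). */
    static int samplesPerFrame(int sampleRate, int videoFps) {
        return sampleRate / videoFps;
    }

    /** Index of the first audio sample for the given video frame. */
    static int startSample(int frameNo, int sampleRate, int videoFps) {
        // Multiply before dividing (in long arithmetic) so that rounding
        // does not accumulate when sampleRate is not a multiple of videoFps.
        return (int) ((long) frameNo * sampleRate / videoFps);
    }

    public static void main(String[] args) {
        System.out.println(samplesPerFrame(44100, 30)); // 1470
        System.out.println(startSample(2, 44100, 30));  // 2940
    }
}
```

When the sample rate does not divide evenly by the frame rate (44100 Hz at 24 fps, say), frameNo * samplesPerFrame slowly drifts behind the audio, which is why this sketch computes the start index from the frame number directly.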

For stereo, one could extract the left and right channels from the soundFile and process them with fft separately.
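
A minimal sketch of that channel split, assuming the stereo samples arrive interleaved (L0, R0, L1, R1, ...). That layout is an assumption here, so check how your source actually delivers stereo data. StereoSplit and deinterleave are hypothetical names, not library API:

```java
// Hypothetical helper, not part of the Sound library.
final class StereoSplit {
    /**
     * Splits an interleaved stereo buffer (assumed layout: L0, R0, L1, R1, ...)
     * into two mono buffers. Returns { left, right }.
     */
    static float[][] deinterleave(float[] interleaved) {
        int frames = interleaved.length / 2;
        float[] left = new float[frames];
        float[] right = new float[frames];
        for (int i = 0; i < frames; i++) {
            left[i] = interleaved[2 * i];      // even indices: left channel
            right[i] = interleaved[2 * i + 1]; // odd indices: right channel
        }
        return new float[][] { left, right };
    }

    public static void main(String[] args) {
        float[][] channels = deinterleave(new float[] { 1f, 2f, 3f, 4f });
        System.out.println(channels[0].length + " frames per channel");
    }
}
```

Each of the two resulting arrays can then be passed to the FFT on its own, exactly like the mono buffer above.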

damaru-inc commented 1 year ago

If this is approved, I'll update the English documentation too.

kevinstadler commented 1 year ago

That looks amazing, thank you very much!

What English documentation do you mean? I think with your Javadoc documentation of the new method, both the Javadoc and the library reference on the Processing website should get updated automatically.

Would you happen to have some example sketch code that demonstrates a minimal use case for the new method?

damaru-inc commented 1 year ago

I'm glad you appreciate it!

For documentation, I was referring to the FFT reference page (https://processing.org/reference/libraries/sound/FFT.html). I see that there's a Spanish version in the repo, and I wanted to make it clear that I wasn't going to edit that!

I do have example code. I spent a couple of weeks learning JSyn, and I wrote some small programs that generate audio files to test with. JSyn is great for that!

Then I wrote a Processing sketch that's based on the one in the link above, except that it displays a series of horizontal bars going out left and right from the centre, and I split the left and right audio signals. I'll clean up that code and put a link to its repo tomorrow, and then we can see if we want to incorporate any of that as example code anywhere.

Finally, I learned a couple of things along the way that raise questions in my mind about the existing code. Maybe we could connect via email or Slack (I've got an org there) or something?

damaru-inc commented 1 year ago

I've put a sample up here: https://github.com/damaru-inc/processing-hacks

kevinstadler commented 1 year ago

> For documentation, I was referring to the FFT reference page (https://processing.org/reference/libraries/sound/FFT.html). I see that there's a Spanish version in the repo, and I wanted to make it clear that I wasn't going to edit that!

Don't worry about that page, it will be automatically generated (based on your Javadoc comments in the source code) as soon as I merge the pull request!

> I've put a sample up here: https://github.com/damaru-inc/processing-hacks

This is great, thanks! I will probably sit down and add a dedicated minimal example sketch for absolute beginners to the library before the next release.

> Finally, I learned a couple of things along the way that raise questions in my mind about the existing code. Maybe we could connect via email or Slack (I've got an org there) or something?

If it's related to the library source code then GitHub is probably the best place to keep it! If it's not related to the FFT class, feel free to just open another issue...

damaru-inc commented 1 year ago

Okay, here are my questions. In JSynFFT.calculateMagnitutes, you multiply the FFT results by 2 when copying them to the target array. Why is that? With what I was working on, the higher magnitude values were just under 1.0. Isn't that the range we want?

Also, if I ask for, say, 32 bands of FFT, the top 16 bands seem to be a mirror image of the lower ones. In a couple of YouTube videos I watched, they discarded the top half and explained that the top half contains values that are negative coefficients or something. That's why I didn't display them in my example. Do you know anything about that?

For example, in this video, we see something like what we're doing, but in MATLAB. Line 22 has the comment "Only plot the first half of freqs": https://youtu.be/c249W6uc7ho?t=481

damaru-inc commented 1 year ago

Do you need anything more from me?

kevinstadler commented 1 year ago

Only just getting round to packaging up the new release, sorry about that, but thanks a lot again for your work, it is really appreciated!

Regarding your questions: yes, technically the Fourier transform gives you the frequency magnitudes at positive and negative frequencies (the positive ones are returned first by convention). The 'negative' ones are simply the mirror of the positive ones, because the input (audio) signal only has real values, no complex ones. The library hides the quirky math from the user, who is only interested in the effective frequency spectrum analysis. So in order to get, say, 512 usable frequency bands, the FFT class actually runs a 1024-sample FFT underneath, then only returns the first 512 entries of the result.
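
The mirroring is easy to check outside the library. Below is a small standalone sketch that uses a naive O(N^2) DFT, not JSyn's FFT; DftMirror and magnitudes are hypothetical names, and dividing by N is a normalization chosen for this illustration only:

```java
// Standalone illustration; this is not how the Sound library computes its FFT.
final class DftMirror {
    /** Naive DFT magnitudes of a real signal, normalized by the window length. */
    static double[] magnitudes(double[] x) {
        int n = x.length;
        double[] mag = new double[n];
        for (int k = 0; k < n; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double angle = 2 * Math.PI * k * t / n;
                re += x[t] * Math.cos(angle);
                im -= x[t] * Math.sin(angle);
            }
            mag[k] = Math.hypot(re, im) / n;
        }
        return mag;
    }

    public static void main(String[] args) {
        int n = 32;
        double[] x = new double[n];
        for (int t = 0; t < n; t++) {
            x[t] = Math.sin(2 * Math.PI * 3 * t / n); // 3 full cycles in the window
        }
        double[] mag = magnitudes(x);
        // For real input, bin k and bin n - k carry the same magnitude.
        for (int k = 1; k < n / 2; k++) {
            System.out.printf("bin %2d: %.3f   bin %2d: %.3f%n", k, mag[k], n - k, mag[n - k]);
        }
    }
}
```

With this input, the energy of the sine shows up twice, once in bin 3 and once in its mirror, bin 29, which is the redundancy that makes the upper half safe to drop.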

As I remember it, the magnitudes sum to 1 over all the frequency bands, but since the upper half of the bands is discarded before being returned, I multiplied by 2 to get back to a sum of 1 (compare for example here). I will have another look at the exact behaviour of the underlying FFT function to make sure that the live and non-live versions of the FFT match up!
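
As a rough illustration of that reasoning (not the library's actual code), here is what discarding the mirrored half and doubling the rest does to the sum. OneSidedSpectrum and foldToOneSided are hypothetical names, and the input values are made up:

```java
// Illustration only; names and values are hypothetical.
final class OneSidedSpectrum {
    /** Keeps the first half of a two-sided magnitude spectrum and doubles each bin. */
    static double[] foldToOneSided(double[] fullMagnitudes) {
        int half = fullMagnitudes.length / 2;
        double[] oneSided = new double[half];
        for (int k = 0; k < half; k++) {
            // The discarded mirror bin held the same magnitude, so doubling restores it.
            oneSided[k] = 2 * fullMagnitudes[k];
        }
        return oneSided;
    }

    public static void main(String[] args) {
        // Two-sided spectrum of a single tone: half in bin 1, half in its mirror, bin 7.
        double[] full = { 0, 0.5, 0, 0, 0, 0, 0, 0.5 };
        double[] folded = foldToOneSided(full);
        double sum = 0;
        for (double m : folded) {
            sum += m;
        }
        System.out.println(folded.length + " bands, sum = " + sum);
    }
}
```

One caveat with plain doubling: bin 0 (the DC component) has no mirror partner, so a signal with a constant offset would have that bin overstated by a factor of 2.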

damaru-inc commented 1 year ago

Thanks Kevin, it felt good to contribute!