[Feature Request] Constant-Q Transform, custom FFT and perceptual frequency scales

hvianna / audioMotion-analyzer

High-resolution real-time graphic audio spectrum analyzer JavaScript module with no dependencies.

https://audioMotion.dev

GNU Affero General Public License v3.0

[Feature Request] Constant-Q Transform, custom FFT and perceptual frequency scales #30

Open TF3RDL opened 2 years ago

TF3RDL commented 2 years ago

Although FFTs are fine, they get really boring for me, so I prefer the constant-Q transform (actually the variable-Q transform) over the FFT for octave-band analysis. However, my CQT implementation (built from a bunch of Goertzel filters) is slow, and a real-time CQT needs to use a sliding DFT instead
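For readers unfamiliar with the approach, here is a minimal sketch of a Goertzel-based constant-Q analysis. It is not TF3RDL's implementation: the function names, the rectangular (unwindowed) blocks and the normalization are assumptions made for illustration. Each band re-scans its own block of samples, which hints at why this brute-force form gets slow for real-time use.

```javascript
// Goertzel algorithm: amplitude of one frequency component in a block of samples.
// A full-scale sine at `freq` yields a value close to 1.
function goertzelMagnitude(samples, freq, sampleRate) {
  const omega = 2 * Math.PI * freq / sampleRate;
  const coeff = 2 * Math.cos(omega);
  let s1 = 0;
  let s2 = 0;
  for (let n = 0; n < samples.length; n++) {
    const s0 = samples[n] + coeff * s1 - s2;
    s2 = s1;
    s1 = s0;
  }
  const power = s1 * s1 + s2 * s2 - coeff * s1 * s2; // |X(omega)|^2
  return 2 * Math.sqrt(Math.max(power, 0)) / samples.length;
}

// Hypothetical helper: one Goertzel pass per band, with a block length that
// shrinks as the band frequency rises so that Q (cycles per block) stays constant.
function constantQSpectrum(samples, sampleRate, fMin, bandsPerOctave, numBands) {
  const q = 1 / (Math.pow(2, 1 / bandsPerOctave) - 1);
  const out = new Float32Array(numBands);
  for (let k = 0; k < numBands; k++) {
    const freq = fMin * Math.pow(2, k / bandsPerOctave);
    const len = Math.min(samples.length, Math.ceil(q * sampleRate / freq));
    const block = samples.slice(samples.length - len); // most recent `len` samples
    out[k] = goertzelMagnitude(block, freq, sampleRate);
  }
  return out;
}
```

With fMin = 110 Hz and 12 bands per octave, band 24 is centered on 440 Hz. A practical version would also apply a window to each block.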

I'm also aware that spectrum analyzers built on the Web Audio API don't need to use AnalyserNode.getByteFrequencyData; you can use any FFT library with getFloatTimeDomainData as the input, just like my sketch does. But beware that you need to window the data with a Hann window or something similar before running the FFT, see #3
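The windowing step mentioned above could look roughly like this. The helper names are hypothetical, and the periodic form of the Hann window is an assumption (the symmetric form divides by size - 1 instead).

```javascript
// Periodic Hann window of the given size.
function hannWindow(size) {
  const w = new Float32Array(size);
  for (let n = 0; n < size; n++) {
    w[n] = 0.5 - 0.5 * Math.cos(2 * Math.PI * n / size);
  }
  return w;
}

// Multiply a block of time-domain samples by a window of the same length.
function applyWindow(samples, windowValues) {
  if (samples.length !== windowValues.length) {
    throw new RangeError('samples and window must have the same length');
  }
  const out = new Float32Array(samples.length);
  for (let n = 0; n < samples.length; n++) {
    out[n] = samples[n] * windowValues[n];
  }
  return out;
}
```

In a browser, the input block would come from something like `const buffer = new Float32Array(analyser.fftSize); analyser.getFloatTimeDomainData(buffer);`, and the windowed result would then be handed to whichever FFT library is in use.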

I think perceptual frequency scales like Mel and Bark should be added, because they show bass frequencies less prominently than a logarithmic scale but more prominently than a linear scale
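To illustrate the claim about bass frequencies, here is a small sketch that maps a frequency to a horizontal position (0 to 1) under different scales. The Mel formula used is the common 2595 * log10(1 + f / 700) variant, which is an assumption since the thread does not say which one is meant, and `scalePosition` is a hypothetical helper.

```javascript
// Mel scale (common 2595 * log10(1 + f / 700) variant) and its inverse.
function hzToMel(f) {
  return 2595 * Math.log10(1 + f / 700);
}

function melToHz(m) {
  return 700 * (Math.pow(10, m / 2595) - 1);
}

// Relative position (0..1) of `freq` between fMin and fMax on a given scale.
// `toScale` is any monotonic mapping, e.g. hzToMel, Math.log10, or f => f.
function scalePosition(freq, fMin, fMax, toScale) {
  const lo = toScale(fMin);
  const hi = toScale(fMax);
  return (toScale(freq) - lo) / (hi - lo);
}

// Where does 200 Hz land on a 20 Hz to 20 kHz axis?
const linearPos = scalePosition(200, 20, 20000, f => f);  // about 0.01
const melPos = scalePosition(200, 20, 20000, hzToMel);    // about 0.07
const logPos = scalePosition(200, 20, 20000, Math.log10); // about 0.33
```

The Mel position falls between the linear and logarithmic ones, which matches the description above.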

hvianna commented 2 years ago

Thank you for letting me know about these techniques! Looks like I have a lot to catch up on! 😅

Also, thank you for sharing your sketch! It made me realize that using linear values for the amplitude (instead of dB) makes a huge difference in visualization. I'll have this added as an option in the next release. Next, I think weighting filters would also be a good addition.
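As a rough illustration of the linear-versus-dB difference: AnalyserNode.getFloatFrequencyData reports magnitudes in decibels, and the standard amplitude conversion is 10^(dB / 20). The helper names below are hypothetical, and this is not necessarily how the library implements its linear mode.

```javascript
// Convert a decibel value to a linear amplitude ratio.
function dbToLinear(db) {
  return Math.pow(10, db / 20);
}

// Convert a whole array of dB magnitudes (e.g. the output of
// getFloatFrequencyData) to linear amplitudes.
function linearSpectrum(dbValues) {
  return Float32Array.from(dbValues, v => dbToLinear(v));
}
```

On a linear axis a -20 dB component is drawn at one tenth of full height, so quiet components nearly vanish and peaks stand out much more than on a dB axis.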

Can you recommend any good references for equations/algorithms of the CQT/variable-Q transform, perceptual scales and weighting filters?

Cheers!

TF3RDL commented 2 years ago

The equation for the Bark scale is from Traunmüller's work, and A-weighting, as well as the other things, is already covered on Wikipedia
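For reference, a sketch of both formulas as they are commonly stated. The Bark conversion below is the basic Traunmüller formula with its low- and high-end corrections left out for brevity, and the A-weighting curve is the standard analog-domain magnitude expression. Function names are hypothetical.

```javascript
// Bark scale, basic Traunmüller formula (end corrections omitted).
function hzToBark(f) {
  return 26.81 * f / (1960 + f) - 0.53;
}

// Inverse of the basic formula above.
function barkToHz(z) {
  return 1960 * (z + 0.53) / (26.28 - z);
}

// A-weighting gain in dB for a frequency in Hz (about 0 dB at 1 kHz).
function aWeightingDb(f) {
  const f2 = f * f;
  const num = 12194 ** 2 * f2 * f2;
  const den = (f2 + 20.6 ** 2) *
    Math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2)) *
    (f2 + 12194 ** 2);
  return 20 * Math.log10(num / den) + 2.0;
}
```

For a spectrum stored in dB, the weighting is simply added to each bin's value, using that bin's center frequency.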

As for the constant-Q transform, I prefer the sliding DFT, which works best for real-time audio visualization, and there is even a paper on it
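A minimal sketch of the sliding DFT idea for a single band follows. It is not taken from the paper mentioned above: the class name and normalization are assumptions, no window or damping factor is applied, and a real implementation would need one of these per band plus some protection against accumulated rounding error.

```javascript
// Sliding DFT for one frequency: updates the DFT of the most recent
// `length` samples in O(1) per incoming sample.
class SlidingDftBin {
  constructor(freq, sampleRate, length) {
    const omega = 2 * Math.PI * freq / sampleRate;
    this.length = length;
    // e^{+j*omega}: rotation applied on every step
    this.rotRe = Math.cos(omega);
    this.rotIm = Math.sin(omega);
    // e^{-j*omega*length}: phase of the newest sample within the block
    this.newRe = Math.cos(omega * length);
    this.newIm = -Math.sin(omega * length);
    this.history = new Float64Array(length); // circular buffer of past samples
    this.pos = 0;
    this.re = 0;
    this.im = 0;
  }

  push(sample) {
    const oldest = this.history[this.pos];
    this.history[this.pos] = sample;
    this.pos = (this.pos + 1) % this.length;
    // Remove the oldest sample, add the newest, then rotate the result.
    const tRe = this.re - oldest + sample * this.newRe;
    const tIm = this.im + sample * this.newIm;
    this.re = tRe * this.rotRe - tIm * this.rotIm;
    this.im = tRe * this.rotIm + tIm * this.rotRe;
  }

  // Amplitude estimate; a full-scale sine at `freq` gives about 1.
  magnitude() {
    return 2 * Math.hypot(this.re, this.im) / this.length;
  }
}
```

For a constant-Q layout, each band would get its own `length`, proportional to sampleRate / freq. Because `push` has to see every sample, this kind of analyzer suits per-sample processing better than periodic snapshots of the waveform.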

TF3RDL commented 1 year ago

Here's a problem I realized you should know about before implementing the CQT: the Brown-Puckette algorithm requires real/imaginary parts, which AnalyserNode doesn't provide (getByteFrequencyData/getFloatFrequencyData only output logarithmic magnitude values). It therefore requires custom FFT functionality (which can be implemented using any FFT library, including ones like this that come bundled with FFT functions). Implementing the sliding CQT requires AudioWorklets, since it doesn't work well with getFloatTimeDomainData as the source of waveform data
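To make the "real/imaginary parts" point concrete, here is a minimal in-place radix-2 FFT sketch that exposes both parts of every bin. It is not the library linked above and is not tuned for speed; the function name is hypothetical, and a dedicated FFT library would normally be the better choice.

```javascript
// In-place iterative radix-2 FFT. `re` and `im` hold the real and imaginary
// parts of the input and are overwritten with the complex spectrum.
function fftInPlace(re, im) {
  const n = re.length;
  if (n !== im.length || (n & (n - 1)) !== 0) {
    throw new RangeError('re and im must share a power-of-two length');
  }
  // Bit-reversal permutation
  for (let i = 1, j = 0; i < n; i++) {
    let bit = n >> 1;
    for (; j & bit; bit >>= 1) j ^= bit;
    j ^= bit;
    if (i < j) {
      [re[i], re[j]] = [re[j], re[i]];
      [im[i], im[j]] = [im[j], im[i]];
    }
  }
  // Butterfly stages
  for (let len = 2; len <= n; len <<= 1) {
    const angle = -2 * Math.PI / len;
    const stepRe = Math.cos(angle);
    const stepIm = Math.sin(angle);
    for (let start = 0; start < n; start += len) {
      let wRe = 1;
      let wIm = 0;
      for (let k = 0; k < len / 2; k++) {
        const a = start + k;
        const b = a + len / 2;
        const tRe = re[b] * wRe - im[b] * wIm;
        const tIm = re[b] * wIm + im[b] * wRe;
        re[b] = re[a] - tRe;
        im[b] = im[a] - tIm;
        re[a] += tRe;
        im[a] += tIm;
        const nextRe = wRe * stepRe - wIm * stepIm;
        wIm = wRe * stepIm + wIm * stepRe;
        wRe = nextRe;
      }
    }
  }
}
```

For real audio input, `re` would hold the (windowed) samples and `im` would start as all zeros; the complex bins can then feed a kernel-based CQT or be reduced to magnitudes with `Math.hypot(re[k], im[k])`.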

hvianna commented 1 year ago

@TF3RDL Thanks for following up on this!

For the next beta release, I've done some improvement to the linear amplitude mode and I'm finishing up the work on the weighting filters. I'll try to take a look at the perceptual scales next.

TF3RDL commented 1 year ago

As for the custom FFT, this could allow non-power-of-two sizes, zero-padding, and the use of different FFT streams or even non-audio data as input (since a custom FFT doesn't depend on the Web Audio API), not just window functions, right?

Not sure about the performance impact of using a custom FFT over getByteFrequencyData/getFloatFrequencyData, but I do know that non-power-of-two FFTs are noticeably slower
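One way to combine the two points above is to zero-pad an arbitrary-length block up to a power of two, so that a fast radix-2 FFT can still be used. A small sketch with hypothetical helper names:

```javascript
// Smallest power of two that is greater than or equal to n.
function nextPowerOfTwo(n) {
  let p = 1;
  while (p < n) p *= 2;
  return p;
}

// Copy `samples` into a longer zero-filled buffer. By default the buffer
// length is the next power of two; a larger `size` gives finer bin spacing.
function zeroPad(samples, size = nextPowerOfTwo(samples.length)) {
  if (size < samples.length) {
    throw new RangeError('size must be at least samples.length');
  }
  const out = new Float32Array(size);
  out.set(samples);
  return out;
}
```

Zero-padding interpolates the spectrum between bins but does not add real frequency resolution, and the window should be applied to the original block before padding.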

TF3RDL commented 5 months ago

Of course, an analog-style analyzer mode (IIR filter bank, no FFT required) might be better to implement performance-wise, though I think it works best as a custom implementation (using AudioWorklets) rather than a bunch of BiquadFilterNodes each connected to its own AnalyserNode
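A rough sketch of what such a custom filter bank could compute follows. The band-pass coefficients are the widely used "cookbook" biquad design with 0 dB peak gain, which is an assumption; the thread does not specify a filter design. The class and function names are hypothetical, and in a browser the per-sample loop would live inside an AudioWorkletProcessor.

```javascript
// Second-order IIR band-pass filter (biquad, 0 dB gain at the center frequency).
class BandpassFilter {
  constructor(centerFreq, q, sampleRate) {
    const w0 = 2 * Math.PI * centerFreq / sampleRate;
    const alpha = Math.sin(w0) / (2 * q);
    const a0 = 1 + alpha;
    // Normalized coefficients (b1 is zero for this design)
    this.b0 = alpha / a0;
    this.b2 = -alpha / a0;
    this.a1 = -2 * Math.cos(w0) / a0;
    this.a2 = (1 - alpha) / a0;
    this.x1 = 0; this.x2 = 0; this.y1 = 0; this.y2 = 0;
  }

  process(x) {
    const y = this.b0 * x + this.b2 * this.x2 - this.a1 * this.y1 - this.a2 * this.y2;
    this.x2 = this.x1; this.x1 = x;
    this.y2 = this.y1; this.y1 = y;
    return y;
  }
}

// RMS level of each band for one block of samples.
function filterBankLevels(samples, centerFreqs, q, sampleRate) {
  return centerFreqs.map(freq => {
    const filter = new BandpassFilter(freq, q, sampleRate);
    let sumSquares = 0;
    for (const x of samples) {
      const y = filter.process(x);
      sumSquares += y * y;
    }
    return Math.sqrt(sumSquares / samples.length);
  });
}
```

A continuously running analyzer would keep one filter instance per band alive across blocks instead of recreating them, and would smooth the levels over time.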

hvianna commented 4 months ago

I need to work on making the rendering function more independent of WebAudio / FFT, but I worry that a generic solution might impact performance.

By the way, I really like the idea of fading peaks in your demo. I'll try adding these next!