WebAudio / web-audio-api

The Web Audio API v1.0, developed by the W3C Audio WG
https://webaudio.github.io/web-audio-api/

Proper FFT / IFFT missing #248

Closed kickermeister closed 9 years ago

kickermeister commented 11 years ago

I wonder why you didn't implement a proper FFT and IFFT node for the Web Audio API. This would definitely be a great benefit for advanced audio processing within web browsers. As far as I understand the code, you are nearly there with the AnalyserNode, which obviously performs an FFT. But this node doesn't provide the complex numbers one would need for audio processing in the frequency domain. In addition, the possibility to transform the processed buffer back to the time domain with a proper IFFT would also be extremely helpful!

Any chance of implementing these features in a future version?

opera-mage commented 11 years ago

Hi, not sure if I'm speaking for the entire group here, but I think that your feature request is another sign of how the current version of the Web Audio API can't possibly satisfy every need. There have been many other feature requests with similar symptoms (i.e. "it's almost there, why not expose it 100%?"), for instance: the ability to define more customized filters (first order, second, ... order IIR filters, with custom filter coefficients, etc), the ability to control the HRTF transfer function in more detail, the ability to control the dynamics compressor node in more detail, etc, etc.

I personally have high hopes for making the ScriptProcessorNode a first class citizen, with Web Worker based processing etc, which would enable all of these (and more) features to be implemented just the way you wish. I think that most of us here agree that improving the ScriptProcessorNode is an important change to the Web Audio API, but it will most likely be part of a later version of the API (at least that way we can focus on getting it right instead of rushing something out that's not really finished).

As for the feared lesser processing power of JavaScript, I don't really think that it will be a problem.

(btw - sorry for cross-posting...)

kickermeister commented 11 years ago

I completely understand that you cannot implement every feature request, but don't you think that an FFT / IFFT is a very basic and important tool for audio processing? At least from my perspective as someone who wants to implement some advanced audio processing in web browsers for fancy web applications, this is an important feature missing in your toolbox.

Of course, I could also use the ScriptProcessorNode to apply an FFT and IFFT (I'm actually doing this using https://github.com/corbanbrook/dsp.js/), but I'm still having some problems, which makes it feel like a hackish workaround.

cwilso commented 11 years ago

I'm also not going to claim to represent the entire group. But to me, the current version of the Web Audio API wasn't designed to satisfy every need (though it does have an escape hatch in ScriptProcessor). For example, I find the BiquadFilter far easier to use and manipulate than a generic FFT/IFFT node, and it led me to writing a bunch of samples that I simply never would have attempted if I first had to fully understand FFTs. The goal was to start enabling audio for HTML gaming, music production applications, and the like.

I do think exposing FFT/IFFT nodes is a good feature for a future version; I also think what we have is a good layer to expose. HRTF controls sound pretty advanced, given the relatively low usage of Panner and HRTF today. The DynamicsCompressor, I have to agree, needs some work, even in V1; and ScriptProcessor is a problem (though an ongoing conversation with the TAG). I'm not totally convinced just providing a Web Worker based solution fixes the problem, though it does help.

Again, just my opinion.

ghost commented 11 years ago

On Tue, Oct 8, 2013 at 2:59 AM, kickermeister notifications@github.com wrote:

I wonder why you didn't implement a proper FFT and IFFT node for the Web Audio API. This would definitely be a great benefit for advanced audio processing within web browsers. As far as I understand the code, you are nearly there with the AnalyserNode, which obviously performs an FFT. But this node doesn't provide the complex numbers one would need for audio processing in the frequency domain. In addition, the possibility to transform the processed buffer back to the time domain with a proper IFFT would also be extremely helpful!

I'm curious to know what you would do if you had the complex FFT output. With my limited imagination, all I can see is having a ScriptProcessorNode taking the FFT output, manipulating it in some way, and sending it out to an IFFT block to get back an audio signal. At that point, the ScriptProcessorNode could do the FFT/IFFT itself, perhaps somewhat more slowly.

And don't forget that an FFT block can do the IFFT if you feed it the conjugate of the transformed data. More or less. :-)
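
To spell out the "more or less": the usual identity is ifft(X) = conj(fft(conj(X))) / N, so besides conjugating the input you also conjugate the result and scale by 1/N. Below is a minimal sketch of that identity. The `Complex` type and the `fft` / `ifft` helpers are illustrative names, not part of the Web Audio API, and the recursive radix-2 FFT assumes a power-of-two length.

```typescript
// Complex signal stored as two parallel arrays (real and imaginary parts).
type Complex = { re: Float64Array; im: Float64Array };

// Recursive radix-2 forward FFT. Assumption: the length is a power of two.
function fft(x: Complex): Complex {
  const n = x.re.length;
  if (n === 1) return { re: Float64Array.from(x.re), im: Float64Array.from(x.im) };
  const half = n / 2;
  const even: Complex = { re: new Float64Array(half), im: new Float64Array(half) };
  const odd: Complex = { re: new Float64Array(half), im: new Float64Array(half) };
  for (let i = 0; i < half; i++) {
    even.re[i] = x.re[2 * i];
    even.im[i] = x.im[2 * i];
    odd.re[i] = x.re[2 * i + 1];
    odd.im[i] = x.im[2 * i + 1];
  }
  const e = fft(even);
  const o = fft(odd);
  const out: Complex = { re: new Float64Array(n), im: new Float64Array(n) };
  for (let k = 0; k < half; k++) {
    // Twiddle factor exp(-2*pi*i*k/n) applied to the odd half.
    const ang = (-2 * Math.PI * k) / n;
    const wr = Math.cos(ang);
    const wi = Math.sin(ang);
    const tr = wr * o.re[k] - wi * o.im[k];
    const ti = wr * o.im[k] + wi * o.re[k];
    out.re[k] = e.re[k] + tr;
    out.im[k] = e.im[k] + ti;
    out.re[k + half] = e.re[k] - tr;
    out.im[k + half] = e.im[k] - ti;
  }
  return out;
}

// Inverse FFT built only from the forward transform:
// conjugate the input, run the forward FFT, conjugate the result, divide by N.
function ifft(X: Complex): Complex {
  const n = X.re.length;
  const conjugated: Complex = { re: X.re, im: X.im.map((v) => -v) };
  const y = fft(conjugated);
  return { re: y.re.map((v) => v / n), im: y.im.map((v) => -v / n) };
}
```

Calling `ifft(fft(x))` should return `x` up to floating-point rounding.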

opera-mage commented 11 years ago

Just a brief comment: Of course I meant that the current version of the API wasn't meant to cover every need. I think that's a good thing (first-things-first principle).

Also, I think that FFT/IFFT is a very useful tool in audio processing, but as pointed out in a previous comment, I don't really see how you'd treat the frequency domain signal in a sane way in the node graph. I think you'd want to either use the data in a ScriptProcessor, or we'd probably have to re-think the node graph functionality so that it could support frequency domain signals in a transparent way. Or are there any other ideas of how it could be integrated into the API?

russellmcc commented 11 years ago

As a developer, I fully understand that the current version of the API is rather large and the group probably has their hands full fleshing out the specification for the existing components.

However, I do consider having better support for spectral processing to be the biggest "hole" to fill in a version 2 (well, perhaps second to web worker script processors). I don't like the idea of having spectral data flowing around the graph - it seems too complicated. Instead, what would be most convenient for me would be a "Spectral Processor Node", which behaves exactly like the script processor node, except an FFT happens before and an IFFT happens after. There would be explicit controls for analysis window size, window shape, resynthesis window shape and window overlap. Any "spectral graph" based processing would have to be handled fully by the client in JavaScript. I just don't see spectral graphs being a common enough case for me to warrant the giant increase in complexity and mental overhead.
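
For illustration, the processing loop such a node would wrap might look roughly like the sketch below: window each frame, transform, hand the bins to a callback, transform back, and overlap-add. Everything here is an assumption made for the example, including the names `spectralProcess` and `SpectralCallback`, the fixed periodic Hann analysis window, the 50% overlap, and the absence of a resynthesis window. A naive O(N^2) DFT stands in for a real FFT to keep the sketch short.

```typescript
type Spectrum = { re: Float64Array; im: Float64Array };
// Hypothetical callback: edits the bins of one frame in place.
type SpectralCallback = (bins: Spectrum) => void;

// Naive O(N^2) DFT. sign = -1 is the forward transform, +1 the unscaled inverse.
function dft(re: Float64Array, im: Float64Array, sign: number): Spectrum {
  const n = re.length;
  const out: Spectrum = { re: new Float64Array(n), im: new Float64Array(n) };
  for (let k = 0; k < n; k++) {
    for (let t = 0; t < n; t++) {
      const a = (sign * 2 * Math.PI * k * t) / n;
      const c = Math.cos(a);
      const s = Math.sin(a);
      out.re[k] += re[t] * c - im[t] * s;
      out.im[k] += re[t] * s + im[t] * c;
    }
  }
  return out;
}

// Periodic Hann window: copies shifted by half a frame sum to exactly 1.
function hann(n: number): Float64Array {
  const w = new Float64Array(n);
  for (let i = 0; i < n; i++) w[i] = 0.5 - 0.5 * Math.cos((2 * Math.PI * i) / n);
  return w;
}

// Windowed FFT -> callback -> IFFT -> overlap-add, with a hop of frameSize / 2.
function spectralProcess(
  input: Float64Array,
  frameSize: number,
  processBins: SpectralCallback
): Float64Array {
  const hop = frameSize / 2;
  const win = hann(frameSize);
  const output = new Float64Array(input.length);
  for (let start = 0; start + frameSize <= input.length; start += hop) {
    const frame = new Float64Array(frameSize);
    for (let i = 0; i < frameSize; i++) frame[i] = input[start + i] * win[i];
    const bins = dft(frame, new Float64Array(frameSize), -1);
    processBins(bins); // user code sees and edits the complex spectrum
    const time = dft(bins.re, bins.im, +1);
    // Scale the unscaled inverse by 1/N and add into the output.
    for (let i = 0; i < frameSize; i++) output[start + i] += time.re[i] / frameSize;
  }
  return output;
}
```

With a callback that leaves the bins untouched, the output matches the input everywhere except the first and last half frame, which are covered by only one window.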

kickermeister commented 11 years ago

I fully agree with russellmcc.

Maybe my post was a bit confusing - a "SpectralProcessorNode" is exactly what I had in mind when I wrote the original post. :)

opera-mage commented 11 years ago

The SpectralProcessorNode, as you describe it, is definitely an option. However, I don't see much benefit in making a special node. IMO, you could implement the necessary windowing & overlap options/operations in JavaScript, and still use the ScriptProcessorNode.

jmvalin commented 11 years ago

The FFT is useful beyond just the use case of windowing and overlap-add. You can use an FFT to implement any kind of linear filtering operation (convolution) real fast, perform cross-correlations (e.g. periodicity searches), and do all kinds of frequency analysis. It's definitely (and by far) the most useful of all DSP operations you can include here.
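
As a concrete instance of the filtering case, here is a minimal sketch of linear convolution via the convolution theorem: zero-pad both signals, transform, multiply the spectra bin by bin, and transform back. The helper names `fftInPlace` and `fftConvolve` are made up for this example; the FFT is a plain iterative radix-2 version, not an optimized one.

```typescript
// In-place iterative radix-2 FFT. Assumption: the length is a power of two.
// With inverse = true it computes the inverse transform including the 1/N scale.
function fftInPlace(re: Float64Array, im: Float64Array, inverse: boolean): void {
  const n = re.length;
  // Bit-reversal permutation.
  for (let i = 1, j = 0; i < n; i++) {
    let bit = n >> 1;
    for (; j & bit; bit >>= 1) j ^= bit;
    j ^= bit;
    if (i < j) {
      [re[i], re[j]] = [re[j], re[i]];
      [im[i], im[j]] = [im[j], im[i]];
    }
  }
  // Butterfly passes over blocks of length 2, 4, ..., n.
  for (let len = 2; len <= n; len <<= 1) {
    const ang = ((inverse ? 2 : -2) * Math.PI) / len;
    for (let i = 0; i < n; i += len) {
      for (let k = 0; k < len / 2; k++) {
        const wr = Math.cos(ang * k);
        const wi = Math.sin(ang * k);
        const a = i + k;
        const b = i + k + len / 2;
        const tr = re[b] * wr - im[b] * wi;
        const ti = re[b] * wi + im[b] * wr;
        re[b] = re[a] - tr;
        im[b] = im[a] - ti;
        re[a] += tr;
        im[a] += ti;
      }
    }
  }
  if (inverse) {
    for (let i = 0; i < n; i++) {
      re[i] /= n;
      im[i] /= n;
    }
  }
}

// Linear convolution of two real signals via pointwise spectral multiplication.
function fftConvolve(a: Float64Array, b: Float64Array): Float64Array {
  const outLen = a.length + b.length - 1;
  let n = 1;
  while (n < outLen) n <<= 1; // pad so circular convolution equals linear
  const ar = new Float64Array(n);
  const ai = new Float64Array(n);
  const br = new Float64Array(n);
  const bi = new Float64Array(n);
  ar.set(a);
  br.set(b);
  fftInPlace(ar, ai, false);
  fftInPlace(br, bi, false);
  for (let k = 0; k < n; k++) {
    // Complex multiply A[k] * B[k].
    const r = ar[k] * br[k] - ai[k] * bi[k];
    const i = ar[k] * bi[k] + ai[k] * br[k];
    ar[k] = r;
    ai[k] = i;
  }
  fftInPlace(ar, ai, true);
  return ar.slice(0, outLen);
}
```

Multiplying by the complex conjugate of one spectrum instead of the spectrum itself would give a cross-correlation rather than a convolution.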

opera-mage commented 11 years ago

jmvalin, I agree. I'm just saying that I don't really see the point of having a separate node for it. It could just be handled in the ScriptProcessorNode, especially since most frequency domain processing that you might want to do would have to be implemented in JS anyway.

ghost commented 11 years ago

On Mon, Oct 14, 2013 at 8:48 PM, Jean-Marc Valin notifications@github.com wrote:

The FFT is useful beyond just the use case of windowing and overlap-add. You can use an FFT to implement any kind of linear filtering operation (convolution) real fast, perform cross-correlations (e.g. periodicity searches), and do all kinds of frequency analysis. It's definitely (and by far) the most useful of all DSP operations you can include here.

The ConvolverNode can do convolution and cross-correlation for you.
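
The cross-correlation part relies on a standard identity: correlating a signal with a template is the same as convolving it with the time-reversed template. A minimal sketch of that identity follows, using a direct-form convolution as a stand-in for what a ConvolverNode would compute; `convolve` and `crossCorrelate` are illustrative helper names, and nothing here models the node's normalization or block processing.

```typescript
// Direct-form linear convolution (stand-in for the convolver's output).
function convolve(x: Float64Array, h: Float64Array): Float64Array {
  const out = new Float64Array(x.length + h.length - 1);
  for (let i = 0; i < x.length; i++) {
    for (let j = 0; j < h.length; j++) {
      out[i + j] += x[i] * h[j];
    }
  }
  return out;
}

// Cross-correlation as convolution with the time-reversed template.
// out[m] holds sum over n of x[n + lag] * template[n], with lag = m - (template.length - 1).
function crossCorrelate(x: Float64Array, template: Float64Array): Float64Array {
  const reversed = Float64Array.from(template).reverse();
  return convolve(x, reversed);
}
```

For example, correlating `[0, 0, 1, 2, 3, 0]` with the template `[1, 2, 3]` peaks at output index 4, which corresponds to a lag of 2 samples.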

hughrawlinson commented 10 years ago

:+1: This would be very useful for certain feature extraction algorithms.

joeberkovitz commented 9 years ago

WG feels that #468 is doing a better job of carrying the use cases expressed here forward. Closing.

linolinco commented 3 years ago

On Mon, Oct 14, 2013 at 8:48 PM, Jean-Marc Valin notifications@github.com wrote:

The ConvolverNode can do convolution and cross-correlation for you.

(7 years late to the party, but I just have to throw in my 2c here)

This is backwards. You have at least 3 places already in WebAudio where an FFT/IFFT is used internally:

  1. AnalyserNode performs an FFT (but exposes only real-valued magnitudes, for no reason that I can see),
  2. createPeriodicWave performs an IFFT (but perhaps not well optimised / on the audio render thread?),
  3. ConvolverNode (if implemented properly) performs 2 FFTs, a multiplication, and an IFFT (but is monolithic).

That's 3 use cases already within your own API, but somehow FFT/IFFT is not an important fundamental audio operation? Any decent WebAudio implementation will need an FFT/IFFT with windowing & overlap to implement ConvolverNode well - so why not expose it in the API so we can all use it?

Yes, of course I can throw in my own userland FFT on top of the 1-3 already existing internal implementations. However, one key motivation for writing a Web App is platform-independent code, and a platform-independent generic FFT will always be very suboptimal. Have a look at the fftw project to see how much optimisation potential there is; the FFT should be part of WebAudio so that this potential can be unlocked.

Had to get this off my chest. Was planning to port my professional audio application to WebAudio, but after discovering that as of late 2020 it still doesn't expose a proper FFT I'm rather put off. I don't think I've ever before seen an audio API without FFT in my life, have to say that the lack of it here makes me doubt the expertise of those in charge.

padenot commented 3 years ago

The current model doesn't lend itself well to having frequency-domain data between AudioNodes: everything is time domain.

There's been a request for a performance comparison between WASM+SIMD and optimized native code, to understand if it would be worth it to expose an FFT API that would be implemented in the browser, but nobody answered. In any case, having FFT code backed by native code would be an API on its own, and, while useful for audio, should be developed outside the audio working group, so that it's useful for other domains.

Nobody has ever opposed this, and lots of people have requested it, but lots of people request lots of things, and with only a few people working on this, choices need to be made. Since the performance numbers looked adequate (not amazing, adequate) for what people wanted to do, other things were worked on.

Had to get this off my chest. Was planning to port my professional audio application to WebAudio, but after discovering that as of late 2020 it still doesn't expose a proper FFT I'm rather put off. I don't think I've ever before seen an audio API without FFT in my life, have to say that the lack of it here makes me doubt the expertise of those in charge.

"Those in charge" have expertise, but lack time and have inherited a spec that was shipped in Chrome without proper review, and have strived to improve it without breaking existing apps and backwards compatibility, with the available resources. Insulting them in a public bug tracker is not exactly the best way to request a feature, which would be to start a discussion in an issue in https://github.com/WebAudio/web-audio-api-v2/issues.

linolinco commented 3 years ago

Look, the key frustration here is that we all know that the complex-valued FFT result is already being computed in (among others) AnalyserNode. Regardless of its shortcomings, just adding a getComplexFrequencyData() accessor to AnalyserNode would be trivial, 100% backwards & forwards compatible, and already useful. Keep it simple: treat this as a bug report about a missing accessor. It surely cannot be reasonable for fundamental discussions about the API’s overall design and purpose, and the role for frequency-domain data in it (or not), to hold up such a simple fix for 7 years?

Yes, the next thing one would then like to see in AnalyserNode is windowing, and overlap-add, and finally re-synthesis - but these are discussions for another day. One step at a time.
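
To make concrete what a complex accessor would add: the existing frequency getters report only a decibel magnitude per bin, so the phase is discarded. The sketch below derives both quantities from complex bins. `toDbAndPhase` is a hypothetical helper, the `re` / `im` arrays stand in for whatever the proposed getComplexFrequencyData() might return (it does not exist in the API), and the time smoothing that AnalyserNode applies before converting to decibels is left out.

```typescript
// Convert complex FFT bins to decibel magnitude plus phase in radians.
// Assumption: re and im hold one complex value per frequency bin.
function toDbAndPhase(
  re: Float64Array,
  im: Float64Array
): { db: Float64Array; phase: Float64Array } {
  const n = re.length;
  const db = new Float64Array(n);
  const phase = new Float64Array(n);
  for (let k = 0; k < n; k++) {
    // 20 * log10(|X[k]|); evaluates to -Infinity for a completely silent bin.
    db[k] = 20 * Math.log10(Math.hypot(re[k], im[k]));
    // The phase is the part that a magnitude-only getter cannot recover.
    phase[k] = Math.atan2(im[k], re[k]);
  }
  return { db, phase };
}
```

Two bins with equal magnitude but different phase produce identical decibel values, which is why re-synthesis from magnitude-only data is not possible without further assumptions.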

rtoy commented 3 years ago

Adding getComplexFrequencyData() is acceptable to me.

Can you open a new issue (on v2) for your thoughts on this? We are not really accepting anything new in v1. v2 will have all the new things and any updates we want or need to make for v1.

linolinco commented 3 years ago

Okay, will do. Thanks!

PS: I see that windowing has already made it into AnalyserNode.