WebAudio / web-audio-api

The Web Audio API v1.0, developed by the W3C Audio WG
https://webaudio.github.io/web-audio-api/

AudioFrequencyWorkerNode for working in the Frequency Domain #468

Closed hughrawlinson closed 5 years ago

hughrawlinson commented 9 years ago

At the moment, ScriptProcessorNodes and AudioWorkers operate on time-domain buffer data. At the Web Audio Conference, it seems like there's demand for frequency-domain data inside a callback that gets called for every audio frame.

We're thus proposing a new node called AudioFrequencyWorkerNode, which gives the developer the option to obtain audio data, perform processing, and produce output in the frequency domain. This involves passing an options object to the createAudioFrequencyWorker method, specifying the input and output types.

Defaults

The AudioFrequencyWorkerNode should allow access to time-domain and frequency-domain data concurrently. If no options object is passed to createAudioFrequencyWorker, both the input and the output type would default to the amplitude/phase pair. The options object would allow the user to choose between amplitude/phase, real/imaginary, and time-domain data. The dataOut type would default to the same as dataIn, but could be set to a different data type, in case the user wants, for example, to read real/imaginary pairs in and write time-domain data out.

Proposed processing structure of the AudioFrequencyWorkerNode

INPUT (time-domain)
      ↓ windowing
      ↓ FFT
      ↓
~ ~ ~ ~ ~ ~ ~ ~ ~
dataIn
      ↓ onaudioprocess
dataOut
~ ~ ~ ~ ~ ~ ~ ~ ~
      ↓ mirror
      ↓ complete data
      ↓ IFFT
      ↓ windowing
      ↓
OUTPUT (time-domain)
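To make the pipeline concrete, here is a minimal sketch of the round trip the diagram implies. The naive O(N²) DFT/IDFT and the Hann window are illustrative stand-ins (a real implementation would use an FFT and overlap-add across frames), not part of the proposal:

// Naive DFT/IDFT stand-ins, for illustration only.
function dft(signal) {
  var N = signal.length;
  var re = new Float32Array(N);
  var im = new Float32Array(N);
  for (var k = 0; k < N; k++) {
    for (var n = 0; n < N; n++) {
      var phi = -2 * Math.PI * k * n / N;
      re[k] += signal[n] * Math.cos(phi);
      im[k] += signal[n] * Math.sin(phi);
    }
  }
  return { re: re, im: im };
}

function idft(re, im) {
  var N = re.length;
  var out = new Float32Array(N);
  for (var n = 0; n < N; n++) {
    for (var k = 0; k < N; k++) {
      var phi = 2 * Math.PI * k * n / N;
      out[n] += (re[k] * Math.cos(phi) - im[k] * Math.sin(phi)) / N;
    }
  }
  return out;
}

// One frame of the windowing -> FFT -> user code -> IFFT round trip.
function processFrame(frame, userCallback) {
  var N = frame.length;
  var windowed = new Float32Array(N);
  for (var n = 0; n < N; n++) {
    // Hann window: the "windowing" step in the diagram.
    windowed[n] = frame[n] * (0.5 - 0.5 * Math.cos(2 * Math.PI * n / (N - 1)));
  }
  var bins = dft(windowed);
  userCallback(bins.re, bins.im); // the onaudioprocess step: edit bins in place
  return idft(bins.re, bins.im); // back to the time domain; frames are then overlap-added
}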

Example code:

main JS

var aw = createAudioFrequencyWorker("worker.js", {
        dataIn: [            // any subset of:
            "amplitude",
            "phase",
            "real",
            "imaginary",
            "signal"
        ],
        dataOut: "complex",  // or "amplitude", "phase", "signal"
        bufferSize: 2048,    // any power of two
        hopSize: N,
        windowingType: windowingType,
        zeroPadding: N
    });
// Signal (time domain) would be the default I/O for the AudioFrequencyWorker.
// Indicating another datatype would wrap an FFT and/or IFFT around the user code.

AudioWorker JS

// callback to AudioFrequencyWorkerNode
onaudioprocess = function (e) {
  // e.amplitude[0][channel];
  // e.phase[0][channel];
  // e.real[0][channel];
  // e.imaginary[0][channel];
  // e.signal[0][channel]; // replacing the 'input', time domain.

  // edit the arrays in place, rather than having a separate output array to copy into

  for (var channel = 0; channel < e.amplitude[0].length; channel++) {
    for (var i = 0; i < e.amplitude.length; i++) {
      // keep bins with magnitude above 0.5, zero the rest
      e.amplitude[i][channel] = e.amplitude[i][channel] > 0.5 ? e.amplitude[i][channel] : 0;
    }
  }
};

Use Cases

Jesse Allison, Hugh Rawlinson, Jakub Fiala, Nevo Segal (@jesseallison, @hughrawlinson, @jakubfiala, @nevosegal)

Related

#248

#262

jesseallison commented 9 years ago

:+1:

jakubfiala commented 9 years ago

:8ball:

nevosegal commented 9 years ago

:+1:

cristiano-belloni commented 9 years ago

+10^5

jesseallison commented 9 years ago

An FFT Bin Modulation Example:

A common audio process is changing the gain of individual bins. This could be done through an array of bin modulation values, processed as a whole or on a bin-by-bin basis.

main JS:

var bufferSize = 2048;
var binCount = bufferSize/2;

var binModNode = createAudioFrequencyWorker("binAmplitudeModulator.js", {
        dataIn: "amplitude",
        dataOut: "signal",
        bufferSize: bufferSize
    });

// binModNode.frequencyBinScalingArray[];  // Assuming parameters could be accessible directly upon instantiation... this may have to be declared.
var cutoffBin = 22;

// creating a cutoff frequency
for (var i = 0; i < binCount; i++) {
    var scale = (i < cutoffBin) ? 0 : 1;
    binModNode.frequencyBinScalingArray[i] = scale;
}

// adjusting a single bin
binModNode.frequencyBinScalingArray[5] = 2.5;

AudioWorker JS:

// callback to AudioFrequencyWorkerNode
onaudioprocess = function (e) {
  var channelCount = e.amplitude[0].length;
  var bufferLength = e.amplitude.length; // This could possibly be generated automatically in the AudioFrequencyWorkerNode

  var frequencyBinScalingArray = e.parameters.frequencyBinScalingArray;

  for (var channel = 0; channel < channelCount; channel++) {
    for (var i = 0; i < bufferLength; i++) {
      e.amplitude[i][channel] = e.amplitude[i][channel] * frequencyBinScalingArray[i];
    }
  }
};

svgeesus commented 9 years ago

Excellent. I called for the folks who wanted this to write it up at today's Audio WG panel at WAC, and here it is the same day. Was cool to see it develop on the piratepad too.

rtoy commented 9 years ago

I fail to see what this node provides that an AudioWorker node does not.

I also find it confusing if the AudioFrequencyWorkerNode outputs a frequency domain signal and is then connected to other nodes. Do you then just get random garbage in and out?

I think this also raises the question of what is WebAudio? Is it intended to be a general purpose signal processing package where you can do whatever you want? (Modulo AudioWorkers, where you can do whatever you want, as long as the output is an audio signal. Yes, you can abuse that too, but that's not the intent.)


hughrawlinson commented 9 years ago

Hi @rtoy,

The AudioFrequencyWorkerNode provides a lot of functionality not available in the AudioWorkerNode. It allows programmers (and composers, game designers, etc.) to manipulate data in the frequency domain without all the prerequisite DSP knowledge necessary to accomplish similar tasks in the time domain.

In response to your second point, I should clarify the processing structure of the AudioFrequencyWorkerNode. The node acts like any other node in that it both accepts input and provides output in the time domain. If you look at the little ascii diagram in the original issue, 'INPUT' and 'OUTPUT' refer to the input and outputs of the node while 'dataIn' and 'dataOut' are the frequency domain arrays that are accessible inside the callback of the AudioFrequencyWorkerNode. The node would take care of the transformation between time and frequency both on the way in to the callback and on the way out of it.

I'm not quite sure what you're getting at with your third point, would you mind clarifying?

hoch commented 9 years ago

> all the prerequisite DSP knowledge

I guess the prerequisite here is being able to write FFT/IFFT in AudioWorker. Besides, you still need to have DSP knowledge to manipulate mag/phase properly - doing FFT/IFFT by native code doesn't necessarily reduce the amount of knowledge you need.

Here are more nitpicks:

1) Are you proposing this because you assume FFT/IFFT in JS in an AudioWorker will be too slow for realtime applications?

2) Browser vendors will probably not have identical FFT/IFFT implementations, so what you will have in the node after the FFT might vary on different browsers or platforms (as with different codecs, etc.). Are you okay with that? Doing the FFT/IFFT with optimized JS code in an AudioWorker will not have that kind of problem.

adelespinasse commented 9 years ago

How about, instead of this, just provide a good set of general-purpose DSP operations for Float32Array, including FFT, inverse FFT, windowing, etc.? That would let you do everything this proposal lets you do (unless I'm missing something), and also lots of other stuff.

hughrawlinson commented 9 years ago

There are other reasons that the AudioWorkerNode isn't suited to dealing with frequency-domain data. The buffer size is fixed at 128 samples, which when converted to the frequency domain gives you bins that are ~345 Hz wide (if my maths are right). Not exactly ideal. This fulfils the design goal that the AudioWorkerNode doesn't introduce any latency into the audio graph, but for many purposes in the frequency domain it isn't ideal.
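For reference, the arithmetic behind that figure, assuming a typical 44.1 kHz context:

var sampleRate = 44100;                // typical AudioContext sample rate
var blockSize = 128;                   // the fixed AudioWorker block size
var binWidth = sampleRate / blockSize; // ≈ 344.5 Hz per frequency bin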

> Probably browser vendors will not have the identical FFT/IFFT implementation

That sounds like something that should be specified... I'm very surprised to see that it seems kind of ambiguous in the spec for the AnalyserNode...

A good set of general-purpose DSP operations is being proposed, in the Web Array Math API whose status I don't know or understand.

rtoy commented 9 years ago


> There are other reasons that the AudioWorkerNode isn't suited to dealing with frequency-domain data. The buffer size is fixed at 128 samples, which when converted to the frequency domain gives you bins that are ~345 Hz wide (if my maths are right). Not exactly ideal. This fulfils the design goal that the AudioWorkerNode doesn't introduce any latency into the audio graph, but for many purposes in the frequency domain it isn't ideal.

You have no choice on the buffer size. Nodes always get blocks of 128 frames. You have to buffer internally if you want to process on larger chunks. (This buffering is kind of hidden for ScriptProcessorNodes, where the buffering is done for you and the node gets larger buffers all at once.)
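A sketch of that internal buffering, assuming a hypothetical processChunk() that does the window/FFT/user-code/IFFT work (the names here are illustrative, not spec API):

var FFT_SIZE = 2048;
var accumulated = new Float32Array(FFT_SIZE);
var writeIndex = 0;

// Called once per 128-frame block; fires processChunk() only once a full
// FFT-sized buffer has been collected, which is where the added latency
// comes from.
function onBlock(block) { // block: Float32Array of 128 frames
  accumulated.set(block, writeIndex);
  writeIndex += block.length;
  if (writeIndex >= FFT_SIZE) {
    processChunk(accumulated); // hypothetical: window + FFT + user code + IFFT
    writeIndex = 0;
  }
}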

> Probably browser vendors will not have the identical FFT/IFFT implementation

> That sounds like something that should be specified... I'm very surprised to see that it seems kind of ambiguous in the spec for the AnalyserNode (http://webaudio.github.io/web-audio-api/#the-analysernode-interface)...

Oops. That's a bug and we should specify precisely what the FFT is. (I know of at least 2 ways of defining the forward part, and at least 3 ways to scale it.)

I'm pretty sure, however, that currently everyone does it the same way, and any differences are rounding errors depending on the exact FFT algorithm used.
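For instance, the ambiguity could be written out like this (either sign convention in the exponent of the forward transform, and any of several common scalings; an illustration, not a quote from any spec):

X_k = s \sum_{n=0}^{N-1} x_n \, e^{\pm 2\pi i n k / N},
\qquad s \in \left\{ 1,\ \tfrac{1}{N},\ \tfrac{1}{\sqrt{N}} \right\}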

> A good set of general-purpose DSP operations is being proposed, in the Web Array Math API (http://opera-mage.github.io/webarraymath/) whose status I don't know or understand.


rtoy commented 9 years ago

> Hi @rtoy,
>
> The AudioFrequencyWorkerNode provides a lot of functionality not available in the AudioWorkerNode. It allows programmers (and composers, game designers, etc.) to manipulate data in the frequency domain without all the prerequisite DSP knowledge necessary to accomplish similar tasks in the time domain.

I think that if you're manipulating things in the frequency domain, you have a fair amount of DSP knowledge already. It's easy enough for an AudioWorkerNode to do an FFT internally using any of the available JS FFT libraries out there.

> In response to your second point, I should clarify the processing structure of the AudioFrequencyWorkerNode. The node acts like any other node in that it both accepts input and provides output in the time domain. If you look at the little ascii diagram in the original issue, 'INPUT' and 'OUTPUT' refer to the input and outputs of the node while 'dataIn' and 'dataOut' are the frequency domain arrays that are accessible inside the callback of the AudioFrequencyWorkerNode. The node would take care of the transformation between time and frequency both on the way in to the callback and on the way out of it.

Ah, thanks for the clarification. Audio in and out makes much more sense.

> I'm not quite sure what you're getting at with your third point, would you mind clarifying?

Basically, what kind of native nodes should WebAudio supply?

Is the intent to supply a huge set of nodes where you can do just about any general-purpose DSP technique, as if you were using, say, a Matlab-like clone running in a browser? I don't think that's the goal.


rtoy commented 9 years ago

Aren't there JS libraries out there already to do these kinds of things? I don't see that this falls under WebAudio's goals.


sebpiq commented 9 years ago

> I think that if you're manipulating things in the frequency domain, you have a fair amount of DSP knowledge already

With that reasoning, most of the Web Audio API doesn't make sense, does it ;)

Even with fair knowledge of DSP, there is some not-completely-trivial stuff there, such as overlapping windows and so on.


hoch commented 9 years ago

> How about, instead of this, just provide a good set of general-purpose DSP operations for Float32Array, including FFT, inverse FFT, windowing, etc.?

> A good set of general-purpose DSP operations is being proposed, in the Web Array Math API whose status I don't know or understand.

@adelespinasse @hughrawlinson That seems to be a nice complement for AudioWorker.

sebpiq commented 9 years ago

Yes... the Web Array Math API sounded great, but I haven't heard any news for a good 6 months now :(


hughrawlinson commented 9 years ago

I agree with @sebpiq, the ability to do something in AudioWorker doesn't negate the need for a node. You could implement every single node in the spec with AudioWorker; if that's the reasoning, then why not just make people implement their own Oscillators with AudioWorker rather than supply OscillatorNode?

> Is the intent to supply a huge set of nodes where you can do just about any general-purpose DSP technique, as if you were using, say, a Matlab-like clone running in a browser? I don't think that's the goal.

I don't think you need a huge set of nodes to do any general-purpose DSP technique. I do, however, think it's good to be able to work in both the time and frequency domains; AudioWorker caters for the time domain, so why not have something that caters for the frequency domain? If you want to optimise for as few nodes as possible, then surely we should all be implementing our own oscillators in AudioWorkers rather than using OscillatorNodes. I don't really think having fewer nodes is a great design goal. Having general-purpose nodes is, though, and I think AudioFrequencyWorker is general enough to warrant being a node.

> @adelespinasse @hughrawlinson That seems to be a nice complement for AudioWorker.

Yeah, the Web Array Math API functions would definitely be useful inside of AudioWorker, but the spec doesn't seem to be progressing, and as it's a separate spec it may not be ready for years even after the Web Audio API is released.

jakubfiala commented 9 years ago

@rtoy So the difference between the knowledge necessary to, say, do basic spectral modulation in your AudioWorker, and the knowledge necessary to perform a highly optimized Fast Fourier Transform, is quite vast. The main motivation behind this proposal is really that the majority of Web Audio developers aren't able to do that easily, me being one of them. Given that it's one of the fundamentals of audio DSP and pretty much everybody in the field gets to use the FFT spectrum at some point, I think it's more than fitting to have a really damn good, as well as really damn fast, version of it in WA.

Not to mention that we already do have an FFT in WA, and this is just an attempt to make it useful for more than just simply visualizing the spectrum, which is pretty much all that the AnalyserNode is good for. Actually, in this sense I can take your argument even further, and say – if there already is a native FFT in Web Audio, why should we have people implementing it again in JS? Isn't that just horribly inefficient?

Mind you, we originally envisaged this as an extension to the normal AudioWorker (so no extra nodes), but after a consultation with Paul Adenot we realised it would actually only work as a separate node mainly because of the fixed buffer size constraint.

cwilso commented 9 years ago

If you're looking for an FFT that produces sound - a frequency-domain transformer - then yes, obviously the Analyser is not for you.

My personal take on this is that this would likely be better served by a separate FFT library, and someone dealing with the array aggregation in interesting ways. (We could provide a "buffering node" that enabled processing of large blocks with corresponding latency - but I'm not personally convinced that it's necessary, i.e. hard/nonperformant enough that you couldn't do it yourself.) At any rate, this is DEFINITELY separate from the straight AudioWorker.

That said - I'll defer to the WG, but I don't think this is a v1 feature. You can, for the moment, build the buffering semantic yourself in a worker, and the FFT in JS (or use Web Array Math if there's an implementation).


chrislo commented 9 years ago

I think @cwilso's point is a very valid one. I think the existence of AudioWorker and presumably the proliferation of libraries that will then be developed to handle the buffering/aggregation and (I)FFT maths will make it very clear to us over time what a future AudioFrequencyWorker's interface should look like.

Exposing all of the buffering parameters, as well as those of the FFT algorithm (specifying overlap, window functions, scaling etc.), is going to be a hard thing to get right - I imagine a consensus on this might develop over time in user code that can perhaps be incorporated into the API later for performance / ease of use reasons.

chrislo commented 9 years ago

(That being said, I think it's fantastic that this issue has been raised as a direct consequence of the public Audio WG consultation meeting on Wednesday)

notthetup commented 9 years ago

Does anyone have any performance numbers for optimised JS implementations of the FFT running inside ScriptProcessorNode, alongside FFT implementations optimised with asm.js + SIMD.js (and maybe also PNaCl)?

I feel knowing performance differences between native and JS implementations would be very important to understand what we're actually gaining from making this functionality into a native Node.

Of course, strictly speaking we should be looking at the performance in the AudioWorker (closer to native nodes in terms of threading), but that will have to wait, so we could start with ScriptProcessorNode for now.
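A micro-benchmark along these lines could look like the following, where fft() is a placeholder for whichever library call is under test:

var input = new Float32Array(2048);   // one FFT-sized frame of silence
var iterations = 1000;
var t0 = performance.now();
for (var i = 0; i < iterations; i++) {
  fft(input);                         // placeholder for the library under test
}
var t1 = performance.now();
console.log('avg ms per 2048-point FFT: ' + (t1 - t0) / iterations);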

cristiano-belloni commented 9 years ago

@notthetup - A good starting point could be this library: https://github.com/corbanbrook/dsp.js/. Although it's not asm.js optimized, the RFFT class in the lib performs a forward FFT.

A good native FFT implementation is http://www.fftw.org/, for comparison.

cristiano-belloni commented 9 years ago

@rtoy:

​I think that if you're manipulating things in the frequency domain, you have a fair amount of DSP knowledge already. It's easy enough for an AudioWorkerNode to do an FFT internally using any of the available JS FFT libraries out there.​

Yep, but isn't that re-inventing the [possibly square] wheel? The FFT is a building block of DSP; it makes sense to have it implemented efficiently and natively, without everyone having to re-implement it or use libraries that re-implement it.

Compare to oscillators: it's reasonably easy to implement an oscillator in any given programming language, but the Web Audio API implements them natively and efficiently, with a single interface, because they're elementary building blocks of audio processing. Why shouldn't the FFT be treated equally?

notthetup commented 9 years ago

@janesconference Thanks. I also found this, which seems to use asm.js: https://github.com/g200kg/Fft-asm.js/commits/master

I will try to create an example to test performance when I'm back home.

rtoy commented 9 years ago


> I agree with @sebpiq, the ability to do something in AudioWorker doesn't negate the need for a node. You could implement every single node in the spec with AudioWorker; if that's the reasoning, then why not just make people implement their own Oscillators with AudioWorker rather than supply OscillatorNode?

Perhaps if history had proceeded differently and an AudioWorker existed from the beginning, all nodes would have been defined or implemented in terms of an AudioWorker. That's not how history went. I'm not sure, but it's conceivable that implementations could change existing nodes to be AudioWorkers underneath if so desired.

>> Is the intent to supply a huge set of nodes where you can do just about any general-purpose DSP technique, as if you were using, say, a Matlab-like clone running in a browser? I don't think that's the goal.

> I don't think you need a huge set of nodes to do any general-purpose DSP technique. I do, however, think it's good to be able to work in both the time and frequency domains; AudioWorker caters for the time domain, so why not have something that caters for the frequency domain? If you want to optimise for as few nodes as possible, then surely we should all be implementing our own oscillators in AudioWorkers rather than using OscillatorNodes. I don't really think having fewer nodes is a great design goal. Having general-purpose nodes is, though, and I think AudioFrequencyWorker is general enough to warrant being a node.

I grant that it is useful. But not beyond what an AudioWorker could do just as well.

And given the fact that even with the few nodes we have today, we still have failed after several years to specify them completely enough that someone could implement them from the spec. Fewer nodes are good. :-)

>> @adelespinasse @hughrawlinson That seems to be a nice complement for AudioWorker.

> Yeah, the Web Array Math API functions would definitely be useful inside of AudioWorker, but the spec doesn't seem to be progressing, and as it's a separate spec it may not be ready for years even after the Web Audio API is released.


cristiano-belloni commented 9 years ago

> Perhaps if history had proceeded differently and an AudioWorker existed from the beginning, all nodes would have been defined or implemented in terms of an AudioWorker. That's not how history went. I'm not sure, but it's conceivable that implementations could change existing nodes to be AudioWorkers underneath if so desired.

That's interesting. I was under the impression that nodes were specialized, fast, memory-friendly ways to implement building blocks. Wouldn't implementing them via a worker be penalizing? (e.g. easily discardable AudioBufferSourceNodes)

notthetup commented 9 years ago

Sorry this took a while, but I have a basic performance test set up measuring how long it takes to run FFTs using some of the popular FFT libraries, plotted against ScriptProcessor callback times. Of course we would want to do this with the AudioWorker when it comes, but this gives us data points to start with. Also, please let me know if you see any bugs.

http://chinpen.net/webaudiofftperf/

jakubfiala commented 9 years ago

@notthetup awesome! As far as I understand, the fft-asm.js option isn't working yet, right?

notthetup commented 9 years ago

@jakubfiala Yes. I haven't implemented that yet. Hopefully this weekend.

echo66 commented 9 years ago

@notthetup, AFAIK, you currently have two options for 1D FFT implemented using asm.js:

https://github.com/echo66/FFTW3-emscripten
https://github.com/drom/fourier

benchmarks:

http://jsperf.com/comparing-two-fft-functions/3
http://jsperf.com/comparing-two-fft-functions/5
http://jsperf.com/benchmark-for-fftw-js-and-drom-s-fft/2

notthetup commented 9 years ago

@echo66 Thanks!! Will look at those

If anyone has some time (I'm really flooded for the next few weeks), please feel free to PR https://github.com/notthetup/webaudiofftperf

echo66 commented 9 years ago

Greetings to everyone!

In the last month, I have been working on real-time time stretching and pitch shifting using a phase vocoder, using my own implementations and @janesconference's PitchShifter. Currently, there seems to be a big bottleneck in FFT calculation, resulting in audio dropouts if you have more than 4-5 nodes with time-stretching and/or pitch shifting. For me (at least), the absence of an alternative to the JavaScript FFT implementations is a major roadblock.

Note 1: I forgot about the pitch shifter implemented by @cwilso. That one doesn't seem to suffer from what I mentioned in the previous paragraph. Unfortunately, for me, it provides lower (audio) quality than @janesconference's implementation or my phase vocoder + resampling.

Note 2: It should be noted that the implementations I have created/experimented with do not use SIMD.

cwilso commented 9 years ago

The pitch shifting I did is granular resynthesis, which is going to be lower audio quality. I'm expecting a lot of the dropouts you're experiencing are because you're using ScriptProcessors for the nodes (yes?), and the thread-hopping starts thrashing.

echo66 commented 9 years ago

Yes, I'm using ScriptProcessors, @cwilso.

cristiano-belloni commented 9 years ago

My pitchshifter, as far as I remember, uses the FFT implementation found here: https://github.com/corbanbrook/dsp.js/blob/master/dsp.js

I remember it did real-time shifting back in the day, but consuming a lot of CPU. Maybe, if you replace the FFT with a faster one, the performance will be better.

[Incidentally, is this not an exemplary case where a native FFT would be very useful? There have been a lot of jsperf comparisons in this thread, but what happens if we compare an asm.js implementation to a native, efficient one?]

echo66 commented 9 years ago

I replaced it with this implementation: https://github.com/drom/fourier. I can do a pull request if you want, @janesconference.

The implementation you used returns strange values when compared with Matlab and FFTW output (I think I even opened an issue in the dsp.js repo).

I agree with you that this is an exemplary use case for a native FFT. And given the historic importance of the FFT in DSP, I find it odd that it is not considered in WAA. One could say that there are other important transforms, but if that's the case, why is there an AnalyserNode that outputs the FFT magnitude? It could be implemented as a ScriptProcessor (IMHO). And why doesn't the AnalyserNode provide the FFT phase?

By the way: I don't remember the last time I saw a DSP library/framework/platform (besides WAA) without explicit support for FFT.

rtoy commented 9 years ago

I have never considered WebAudio as a DSP library/framework/platform. Hence, I don't miss an FFT.

cristiano-belloni commented 9 years ago

I can do a pull request, if you want

Yes, thanks. Pitchshifter was made years ago, and I didn't find the time to update it properly :)

echo66 commented 9 years ago

@rtoy , then what is WAA?

cristiano-belloni commented 9 years ago

I have never considered WebAudio as a DSP library/framework/platform. Hence, I don't miss an FFT

Well, it offers mid-level components, such as oscillators and biquads, and fairly high-level components, like the panner and the compressor. It sure is not a library / framework / platform, but it is not a bare-bones API, either. I wonder why - and I say that with sincere curiosity - a compressor is OK but a Fourier transform is too much.

adelespinasse commented 9 years ago

Things like FFT don't belong in the Web Audio API because they aren't specific to audio and don't need to be coupled with an AudioContext. They're just operations on arrays. It would be really nice if there were a standard library of fast vector operations (last I knew there wasn't even a way to add two Float32Arrays without writing a loop in JavaScript), and I would strongly favor having some transforms in that (especially since there's already FFT code in each browser), but they don't belong in the Web Audio API.
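For concreteness, this is the kind of loop being referred to:

// Elementwise addition of two Float32Arrays still requires an explicit loop.
function add(a, b) {
  var out = new Float32Array(a.length);
  for (var i = 0; i < a.length; i++) {
    out[i] = a[i] + b[i];
  }
  return out;
}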

This is the fundamental reason why the original proposal of having an AudioFrequencyWorkerNode struck me as API bloat. It's much better to have orthogonal APIs that can be combined in arbitrary ways.


notthetup commented 9 years ago

SIMD.js should help with optimising the vector operations. But we'll have to wait till ES7 for that to be standardised, although parts of the implementation are already in FF Nightly and Chrome Nightly.

https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/2PIOEJG_aYY

echo66 commented 9 years ago

@adelespinasse, I understand your point. Still, if that is the case, why is there an AnalyserNode? And why is it an FFT instead of, say, a wavelet transform (the Fourier transform is a special case of the wavelet transform, AFAIK)? Since I'm not part of the WAA steering panel, I'm just trying to get the reason why you chose to create the AnalyserNode (and why only the magnitude is being reported) instead of standardizing an FFT API. If it is just for spectrum rendering, we could just use a JavaScript FFT implementation inside a ScriptProcessor (after all, we still need a ScriptProcessor to render the spectrum).

artofmus commented 9 years ago

The problem is that it is impossible to implement FFT windowing in Web Audio. For this, it would be necessary for DelayNode to support delays of less than one buffer (for example, 1 sample), or to work with several buffers in ScriptProcessorNode (present and previous) for overlapping. Or am I wrong? Tell me how you solved the problem of overlapping windows in AudioFrequencyWorkerNode?

echo66 commented 9 years ago

@hughrawlinson , I think @artofmus is talking to you.

echo66 commented 9 years ago

@artofmus, would it be a problem for you to use a ScriptProcessor/AudioWorker as a buffer reader which performs the windowing and takes care of the hop sizes?

artofmus commented 9 years ago

@echo66 So, yes. And what are the alternatives to ScriptProcessor/AudioWorker for this?

echo66 commented 9 years ago

If you are reading an AudioBuffer, you can use a ScriptProcessor/AudioWorker to iterate through the samples. Take a look at this code excerpt:

// Assumes the following are defined elsewhere: `buffer` (the source
// AudioBuffer), `position` (the read offset into it), BUFFER_SIZE, the two
// phase vocoder instances, and `outBuffer1`/`outBuffer2` (plain JS Arrays,
// so that concat/splice work).
node.onaudioprocess = function (e) {

    var il = buffer.getChannelData(0);
    var ir = buffer.getChannelData(1);

    var ol = e.outputBuffer.getChannelData(0);
    var or = e.outputBuffer.getChannelData(1);

    // Fill output buffers (left & right) until the system has enough processed samples to reproduce.
    do {

        var buf1 = il.subarray(position, position + BUFFER_SIZE);
        var buf2 = ir.subarray(position, position + BUFFER_SIZE);

        // Advance by the analysis hop, so successive frames overlap.
        position += phasevocoder1.get_analysis_hop();

        // Process left input channel
        outBuffer1 = outBuffer1.concat(phasevocoder1.process(buf1));

        // Process right input channel
        outBuffer2 = outBuffer2.concat(phasevocoder2.process(buf2));

    } while (outBuffer1.length < BUFFER_SIZE);

    ol.set(outBuffer1.splice(0, BUFFER_SIZE));
    or.set(outBuffer2.splice(0, BUFFER_SIZE));

};

At each execution of the onaudioprocess method, I'm processing several frames, each one overlapping the previous one(s) by the hop returned from phasevocoder1.get_analysis_hop(). By using a ScriptProcessor, I can (manually) control the frame hopping at the input. To control it at the output, you need an intermediary array/buffer.