WebAudio / web-audio-api

The Web Audio API v1.0, developed by the W3C Audio WG
https://webaudio.github.io/web-audio-api/

Describe the algorithm that DynamicsCompressorNode should use #10

Closed olivierthereaux closed 7 years ago

olivierthereaux commented 11 years ago

Originally reported on W3C Bugzilla ISSUE-19885, Wed, 07 Nov 2012 00:51:17 GMT. Reported by Ehsan Akhgari [:ehsan]. Assigned to

Currently the spec doesn't provide much information on what the algorithm behind DynamicsCompressorNode should look like, which is not very helpful for implementers.

russellmcc commented 10 years ago

This just came up on the mailing list because at least some implementations use automatic make-up gain, which is not a ubiquitous feature on real, professional compressors. The spec does not even say whether the DynamicsCompressorNode should use auto make-up gain or not. Make-up gain should probably be a user option.

cwilso commented 9 years ago

@padenot weren't you working on this?

padenot commented 9 years ago

I've started to gather some notes, yes.

chrislo commented 9 years ago

Need some help? I can take a look.

cwilso commented 8 years ago

Once this is done, please ping me on #13 so I can incorporate.

joeberkovitz commented 8 years ago

The decision is to reverse-engineer the code and document the current algorithm, so that we're not wedged on further decision making and can do a more disciplined compression/expansion algorithm in v.next.

joeberkovitz commented 8 years ago

Noted: we're going to attempt to progressively describe the algorithm starting at a general level of detail of what components, states and procedures constitute the compression behavior. This in itself will be a big step over the current spec. Informative sections supplying suggested behavior can then be written, lifted from the current implementation, but we're not going to prescribe that implementations exactly mimic the current one in every last detail.

svgeesus commented 8 years ago

This paper seems approachable, clear and useful: Digital Dynamic Range Compressor Design— A Tutorial and Analysis https://www.eecs.qmul.ac.uk/~josh/documents/GiannoulisMassbergReiss-dynamicrangecompression-JAES2012.pdf It seems clear from that analysis that our compressor is a feedforward type, with an RMS detector, and that both attack and release are exponential. So we could start by describing those blocks (and a diagram of how they fit together).
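
As a rough illustration of how those blocks fit together, here is a minimal Python sketch of a feedforward compressor with an RMS detector and exponential attack and release. It is not the shipping implementation: the function name, the `rms_window_s` parameter, the hard-knee gain computer, and the choice to smooth the gain reduction in dB are all assumptions made for the example.

```python
import math


def compress_feedforward(samples, sample_rate=44100.0, threshold_db=-24.0,
                         ratio=12.0, attack_s=0.003, release_s=0.25,
                         rms_window_s=0.010):
    """Hypothetical feedforward chain: RMS detector -> gain computer -> smoothing."""
    # One-pole (exponential) coefficients derived from time constants in seconds.
    rms_coeff = math.exp(-1.0 / (rms_window_s * sample_rate))
    attack_coeff = math.exp(-1.0 / (attack_s * sample_rate))
    release_coeff = math.exp(-1.0 / (release_s * sample_rate))

    mean_square = 0.0        # detector state
    gain_reduction_db = 0.0  # smoothed attenuation, always >= 0
    output = []
    for x in samples:
        # RMS detector: exponentially weighted mean of the squared input.
        mean_square = rms_coeff * mean_square + (1.0 - rms_coeff) * x * x
        level_db = 10.0 * math.log10(max(mean_square, 1e-12))

        # Gain computer (hard knee, assumed here): attenuation wanted for this level.
        over_db = level_db - threshold_db
        target_db = over_db * (1.0 - 1.0 / ratio) if over_db > 0.0 else 0.0

        # Exponential attack when attenuation must grow, release when it shrinks.
        coeff = attack_coeff if target_db > gain_reduction_db else release_coeff
        gain_reduction_db = coeff * gain_reduction_db + (1.0 - coeff) * target_db

        # Feedforward: the gain is derived from the input and applied to the input.
        output.append(x * 10.0 ** (-gain_reduction_db / 20.0))
    return output
```

With these example settings, a steady full-scale input settles at about 22 dB of attenuation (24 dB over the threshold, reduced by a factor of 12), while a signal that stays below the threshold passes through unchanged.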

svgeesus commented 8 years ago

This is also helpful, but more mathematical and aimed at analog circuitry rather than digital implementation: Attack and Release Time Constants in RMS-Based Compressors and Limiters http://www.thatcorp.com/datashts/AES4054_Attack_and_Release_Time_Constants_II.pdf

padenot commented 8 years ago

Thanks, I'll be reading those for sure. I plan to start working on this after I'm finished with the AudioBufferSourceNode rendering spec bit (#95).

joeberkovitz commented 7 years ago

Adding @rtoy as discussed on yesterday's call - @padenot we need notes from you on how to proceed.

padenot commented 7 years ago

The current DSP chain of the compressor is the following:

The audio signal is the audio to be processed. The control signal is the audio that determines how much compression is applied to the audio signal. The two signals can be the same, or they can differ when side-chaining.

There is also a thing about pre-warping the power average that I don't understand yet.

Based on this, we can easily standardize the following:

The problematic part is the computation of the target attenuation. We could decide that implementations should have a curve that follows a hard-knee compression curve (that is easily speccable), but that having a soft knee is allowed, without too much restriction. We can say that the knee MUST be a monotonically increasing function that goes from the knee threshold (i.e. the minimum value that will trigger some compression) to the knee end (where the compression is linear again), without discontinuities with the regular compression curve. In other words, some function that joins the first and second parts of the compression curve, without discontinuities.
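
To make the proposal concrete, the sketch below shows a hard-knee curve in the dB domain, plus a numerical check of the two requirements suggested for any replacement curve (monotonically increasing, no discontinuities). Both function names are hypothetical, and the default threshold and ratio are only example values.

```python
def hard_knee_output_db(input_db, threshold_db=-24.0, ratio=12.0):
    """Static hard-knee curve: unity slope below the threshold, 1/ratio above."""
    if input_db <= threshold_db:
        return input_db
    return threshold_db + (input_db - threshold_db) / ratio


def is_monotonic_and_continuous(curve, lo_db=-100.0, hi_db=0.0, step_db=0.01):
    """Sample the curve and reject any decrease or any jump larger than one step.

    The jump test assumes the curve's slope never exceeds 1, which holds for
    a compression curve; it is a numerical heuristic, not a proof.
    """
    steps = int(round((hi_db - lo_db) / step_db))
    previous = curve(lo_db)
    for i in range(1, steps + 1):
        current = curve(lo_db + i * step_db)
        if current < previous - 1e-12:         # output went down: not monotonic
            return False
        if current - previous > step_db + 1e-9:  # output jumped: discontinuity
            return False
        previous = current
    return True
```

A soft-knee variant would pass the same check as long as it meets the hard-knee segments at both ends of the knee, which is the only constraint proposed here.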

Compressors are implemented with a variety of techniques (I've read a number of open-source implementations to get a sense of the topic), and specifying a compressor based on the algorithm seems quite challenging (but doable), as well as quite limiting. I'm wondering whether it would be clearer to describe a compressor with fewer features (for example, having a hard knee, no emphasis, no adaptive release, no pre- and post-filtering), and to state that vendors are allowed to implement something a little bit different (a bit like in SVG, where the blur is specced as a Gaussian blur, but where it's stated that a triple-pass box blur is acceptable).

Thoughts?

hoch commented 7 years ago

FWIW, Chrome removed the pre-emphasis filter a while ago. We don't want any redundant coloration. The basic building block must be transparent and fast. The compressor node violates this rule in many respects, but the pre-emphasis was the easiest one to remove.

> There is also a thing about pre-warping the power average that I don't understand yet.

Yeap, that's where I stopped. :)

A few thoughts:

  1. The static numbers in the code are quite arbitrary and I am not sure how we can make decisions on them. Are these numbers to be specified?
  2. The current implementation has an adaptive release (program-based release) that requires a large look-ahead. This is not ideal for individual track compression. I was told our compressor is sort of designed for 'master compression', and this is not specced anywhere either.

joeberkovitz commented 7 years ago

As per call today:

joeberkovitz commented 7 years ago

Also we need to add expander functionality so that we can at least approach a noise-gate-like feature.

joeberkovitz commented 7 years ago

Asking @rtoy to take the lead on this at this point. Input and assistance from @padenot would be appreciated of course!

joeberkovitz commented 7 years ago

@padenot is picking this up again, to have a PR ready prior to the F2F by his request.

joeberkovitz commented 7 years ago

Here are some notes that I've gleaned from reading the code, as an independent source of material for the forthcoming writeup. Hopefully it's useful to have another pair of eyes on this. Like Paul, I'm trying to characterize what needs to be specced, not the exact details of what the code does.

Signal processing path:

I don't have anything to add to @padenot's suggestion on the curve, except to clarify that there are three distinct regions of the curve: linear, knee, and ratio-driven. I think we could just say that the impl computes the knee to ensure a smooth transition up to the first derivative.
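
For illustration only, here is one way a three-region curve can be made smooth up to the first derivative: a quadratic knee whose slope falls linearly from 1 at the threshold to 1/ratio at the knee end. This is an assumed construction chosen for simplicity, not the knee the current code computes; the function name and default values are hypothetical.

```python
def soft_knee_output_db(input_db, threshold_db=-24.0, knee_db=30.0, ratio=12.0):
    """Three-region static curve in dB: linear, quadratic knee, ratio-driven."""
    knee_end_db = threshold_db + knee_db
    slope_delta = 1.0 / ratio - 1.0  # total change in slope across the knee

    # Region 1 (linear): below the threshold the signal is untouched.
    if input_db <= threshold_db:
        return input_db

    # Region 2 (knee): the slope d(out)/d(in) = 1 + slope_delta * d / knee_db
    # goes from 1 at the threshold to 1/ratio at the knee end.
    if knee_db > 0.0 and input_db < knee_end_db:
        d = input_db - threshold_db
        return input_db + slope_delta * d * d / (2.0 * knee_db)

    # Region 3 (ratio-driven): a straight line of slope 1/ratio that starts
    # where the knee ends, so both value and slope match at the joint.
    knee_end_output_db = knee_end_db + slope_delta * knee_db / 2.0
    return knee_end_output_db + (input_db - knee_end_db) / ratio
```

With `knee_db` set to 0 this collapses to a hard knee, so the same function covers both cases.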

Finally, some comments on the existing spec language:

joeberkovitz commented 7 years ago

From F2F: We reviewed @padenot's draft changes for this description. To complete PR by July 6.

padenot commented 7 years ago

This is up for review in #1278.

joeberkovitz commented 7 years ago

Closed via #1278