alvestrand closed this 1 year ago
Discussed in WG meeting April 26, 2022. Issue raised: Should it be a string (for extensibility) rather than a boolean?
Link to the April 26 presentation: https://docs.google.com/presentation/d/15iAIhzpaA6reKJBL-ecgYtic6ZKHEpKL5OK_sExTllc/edit#slide=id.g1233c72d2fa_0_18
So far, the discussion about extensibility has not produced arguments that warrant the added complexity. Leaving it as a boolean.
We know that noise cancellation can be quite effective in many scenarios. However, noise cancellation is, by default, somewhat restrictive in what it considers "noise", in order to lessen the chance that it suppresses sound that the recipient wants to hear.
There are quite powerful algorithms out there that allow better noise removal if we're more certain about what the recipient wants to hear, such as removing anything that does not form part of a human voice.
This behavior is sometimes desirable (such as in person to person conversation), and sometimes very undesirable (such as when playing music to each other).
Suggestion: Add a new constraint "voiceIsolation" (values true and false) that, when true, tries to isolate the human voice and remove all other parts of the audio signal. This may also enable features such as directionality (beam-forming) that attempt to take the signal only from the direction from which a human voice is detected.
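To make the suggestion concrete, here is a minimal sketch of how an application might choose the value of the proposed constraint. The helper `audioConstraintsFor`, the `AudioUseCase` type, and the `VoiceIsolationConstraints` interface are hypothetical names used only for illustration; only the `voiceIsolation` boolean itself comes from the proposal above.

```typescript
// Hypothetical use cases, taken from the two scenarios described in this issue.
type AudioUseCase = "conversation" | "music";

// Hypothetical local type: just the proposed boolean constraint.
// Declared here so the sketch does not depend on browser typings.
interface VoiceIsolationConstraints {
  voiceIsolation: boolean;
}

// Hypothetical helper: request voice isolation for person-to-person
// conversation, and leave it off when the full audio signal matters
// (for example, when playing music to each other).
function audioConstraintsFor(useCase: AudioUseCase): VoiceIsolationConstraints {
  return { voiceIsolation: useCase === "conversation" };
}

// Possible use in a browser (assumes the user agent implements the
// proposed constraint; not executed here):
//   const stream = await navigator.mediaDevices.getUserMedia({
//     audio: audioConstraintsFor("conversation"),
//   });
```

Keeping the value a plain boolean, as decided above, means the application logic reduces to a single yes/no decision per use case.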