Optional ducking of other audio

nvaccessAuto commented 10 years ago

Reported by jteh on 2014-01-31 01:54 Windows 8 allows us to request that other audio be ducked. This means we can duck the volume of other audio while NVDA is speaking. This should be optional, as it isn't always desirable. Also, I wonder whether it might be useful to have a command to duck audio even if this option is disabled for times when unexpected loud audio clobbers speech.

See the audioDuckingPrototype branch.

nvaccessAuto commented 10 years ago

Comment 1 by k_kolev1985 on 2014-03-16 18:38 Hi Jamie,

I want to ask a question about the volume ducking feature. Will it duck the volume constantly while NVDA is running, or will it duck only when NVDA is speaking? I'm asking because Narrator ducks it constantly, which is not an ideal solution.

nvaccessAuto commented 10 years ago

Comment 2 by mdcurran on 2014-03-25 00:35 Our prototype ducks only when NVDA is speaking. However, as the fade time that Windows uses when starting to duck is a bit slower than we'd like, we have to slightly delay the start of speech; otherwise NVDA would start speaking while the background audio is still at full volume.
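
As a rough illustration of the delay described above, here is a minimal sketch. It assumes a hypothetical `set_ducking` callable wrapping whatever Windows call requests ducking, a hypothetical `speak` callable, and an assumed fade time; none of these names or values come from the prototype.

```python
import time

# Assumed fade duration; the real value depends on how fast Windows fades.
FADE_SECONDS = 0.15


def speak_with_ducking(text, set_ducking, speak, sleep=time.sleep):
    """Duck other audio, wait for the fade to finish, then speak.

    set_ducking and speak are hypothetical callables supplied by the caller.
    """
    set_ducking(True)
    try:
        # Without this pause, speech would start while other audio
        # is still at (or near) full volume.
        sleep(FADE_SECONDS)
        speak(text)
    finally:
        # Always restore other audio, even if speaking fails.
        set_ducking(False)
```

The `sleep` parameter is injectable only so the ordering can be checked without real waiting.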

nvaccessAuto commented 10 years ago

Comment 3 by zahari_bgr on 2014-03-25 01:39 Hi, this would be a very nice feature. Is it possible to detect whether there is actual background audio currently playing? A constant delay in speech would not be nice. It would also be useful if toggling this option could be bound to an input gesture. Also, what about Windows 7?

nvaccessAuto commented 10 years ago

Comment 4 by mdcurran on 2014-03-25 03:16 We only plan to support Windows 8 at this point in time, as the operating system already has the support built in. We have no way of detecting whether there is background audio or not, therefore we will probably have a gesture to toggle it on and off.

nvaccessAuto commented 9 years ago

Comment 5 by mdcurran on 2014-12-02 02:56 A prototype of this support can be found in branch t3830.

nvaccessAuto commented 9 years ago

Comment 6 by mdcurran on 2014-12-03 07:27 A try build, now supporting no ducking, ducking for NVDA speech and sounds, and ducking always, (NVDA+shift+d and combobox in synth dialog) can be found at: http://community.nvda-project.org/try/t3830/nvda_snapshot_try-t3830-10573,7f8c9e1.exe
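
For illustration, the three modes and a cycling gesture could be sketched as follows. The enum name, member names and cycle order are assumptions made for this example and are not taken from the try build.

```python
from enum import IntEnum


class DuckingMode(IntEnum):
    """The three modes listed above; names and values are illustrative."""
    NONE = 0         # no ducking
    OUTPUTTING = 1   # duck while NVDA outputs speech and sounds
    ALWAYS = 2       # duck the whole time


def next_mode(mode):
    """Return the mode a cycling gesture would switch to next."""
    return DuckingMode((mode + 1) % len(DuckingMode))
```

A gesture handler would call `next_mode` on the current setting and then apply the result.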

nvaccessAuto commented 9 years ago

Comment 7 by leonarddr on 2015-04-24 20:40 Is there any idea of when this work could find its way to next? I tried the functionality from source on windows 8.1, but am getting the message 'ducking not supported'

nvaccessAuto commented 9 years ago

Comment 8 by mdcurran on 2015-04-26 18:56 This functionality can only work for installed copies due to Windows security. I think there were still some race conditions where audio would accidentally remain ducked. I'll take another look soon.

nvaccessAuto commented 9 years ago

Comment 9 by camlorn on 2015-05-18 19:33 This is almost certainly blocked by #5096 as, if we move NVWave to C/C++, we'll need to redo some of this there.

nvaccessAuto commented 9 years ago

Comment 10 by jteh on 2015-10-28 00:36 We need to make sure this cooperates with prevention of our audio being ducked by other audio (#5443).

nvaccessAuto commented 8 years ago

Incubated in ca7e036c4e66390977b257e2551914854f4e5076.

LeonarddeR commented 8 years ago

It seems that ducking and configuration profiles don't like each other.

To reproduce, enable ducking in a configuration profile with a trigger and disable it in the default configuration. When switching to the application with the ducking-enabled profile, ducking isn't enabled as expected.

jcsteh commented 8 years ago

@michaelDCurran: You need to switch ducking modes when config profiles are switched. Introduce a handleConfigProfileSwitch function in nvwave and call it from config.ConfigManager._handleProfileSwitch (config/__init__.py around line 510).
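
A minimal sketch of what such a hook might look like. The class, the `apply_mode` callable and the `["audio"]["audioDuckingMode"]` config key are assumptions made for this example; only the name handleConfigProfileSwitch comes from the suggestion above.

```python
class DuckingController:
    """Re-applies the ducking mode whenever the active profile changes."""

    def __init__(self, apply_mode):
        # apply_mode is a hypothetical callable that actually switches ducking.
        self._apply_mode = apply_mode
        self.mode = None

    def handleConfigProfileSwitch(self, conf):
        """Read the mode from the now-active config and apply it if changed."""
        # The config key used here is an assumption for illustration.
        new_mode = conf["audio"]["audioDuckingMode"]
        if new_mode != self.mode:
            self.mode = new_mode
            self._apply_mode(new_mode)
```

The profile-switch handler in the config manager would call `handleConfigProfileSwitch` after the new profile becomes active, so a triggered profile that enables ducking takes effect immediately.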

k-kolev1985 commented 8 years ago

I want to report a bug with the current implementation of the audio ducking feature (available in the "next" snapshots). It seems not to work with SAPI5. I've tried with 2 Bulgarian voices and with the Microsoft Zira voice, but it did not work. The option is set to "Duck when outputting speech and sounds". If the option is set to "Always duck", the feature works as expected. Is this a limitation of SAPI5 or a bug in the current implementation of the audio ducking feature? It works with other TTS engines like "Speech Player" and "RHVoice" (the older one; I don't like the newer one very much), which are available as add-ons for NVDA.

Test environment:

Operating system: Windows 10 Pro (build 10586.11), 64-bit, in Bulgarian with all locale settings set to "Bulgarian".
NVDA version: next-12831,b1f00f4.
Processor: Intel Core i5-2320 at 3.00GHz.
RAM Memory: 4.00 GB.
Sound Card: Realtek ALC662 at Intel Cougar Point PCH - High Definition Audio Controller.

jcsteh commented 8 years ago

This isn't currently possible because SAPI 5 handles its own audio output; it doesn't use NVDA code to do this. It should be possible for us to fix this with some rewriting of the SAPI 5 driver, but we don't plan to do this for the initial implementation of ducking.

nvaccessAuto commented 8 years ago

Incubated in 071a65396b68bafade119e81e48b4d779d1a564a.

nvaccessAuto commented 8 years ago

Incubated in fe9c9397949da33b2f7665ac172d6ca4fe92fd3b.

nvaccessAuto commented 8 years ago

Incubated in 9fa91c2f07da195bc08992fd94d5b8bf6695be9a.

jcsteh commented 8 years ago

I think it'd be really good if we could avoid delaying speech when there isn't audio playing (or better still, when the audio is quiet). So, I've been looking into this a bit.

It seems there are two APIs you can use to get the peak levels for an audio device:

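1. The Windows Core Audio APIs (https://msdn.microsoft.com/en-gb/library/windows/desktop/dd370784%28v=vs.85%29.aspx) in Vista and later. These are COM based. Basically, you start with IMMDeviceEnumerator (https://msdn.microsoft.com/en-gb/library/windows/desktop/dd371399%28v=vs.85%29.aspx), get an IMMDevice (https://msdn.microsoft.com/en-gb/library/windows/desktop/dd371395%28v=vs.85%29.aspx), call IMMDevice::Activate (https://msdn.microsoft.com/en-gb/library/windows/desktop/dd371405%28v=vs.85%29.aspx) and request IAudioMeterInformation (https://msdn.microsoft.com/en-gb/library/windows/desktop/dd368227%28v=vs.85%29.aspx).
   - Unfortunately, as nice as being COM based sounds, calling CoCreateInstance for IMMDeviceEnumerator results in class not registered for some reason. mmdevapi.dll doesn't seem to have a typelib, so I tried generating one from mmdeviceapi.idl. When I tried to use that with comtypes, it threw a weird assertion error.
   - This StackOverflow thread (http://stackoverflow.com/questions/32149809/read-and-or-change-windows-8-master-volume-in-python) shows it's possible to use this API with comtypes, but the interfaces would all have to be written out.
2. The old Audio Mixer API (https://msdn.microsoft.com/en-us/library/dd756701%28v=vs.85%29.aspx) from winmm. This article (https://support.microsoft.com/en-us/kb/181550) provides details about how this could be used to get peak levels.
   - Unfortunately, there would be a hell of a lot of structs to write out if we wanted to do this in Python.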

Of course, I guess we could write the code in C++ and avoid the Python porting bit. After all, all we want is one damned value.

jcsteh commented 8 years ago

However, the above should perhaps be done separately, as it looks like it's pretty complicated to implement and it is nice-to-have rather than essential. Thoughts, @MichaelDCurran?

michaelDCurran commented 8 years ago

Certainly sounds cool, but:

Windows ducks audio of all sound cards, not just the default. I assume we'll want to check if there is audio playing on any available card? Or should we just use NVDA's currently configured output device?

Also, depending on the speed of the delay in the peak meters, it may inadvertently detect NVDA's own speech. Though if we do the check at the lowest level (i.e. audioDucking._setDuckingState), this is guaranteed not to be called for at least a second after any speech finishes. _setDuckingState will also need to return True or False based on whether it actually did change state. If asking to duck and _setDuckingState returns False, _requestDucking should not do a time.sleep.

michaelDCurran commented 8 years ago

Actually, my last idea was a little too over engineered, and incorrect.

Rather than changing _setDuckingState, _requestDucking simply does not have to do the time.sleep if no other audio is playing. _setDuckingState will still technically duck, in case other audio does start playing while NVDA is speaking.
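
The idea can be sketched roughly like this. It assumes a hypothetical `read_peaks` callable that yields one peak value (0.0 to 1.0) per output device, a hypothetical `set_ducking_state` callable, and assumed threshold and fade values; how the peaks are actually obtained (for example via IAudioMeterInformation) is left out.

```python
import time

PEAK_THRESHOLD = 0.01     # assumed: peaks at or below this count as silence
DUCK_FADE_SECONDS = 0.15  # assumed: time Windows needs to fade other audio


def other_audio_playing(peaks, threshold=PEAK_THRESHOLD):
    """True if any device reports a peak above the threshold."""
    return any(peak > threshold for peak in peaks)


def request_ducking(set_ducking_state, read_peaks, sleep=time.sleep):
    """Always duck, but only wait for the fade if something is audible.

    Returns True if a sleep was performed.
    """
    # Sample the meters before ducking so the fade itself does not
    # influence the decision.
    peaks = list(read_peaks())
    # Duck regardless, in case other audio starts while NVDA is speaking.
    set_ducking_state(True)
    if other_audio_playing(peaks):
        sleep(DUCK_FADE_SECONDS)
        return True
    return False
```

Passing the peaks for every device rather than only the default one matches the point above that Windows ducks audio on all sound cards.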


jcsteh commented 8 years ago

Ah, I was wondering what audio device(s) Windows ducked. I assumed it was just the default.

I guess then it'd make sense to check for audio on all devices.

michaelDCurran commented 8 years ago

Using the winmm technique, no waveOut devices of my two sound cards support the peak metre control. Several posts on the web suggest that cards and/or Windows have not supported this for years. The Stack Exchange article also mentions that winmm is now pretty much local to each application (i.e. the mixer API is specific to the process calling it). And the Stack Exchange example itself only dealt with setting/getting volume control levels, not peak levels as such. I shall have a look at the COM interfaces in more detail to see if any kind of peak metre functionality exists...

jcsteh commented 8 years ago

Damn.

The stack Exchange article also mentions that winmm is now pretty much local to each application (i.e. the mixer API is specific for the process calling it). And the stack exchange example itself, only dealt with setting/getting volume control levels, not peak levels as such.

Not that this matters, but out of interest, I assume you're referring to some other article you found (and I probably saw as well)? The article I linked here was from support.microsoft.com and dealt with metering, not volume. The StackOverflow article I linked was for Core Audio. Still, the support article was quite old.

I shall have a look at the COM interfaces in more detail to see if any kind of peak metre functionality exists...

IAudioMeterInformation has this to say:

If the audio device lacks a hardware peak meter, the audio engine automatically implements the peak meter in software, transparently to the client.

So I presume that means we can get what we need.

nvaccessAuto commented 8 years ago

Incubated in 4b7eeb73f26e9723da02450633ef97774f38ab29.

jcsteh commented 8 years ago

Very nice!

Further info on the duck delay not happening sometimes as discussed yesterday. There are two issues.

STR for the first issue:

1. Play some audio in the background.
2. Switch to ducking for speech and sounds.
3. Wait until audio unducks.
4. Press NVDA+shift+d to switch to always duck.
   - Expected: Sleep before speech.
   - Actual: No sleep.

STR for the second issue:

1. Play some audio in the background.
2. Switch to no ducking.
3. After the audio unducks but before 1 second has elapsed, press NVDA+shift+d to switch to ducking for speech and sounds.
   - Expected: Sleep before speech.
   - Actual: No sleep.
   - Audio unducks immediately, which makes sense, but the callLater still seems to apply, so it doesn't think it has to sleep.
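
To make the second issue concrete, here is a small sketch of state tracking in which an immediate unduck invalidates any delayed unduck still in flight. The class and method names are hypothetical, and this is not a description of how the actual code works or was fixed; it only illustrates the kind of bookkeeping the scenario calls for.

```python
class DuckTracker:
    """Tracks whether other audio is currently ducked (illustrative only)."""

    def __init__(self):
        self.ducked = False
        self._pending = None  # token of the delayed unduck in flight, if any

    def duck(self):
        """Duck now; return True if the caller should sleep for the fade."""
        self._pending = None  # a new duck request supersedes a delayed unduck
        needs_sleep = not self.ducked
        self.ducked = True
        return needs_sleep

    def schedule_unduck(self):
        """Start a delayed unduck and return its token (stands in for callLater)."""
        self._pending = object()
        return self._pending

    def fire_unduck(self, token):
        """Called when the delay elapses; stale tokens are ignored."""
        if token is self._pending:
            self.ducked = False
            self._pending = None

    def unduck_now(self):
        """Immediate unduck, e.g. when switching to no ducking."""
        self._pending = None
        self.ducked = False
```

With this bookkeeping, switching to no ducking and then back within the delay window reports that a sleep is needed, because the tracker's state reflects the immediate unduck rather than the stale delayed one.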

nvaccessAuto commented 8 years ago

Incubated in dfa326192beb408bda59732f5e08599a8b098a5d.

LeonarddeR commented 8 years ago

I found a little issue with the current ducking implementation in next.

STR in Windows 10:

1. Set NVDA to no ducking.
2. Start Narrator with ctrl+win+u. Narrator ducks all other audio, except for NVDA.
3. Disable Narrator again with ctrl+win+u.
   - Expected: Ducking is entirely disabled again.
   - Actual: NVDA ducks all other audio until closed or ducking is disabled again with NVDA+shift+d.

michaelDCurran commented 8 years ago

Annoyingly this is not a bug we will be able to fix. The issue is how the AccSetRunningUtilityState function in Windows is implemented. Any two assistive technologies that try and control audio ducking at the same time will most likely cause this.

It should also be noted that we do not recommend running another screen reader (including Narrator) at the same time as NVDA. Of course people do it (we as developers certainly do from time to time); however, we cannot promise system stability when doing so.


JamaicanUser commented 8 years ago

Settings for controlling this are not present in the General settings dialog. They are present in the Synthesizer dialog. Fix in What's New only.

michaelDCurran commented 8 years ago

They are in the Synthesizer dialog.

What led you to believe they were in General Settings?

derekriemer commented 8 years ago

Oddly, I thought this at first as well. Maybe we should state which dialog it is in in the What's New?


JamaicanUser commented 8 years ago

@michaelDCurran in the What's New

jcsteh commented 8 years ago

Err, my bad. Brain detached from fingers. :) I'll fix this.

nvaccess / nvda

Optional ducking of other audio #3830