Possible issue with accessing audio inputs under Windows 11 Pro and Enterprise

bluebrook-sean commented 1 week ago

I have a use case experience that may (?) suggest an additional step when setting up a service wrapper under Windows 11, perhaps due to a change in permissions management. I don't work in Windows IT, and apologize if I've just missed something that should be obvious about how to use Windows 11.

I have desktop Windows software that processes audio inputs, and I'm using it for archiving two-way radio communications. In some contexts, it needs to run without a Windows user logged in, so (about a decade ago) I added XYNT as a wrapper. With Windows 11 Pro and Enterprise, the software in service mode sees the audio ports but with zero levels (but it works in desktop mode). My impression is that this is related to a change in Windows audio permissions, because the zero level symptom matches expected behavior when permissions have not been granted. I'd think LocalSystem would have permissions to do nearly anything, and I've found no way to access permissions for this user.

Because of this problem, I tried alternatives to XYNT, including shawl and FireDaemon Pro (FDP). With FDP it works (out of the box, no special configuration entered), so I believe it is possible for a wrapper to support or enable access to audio. With shawl, I see the same symptom as under XYNT. Surprisingly, this is true even when launching the service under a Windows user where the desktop software has been granted audio input permissions (Windows calls this microphone access, even when the input is not a mic). I feel Windows 11 has introduced some kind of obstacle to service mode audio access, and based on my experience with FDP, overcoming it may require an additional step in the wrapper.

Shawl is my preference going forward over XYNT (maintained - XYNT is 12 years old) and FDP (shawl is lightweight, easier to integrate). But right now I have not found a way to make this work (have the software see audio levels while running under shawl on these versions of Windows 11).

mtkennerly commented 1 week ago

Hi! Do you have the code for the radio archival program? I'm curious what APIs it's using to check the system volume.

I did a quick test, and it works for me with a service running as Local System, although I'm using Windows 11 Home and I'm logged in at the same time. Do you think it's specific to Pro and Enterprise? Does it make any difference for you if you're logged in vs logged out while the service is running?

Here's the program I used for testing. Could you give it a try, just to rule out anything specific to your archival program? It's a modified version of the shawl-child test program from this repo:

shawl-child-issue-50.zip

Set up: shawl.exe add --name shawl-audio -- C:\path\to\shawl-child.exe

For transparency, it has these changes:

diff --git a/Cargo.toml b/Cargo.toml
index ef8c291..94b09b5 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -16,6 +16,7 @@ dunce = "1.0.4"
 flexi_logger = "0.27.3"
 log = "0.4.20"
 winapi = { version = "0.3.9", features = ["consoleapi", "errhandlingapi", "winbase", "wincon"] }
+windows = { version = "0.57.0", features = ["Win32_Media_Audio", "Win32_Media_Audio_Endpoints", "Win32_System_Com"] }
 windows-service = "0.6.0"

 [dev-dependencies]
diff --git a/src/bin/shawl-child.rs b/src/bin/shawl-child.rs
index c08aebb..965adee 100644
--- a/src/bin/shawl-child.rs
+++ b/src/bin/shawl-child.rs
@@ -48,6 +48,31 @@ fn prepare_logging() -> Result<(), Box<dyn std::error::Error>> {
     Ok(())
 }

+fn get_volume() -> windows::core::Result<f32> {
+    use windows::Win32::{
+        Media::Audio::{
+            eConsole, eRender, Endpoints::IAudioEndpointVolume, IMMDeviceEnumerator,
+            MMDeviceEnumerator,
+        },
+        System::Com::{
+            CoCreateInstance, CoInitializeEx, CLSCTX_ALL, CLSCTX_INPROC_SERVER,
+            COINIT_APARTMENTTHREADED,
+        },
+    };
+
+    unsafe {
+        CoInitializeEx(None, COINIT_APARTMENTTHREADED).ok()?;
+        let device_enumerator: IMMDeviceEnumerator =
+            CoCreateInstance(&MMDeviceEnumerator, None, CLSCTX_INPROC_SERVER)?;
+        let device = device_enumerator.GetDefaultAudioEndpoint(eRender, eConsole)?;
+
+        let endpoint_volume: IAudioEndpointVolume = device.Activate(CLSCTX_ALL, None)?;
+        let volume = endpoint_volume.GetMasterVolumeLevel()?;
+
+        Ok(volume)
+    }
+}
+
 fn main() -> Result<(), Box<dyn std::error::Error>> {
     prepare_logging()?;
     info!("********** LAUNCH **********");
@@ -80,7 +105,7 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {

     while running.load(std::sync::atomic::Ordering::SeqCst) {
         std::thread::sleep(std::time::Duration::from_millis(500));
-        info!("Looping!");
+        info!("Volume: {:?}", get_volume());
     }

     info!("End");

And here's an excerpt of the log output from the service (0.0 is max volume, -65.25 is min volume):

2024-06-24 23:03:39 [DEBUG] stderr: "[INFO] Volume: Ok(-15.164473)"
2024-06-24 23:03:48 [DEBUG] stderr: "[INFO] Volume: Ok(-28.66002)"
2024-06-24 23:03:49 [DEBUG] stderr: "[INFO] Volume: Ok(-30.776678)"
2024-06-24 23:03:52 [DEBUG] stderr: "[INFO] Volume: Ok(-65.25)"
2024-06-24 23:03:57 [DEBUG] stderr: "[INFO] Volume: Ok(0.0)"
2024-06-24 23:03:57 [DEBUG] stderr: "[INFO] Volume: Ok(-11.885729)"

bluebrook-sean commented 1 week ago

Taking a look at the code (of the audio capture program), I think it's using WinMM. A quick search suggests this API is deprecated in Windows 11, so that might be a cause (although it does work outside service mode). The problem seems to be specific to Pro and Enterprise; I believe it works on Home.

Yes, it does matter if logged in/out - my impression is that the service mode instance running under LocalSystem sees audio levels if the PC is logged into a Windows account that has authorized permissions for microphone access (even though the instance is not running under the logged in user account!) This complicated isolating the problem, because testing in service mode worked fine when logged in and looking at performance; it was only upon a reboot (so there was no logged in user) that it reverted to not working. (I have not tried logging out without a reboot.)

I have access to a PC with Pro where I can do testing, and I'll try out your test program tomorrow.

bluebrook-sean commented 6 days ago

I tried shawl-audio yesterday on the Windows 11 Pro system. So I could test without a Windows login, I set the service to delayed start through Windows Services properties.

The logs showed a constant audio value of -15.0 under all conditions. I grepped the logs and verified that no other values were ever recorded.

I should note this PC has four audio inputs, and I don't know which input shawl-audio was sampling. Because of that, the most revealing test was while logged into a Windows user account, viewing the audio input levels in Windows Sound (recording tab). I triggered a configuration action that makes all four audio sources generate white noise for several seconds for level calibration, and I could see the audio levels change in Windows Sound. But the shawl-audio logs showed only -15.0, despite the sample period (about 2x/second) being fast enough that it should have picked up on the increased audio levels.

I would have expected a constant value to be full quiet, and -15.0 is well above that, which I find a confusing symptom. (If the range is -64 to 0, and min/max reflect +/- variation around a zero quiet center level, I assume "zero level" would be -32.)

In my original report, I said that the symptom was the service seeing zero audio levels when no Windows user was logged in. That was too specific a claim. The level calibration step looks for expected changes in audio level, and any constant level would return an "audio not found" error.

Before posting here, I had gotten audio levels through the desktop application being wrapped while logged into a Windows user account. In contrast, shawl-audio didn't seem to see audio under the same conditions. In both cases (my original experiments and the shawl-audio test) the service was running under LocalSystem, and I'm not sure why extending microphone permissions for a specific Windows user in this way would allow one service but not the other. I had extended permissions to desktop applications in that user's microphone privacy settings, and the audio program I'm using has a GUI while shawl does not; perhaps that affects which programs are included in the granted microphone permissions. This observation may not be relevant or useful, but I'm noting it since it seems unexpected to me.

mtkennerly commented 6 days ago

Sorry, I should clarify that the shawl-child test is checking for the configured system volume (i.e., the Windows volume slider), not necessarily the level of audio that's currently playing. I know that's different from your case, but I wanted to rule out if general audio-related calls would work.

Let me see if I can update the test to check the live audio output level.

mtkennerly commented 6 days ago

Okay, here's a better test:

shawl-child-issue-50-v2.zip

It logs both the configured system volume and the current peak output value. It's checking the main/default audio endpoint - I'll try to find out how to check specific input devices. Here's a sample log from playing a brief sound effect:

2024-06-26 10:33:10 [DEBUG] stderr: "[INFO] Audio: Ok(Audio { volume: -34.75468, peak: 0.0 })"
2024-06-26 10:33:11 [DEBUG] stderr: "[INFO] Audio: Ok(Audio { volume: -34.75468, peak: 2.0359856e-34 })"
2024-06-26 10:33:11 [DEBUG] stderr: "[INFO] Audio: Ok(Audio { volume: -34.75468, peak: 0.39385164 })"
2024-06-26 10:33:12 [DEBUG] stderr: "[INFO] Audio: Ok(Audio { volume: -34.75468, peak: 0.5367883 })"
2024-06-26 10:33:12 [DEBUG] stderr: "[INFO] Audio: Ok(Audio { volume: -34.75468, peak: 0.0878225 })"
2024-06-26 10:33:13 [DEBUG] stderr: "[INFO] Audio: Ok(Audio { volume: -34.75468, peak: 0.01044412 })"
2024-06-26 10:33:13 [DEBUG] stderr: "[INFO] Audio: Ok(Audio { volume: -34.75468, peak: 0.0018324464 })"
2024-06-26 10:33:14 [DEBUG] stderr: "[INFO] Audio: Ok(Audio { volume: -34.75468, peak: 0.0 })"

Code diff

```diff
diff --git a/src/bin/shawl-child.rs b/src/bin/shawl-child.rs
index 965adee..8913c8b 100644
--- a/src/bin/shawl-child.rs
+++ b/src/bin/shawl-child.rs
@@ -48,11 +48,18 @@ fn prepare_logging() -> Result<(), Box<dyn std::error::Error>> {
     Ok(())
 }
 
-fn get_volume() -> windows::core::Result<f32> {
+#[derive(Debug)]
+struct Audio {
+    volume: f32,
+    peak: f32,
+}
+
+fn get_audio() -> windows::core::Result
```

mtkennerly commented 6 days ago

One more version 😅 This one logs the default input and output audio device volume/level for multiple categories.

shawl-child-issue-50-v3.zip

[INFO] Audio: Ok({"input-console": Audio { volume: 18.0, peak: 0.0 }, "input-multimedia": Audio { volume: 18.0, peak: 0.0 }, "input-communications": Audio { volume: 0.0, peak: 0.0 }, "output-console": Audio { volume: -34.75468, peak: 0.0061646323 }, "output-multimedia": Audio { volume: -34.75468, peak: 0.0061646323 }, "output-communications": Audio { volume: -34.75468, peak: 0.0061646323 }})

Windows documentation for the categories:

Console: "Games, system notification sounds, and voice commands."
Multimedia: "Music, movies, narration, and live music recording."
Communications: "Voice communications (talking to another person)."

Code diff

```diff
diff --git a/src/bin/shawl-child.rs b/src/bin/shawl-child.rs
index 8913c8b..948622d 100644
--- a/src/bin/shawl-child.rs
+++ b/src/bin/shawl-child.rs
@@ -1,3 +1,5 @@
+use std::collections::HashMap;
+
 use log::info;
 use clap::Parser;
@@ -48,16 +50,18 @@ fn prepare_logging() -> Result<(), Box<dyn std::error::Error>> {
     Ok(())
 }
 
+type DeviceAudio = HashMap;
+
 #[derive(Debug)]
 struct Audio {
     volume: f32,
     peak: f32,
 }
 
-fn get_audio() -> windows::core::Result
```

bluebrook-sean commented 1 day ago

From testing, my impression is that the problem is specific to the Windows Multimedia winmm API. If I'm reading correctly, your shawl-audio test code is using the Windows Core Audio API, and appears to work (more below). There's a rumor that the Windows Multimedia API is deprecated in Windows 11, but I don't know where to verify that claim. The test using FDP shows it is possible to use winmm in service mode on Windows 11 Pro, but if deprecated, it's likely not important to support.

I changed the test environment to have a single audio input, making it easier to be sure of what level changes are expected. Copied below is part of a sample log from shawl-audio test version v3 when there was a calibration test making the audio source generate white noise for a few seconds. The test was after a reboot, with no Windows user logged in.

The input volume is reported as 30.0 consistently during this sample, while peak varies between 0 and 0.5, so I wonder if volume and peak are reversed in the capture or logging. The changes in the reported peak (low/high/low) are consistent with the expected pattern of audio level changes during the calibration test. I see -15.0 for the audio output, so I think my report of constant audio from shawl-audio v1 was seeing the output, not the input.
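The low/high/low pattern described here could also be checked mechanically rather than by eye. Below is a minimal sketch, not the calibration logic of the archival program: `has_burst` and the 0.1 threshold are hypothetical names and values chosen for illustration, and the sample peaks in the usage note are copied from the input readings in this comment's log.

```rust
/// Hypothetical helper: returns true when the samples contain one contiguous
/// run of "loud" values (>= threshold) with at least one quiet sample before
/// and after it. A constant level, loud or quiet, returns false.
fn has_burst(peaks: &[f32], threshold: f32) -> bool {
    // Index of the first and last loud sample, if any.
    let first = peaks.iter().position(|&p| p >= threshold);
    let last = peaks.iter().rposition(|&p| p >= threshold);
    match (first, last) {
        (Some(f), Some(l)) => {
            let quiet_before = f > 0;
            let quiet_after = l + 1 < peaks.len();
            // Every sample between the first and last loud one must be loud too.
            let contiguous = peaks[f..=l].iter().all(|&p| p >= threshold);
            quiet_before && quiet_after && contiguous
        }
        // No loud sample at all: nothing was detected.
        _ => false,
    }
}
```

For example, `has_burst(&[0.0019836426, 0.43878174, 0.442688, 0.0024108887], 0.1)` is true, while a slice of identical values is false, which matches the "any constant level means audio not found" behavior I described earlier.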

2024-06-26 12:52:54 [DEBUG] stderr: "[INFO] Audio: Ok({"input-multimedia": Audio { volume: 30.0, peak: 0.0019836426 }, "output-multimedia": Audio { volume: -15.0, peak: 0.0 }, "output-communications": Audio { volume: -15.0, peak: 0.0 }, "input-communications": Audio { volume: 30.0, peak: 0.0019836426 }, "input-console": Audio { volume: 30.0, peak: 0.0019836426 }, "output-console": Audio { volume: -15.0, peak: 0.0 }})"
2024-06-26 12:52:54 [DEBUG] stderr: "[INFO] Audio: Ok({"input-communications": Audio { volume: 30.0, peak: 0.0019836426 }, "input-console": Audio { volume: 30.0, peak: 0.0019836426 }, "output-communications": Audio { volume: -15.0, peak: 0.0 }, "output-console": Audio { volume: -15.0, peak: 0.0 }, "output-multimedia": Audio { volume: -15.0, peak: 0.0 }, "input-multimedia": Audio { volume: 30.0, peak: 0.0019836426 }})"
2024-06-26 12:52:55 [DEBUG] stderr: "[INFO] Audio: Ok({"output-multimedia": Audio { volume: -15.0, peak: 0.0 }, "output-console": Audio { volume: -15.0, peak: 0.0 }, "input-console": Audio { volume: 30.0, peak: 0.0024719238 }, "output-communications": Audio { volume: -15.0, peak: 0.0 }, "input-multimedia": Audio { volume: 30.0, peak: 0.0024719238 }, "input-communications": Audio { volume: 30.0, peak: 0.0024719238 }})"
2024-06-26 12:52:55 [DEBUG] stderr: "[INFO] Audio: Ok({"input-console": Audio { volume: 30.0, peak: 0.0027770996 }, "input-multimedia": Audio { volume: 30.0, peak: 0.0027770996 }, "output-multimedia": Audio { volume: -15.0, peak: 0.0 }, "input-communications": Audio { volume: 30.0, peak: 0.0027770996 }, "output-console": Audio { volume: -15.0, peak: 0.0 }, "output-communications": Audio { volume: -15.0, peak: 0.0 }})"
2024-06-26 12:52:56 [DEBUG] stderr: "[INFO] Audio: Ok({"input-console": Audio { volume: 30.0, peak: 0.0020446777 }, "output-communications": Audio { volume: -15.0, peak: 0.0 }, "input-multimedia": Audio { volume: 30.0, peak: 0.0020446777 }, "input-communications": Audio { volume: 30.0, peak: 0.0020446777 }, "output-multimedia": Audio { volume: -15.0, peak: 0.0 }, "output-console": Audio { volume: -15.0, peak: 0.0 }})"
2024-06-26 12:52:56 [DEBUG] stderr: "[INFO] Audio: Ok({"output-console": Audio { volume: -15.0, peak: 0.0 }, "output-communications": Audio { volume: -15.0, peak: 0.0 }, "output-multimedia": Audio { volume: -15.0, peak: 0.0 }, "input-console": Audio { volume: 30.0, peak: 0.002105713 }, "input-communications": Audio { volume: 30.0, peak: 0.002105713 }, "input-multimedia": Audio { volume: 30.0, peak: 0.002105713 }})"
2024-06-26 12:52:57 [DEBUG] stderr: "[INFO] Audio: Ok({"input-console": Audio { volume: 30.0, peak: 0.002960205 }, "output-communications": Audio { volume: -15.0, peak: 0.0 }, "output-console": Audio { volume: -15.0, peak: 0.0 }, "input-multimedia": Audio { volume: 30.0, peak: 0.002960205 }, "input-communications": Audio { volume: 30.0, peak: 0.002960205 }, "output-multimedia": Audio { volume: -15.0, peak: 0.0 }})"
2024-06-26 12:52:57 [DEBUG] stderr: "[INFO] Audio: Ok({"input-console": Audio { volume: 30.0, peak: 0.0024108887 }, "output-multimedia": Audio { volume: -15.0, peak: 0.0 }, "input-multimedia": Audio { volume: 30.0, peak: 0.0024108887 }, "output-communications": Audio { volume: -15.0, peak: 0.0 }, "input-communications": Audio { volume: 30.0, peak: 0.0024108887 }, "output-console": Audio { volume: -15.0, peak: 0.0 }})"
2024-06-26 12:52:58 [DEBUG] stderr: "[INFO] Audio: Ok({"output-console": Audio { volume: -15.0, peak: 0.0 }, "input-console": Audio { volume: 30.0, peak: 0.0016479492 }, "input-multimedia": Audio { volume: 30.0, peak: 0.001739502 }, "output-multimedia": Audio { volume: -15.0, peak: 0.0 }, "input-communications": Audio { volume: 30.0, peak: 0.001739502 }, "output-communications": Audio { volume: -15.0, peak: 0.0 }})"
2024-06-26 12:52:58 [DEBUG] stderr: "[INFO] Audio: Ok({"output-multimedia": Audio { volume: -15.0, peak: 0.0 }, "input-multimedia": Audio { volume: 30.0, peak: 0.43878174 }, "output-communications": Audio { volume: -15.0, peak: 0.0 }, "input-communications": Audio { volume: 30.0, peak: 0.43878174 }, "input-console": Audio { volume: 30.0, peak: 0.43878174 }, "output-console": Audio { volume: -15.0, peak: 0.0 }})"
2024-06-26 12:52:59 [DEBUG] stderr: "[INFO] Audio: Ok({"input-multimedia": Audio { volume: 30.0, peak: 0.442688 }, "input-communications": Audio { volume: 30.0, peak: 0.442688 }, "output-console": Audio { volume: -15.0, peak: 0.0 }, "output-communications": Audio { volume: -15.0, peak: 0.0 }, "output-multimedia": Audio { volume: -15.0, peak: 0.0 }, "input-console": Audio { volume: 30.0, peak: 0.442688 }})"
2024-06-26 12:52:59 [DEBUG] stderr: "[INFO] Audio: Ok({"input-multimedia": Audio { volume: 30.0, peak: 0.43798828 }, "input-communications": Audio { volume: 30.0, peak: 0.43798828 }, "output-communications": Audio { volume: -15.0, peak: 0.0 }, "output-multimedia": Audio { volume: -15.0, peak: 0.0 }, "output-console": Audio { volume: -15.0, peak: 0.0 }, "input-console": Audio { volume: 30.0, peak: 0.43798828 }})"
2024-06-26 12:53:00 [DEBUG] stderr: "[INFO] Audio: Ok({"input-console": Audio { volume: 30.0, peak: 0.43969727 }, "output-console": Audio { volume: -15.0, peak: 0.0 }, "output-multimedia": Audio { volume: -15.0, peak: 0.0 }, "input-communications": Audio { volume: 30.0, peak: 0.43969727 }, "output-communications": Audio { volume: -15.0, peak: 0.0 }, "input-multimedia": Audio { volume: 30.0, peak: 0.43969727 }})"
2024-06-26 12:53:00 [DEBUG] stderr: "[INFO] Audio: Ok({"output-communications": Audio { volume: -15.0, peak: 0.0 }, "input-communications": Audio { volume: 30.0, peak: 0.44232178 }, "output-console": Audio { volume: -15.0, peak: 0.0 }, "output-multimedia": Audio { volume: -15.0, peak: 0.0 }, "input-console": Audio { volume: 30.0, peak: 0.44232178 }, "input-multimedia": Audio { volume: 30.0, peak: 0.44232178 }})"
2024-06-26 12:53:01 [DEBUG] stderr: "[INFO] Audio: Ok({"input-multimedia": Audio { volume: 30.0, peak: 0.4401245 }, "output-console": Audio { volume: -15.0, peak: 0.0 }, "input-console": Audio { volume: 30.0, peak: 0.4401245 }, "output-multimedia": Audio { volume: -15.0, peak: 0.0 }, "input-communications": Audio { volume: 30.0, peak: 0.4401245 }, "output-communications": Audio { volume: -15.0, peak: 0.0 }})"
2024-06-26 12:53:01 [DEBUG] stderr: "[INFO] Audio: Ok({"output-console": Audio { volume: -15.0, peak: 0.0 }, "input-console": Audio { volume: 30.0, peak: 0.45022583 }, "input-multimedia": Audio { volume: 30.0, peak: 0.45022583 }, "output-communications": Audio { volume: -15.0, peak: 0.0 }, "output-multimedia": Audio { volume: -15.0, peak: 0.0 }, "input-communications": Audio { volume: 30.0, peak: 0.45022583 }})"
2024-06-26 12:53:02 [DEBUG] stderr: "[INFO] Audio: Ok({"output-console": Audio { volume: -15.0, peak: 0.0 }, "input-multimedia": Audio { volume: 30.0, peak: 0.44198608 }, "input-communications": Audio { volume: 30.0, peak: 0.44198608 }, "input-console": Audio { volume: 30.0, peak: 0.44198608 }, "output-multimedia": Audio { volume: -15.0, peak: 0.0 }, "output-communications": Audio { volume: -15.0, peak: 0.0 }})"
2024-06-26 12:53:02 [DEBUG] stderr: "[INFO] Audio: Ok({"output-communications": Audio { volume: -15.0, peak: 0.0 }, "output-multimedia": Audio { volume: -15.0, peak: 0.0 }, "output-console": Audio { volume: -15.0, peak: 0.0 }, "input-console": Audio { volume: 30.0, peak: 0.44122314 }, "input-multimedia": Audio { volume: 30.0, peak: 0.44122314 }, "input-communications": Audio { volume: 30.0, peak: 0.44122314 }})"
2024-06-26 12:53:03 [DEBUG] stderr: "[INFO] Audio: Ok({"input-console": Audio { volume: 30.0, peak: 0.4373474 }, "input-multimedia": Audio { volume: 30.0, peak: 0.4373474 }, "output-multimedia": Audio { volume: -15.0, peak: 0.0 }, "output-console": Audio { volume: -15.0, peak: 0.0 }, "output-communications": Audio { volume: -15.0, peak: 0.0 }, "input-communications": Audio { volume: 30.0, peak: 0.4373474 }})"
2024-06-26 12:53:03 [DEBUG] stderr: "[INFO] Audio: Ok({"output-multimedia": Audio { volume: -15.0, peak: 0.0 }, "input-console": Audio { volume: 30.0, peak: 0.0024108887 }, "input-multimedia": Audio { volume: 30.0, peak: 0.0024108887 }, "input-communications": Audio { volume: 30.0, peak: 0.0024108887 }, "output-console": Audio { volume: -15.0, peak: 0.0 }, "output-communications": Audio { volume: -15.0, peak: 0.0 }})"
2024-06-26 12:53:04 [DEBUG] stderr: "[INFO] Audio: Ok({"output-communications": Audio { volume: -15.0, peak: 0.0 }, "input-multimedia": Audio { volume: 30.0, peak: 0.002532959 }, "input-console": Audio { volume: 30.0, peak: 0.002532959 }, "input-communications": Audio { volume: 30.0, peak: 0.002532959 }, "output-console": Audio { volume: -15.0, peak: 0.0 }, "output-multimedia": Audio { volume: -15.0, peak: 0.0 }})"
2024-06-26 12:53:04 [DEBUG] stderr: "[INFO] Audio: Ok({"input-multimedia": Audio { volume: 30.0, peak: 0.001739502 }, "output-communications": Audio { volume: -15.0, peak: 0.0 }, "input-communications": Audio { volume: 30.0, peak: 0.001739502 }, "input-console": Audio { volume: 30.0, peak: 0.001739502 }, "output-console": Audio { volume: -15.0, peak: 0.0 }, "output-multimedia": Audio { volume: -15.0, peak: 0.0 }})"
2024-06-26 12:53:05 [DEBUG] stderr: "[INFO] Audio: Ok({"input-multimedia": Audio { volume: 30.0, peak: 0.0018005371 }, "output-communications": Audio { volume: -15.0, peak: 0.0 }, "input-communications": Audio { volume: 30.0, peak: 0.0018005371 }, "output-console": Audio { volume: -15.0, peak: 0.0 }, "output-multimedia": Audio { volume: -15.0, peak: 0.0 }, "input-console": Audio { volume: 30.0, peak: 0.0018005371 }})"

mtkennerly commented 22 hours ago

> I wonder if volume and peak are reversed in the capture or logging

That's just what it's called in the Windows API calls I was using. "Volume" is the system volume slider configuration, and "peak value" is the loudness of the current playing audio. It threw me off at first too.
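To make the two fields concrete: `GetMasterVolumeLevel` reports the slider setting in decibels, so a value like -15.0 is an attenuation, not a position on a linear scale. Here's a small sketch of the arithmetic. It assumes the peak field is a normalized 0.0 to 1.0 meter reading, and the helper names `db_to_linear` and `peak_to_dbfs` are made up for illustration; they aren't part of the test program.

```rust
/// Convert a volume setting in decibels to a linear amplitude factor.
/// 0 dB maps to 1.0 and every -20 dB divides the amplitude by 10.
fn db_to_linear(db: f32) -> f32 {
    10f32.powf(db / 20.0)
}

/// Convert a normalized peak reading (assumed range 0.0..=1.0) to decibels
/// relative to full scale. Silence has no finite dB value, so it returns None.
fn peak_to_dbfs(peak: f32) -> Option<f32> {
    if peak > 0.0 {
        Some(20.0 * peak.log10())
    } else {
        None
    }
}
```

With these, an output volume of -15.0 corresponds to a factor of roughly 0.18, and an input peak of 0.44 is about -7 dB below full scale, so the two numbers describe different things and aren't expected to track each other.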

> From testing, my impression is that the problem is specific to the Windows Multimedia winmm API. If I'm reading correctly, your shawl-audio test code is using the Windows Core Audio API, and appears to work (more below).

Gotcha, at least we've ruled out it being an issue with all audio in general :+1:

I'm not familiar with the WinMM API and haven't found a good guide so far. Do you know which specific API calls the audio capture program is using to check the audio level? If you do, I can try adding the same calls in the shawl-child test.

bluebrook-sean commented 9 hours ago

The API calls use mciSendStringA (the A suffix denotes the ANSI variant), which is documented here:

https://learn.microsoft.com/en-us/previous-versions/dd757161(v=vs.85)

And the Windows Multimedia commands that can be sent via mciSendStringA are documented here:

https://learn.microsoft.com/en-us/windows/win32/multimedia/multimedia-command-strings?redirectedfrom=MSDN

The series of commands sent via mciSendStringA appears to be: ("arbitrary" = user specified name for an opened instance)

open new type waveaudio alias arbitrary
set arbitrary bitspersample 16 channels 1 alignment 2 samplespersec 22050 format tag pcm wait
status arbitrary level
close arbitrary

The first two initialize, then status can be periodically sampled with the instance left open, and close on program exit.

The full code also runs additional mciSendStringA commands to select a specific audio input, and to save audio to file. The step of selecting an audio input is optional, and uses a default Windows selection if not specified, which seems fine for testing.
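In case it helps with adding this to the shawl-child test, the sequence above might be driven from Rust roughly like this. This is an untested sketch, not the archival program's code: the `extern` declaration is transcribed from the documented `mciSendStringA` signature, the helper names (`mci_commands`, `mci_send`, `read_level_once`) and the 128-byte reply buffer are assumptions, and the Windows-only parts are gated with `#[cfg(windows)]`.

```rust
/// Build the four MCI command strings listed above for a given alias.
fn mci_commands(alias: &str) -> [String; 4] {
    [
        format!("open new type waveaudio alias {}", alias),
        format!(
            "set {} bitspersample 16 channels 1 alignment 2 samplespersec 22050 format tag pcm wait",
            alias
        ),
        format!("status {} level", alias),
        format!("close {}", alias),
    ]
}

// Declaration transcribed from the Windows documentation (winmm.dll).
// Note: the Rust 2024 edition would require writing `unsafe extern` here.
#[cfg(windows)]
#[link(name = "winmm")]
extern "system" {
    fn mciSendStringA(
        command: *const std::ffi::c_char,
        return_string: *mut std::ffi::c_char,
        return_length: u32,
        callback: *mut std::ffi::c_void,
    ) -> u32;
}

/// Send one command; returns the reply text, or the nonzero MCI error code.
#[cfg(windows)]
fn mci_send(command: &str) -> Result<String, u32> {
    let command = std::ffi::CString::new(command).expect("command contains NUL");
    let mut reply = [0u8; 128]; // arbitrary size, plenty for a level value
    let code = unsafe {
        mciSendStringA(
            command.as_ptr(),
            reply.as_mut_ptr().cast(),
            reply.len() as u32,
            std::ptr::null_mut(),
        )
    };
    if code != 0 {
        return Err(code);
    }
    // The reply is NUL-terminated; keep only the bytes before the terminator.
    let end = reply.iter().position(|&b| b == 0).unwrap_or(reply.len());
    Ok(String::from_utf8_lossy(&reply[..end]).into_owned())
}

/// Open the default input, configure it, read the level once, then close.
#[cfg(windows)]
fn read_level_once(alias: &str) -> Result<String, u32> {
    let [open, set, status, close] = mci_commands(alias);
    mci_send(&open)?;
    let level = mci_send(&set).and_then(|_| mci_send(&status));
    // Close the device even if set or status failed.
    let _ = mci_send(&close);
    level
}
```

A test loop would call `mci_send` with the status command repeatedly while the device stays open, instead of reopening it each time as `read_level_once` does.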

Aside from mciSendStringA, there is one other winmm action the code takes, which I think is working: waveInGetNumDevs

https://learn.microsoft.com/en-us/windows/win32/api/mmeapi/nf-mmeapi-waveingetnumdevs?redirectedfrom=MSDN

When diagnosing the issue originally, I believe I observed that the correct number of audio inputs was detected, but auto-detection of the correct audio input was failing. (Detection works by having the calibration process command a particular input to generate a pattern of white noise, and then recognizing that pattern by sampling audio levels on all audio inputs.) This is why I had interpreted this as a possible permissions issue: my understanding of Windows 11 privacy is that when software does not have permission to access a microphone, the audio input appears to exist but its audio levels are silent. For testing winmm, waveInGetNumDevs makes an interesting contrast, because getting an accurate count of audio devices implies that winmm itself is accessible and responding, so the problem would be deeper in the audio system. If winmm was deprecated by Microsoft, I wonder if it may not be interfacing correctly with modified microphone permission logic in Windows 11?
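For the device-count side of that contrast, a test program could log the count next to the level readings. A minimal sketch, assuming the documented `waveInGetNumDevs` signature (no arguments, returns an unsigned count); the wrapper name `wave_input_count` is hypothetical, and it returns `None` when not built for Windows.

```rust
// Declaration transcribed from the Windows documentation (winmm.dll).
#[cfg(windows)]
#[link(name = "winmm")]
extern "system" {
    fn waveInGetNumDevs() -> u32;
}

/// Number of waveform-audio input devices that winmm reports.
#[cfg(windows)]
fn wave_input_count() -> Option<u32> {
    // The call takes no arguments and has no documented failure mode
    // other than returning zero devices.
    Some(unsafe { waveInGetNumDevs() })
}

/// Fallback so the sketch still compiles on other platforms.
#[cfg(not(windows))]
fn wave_input_count() -> Option<u32> {
    None
}
```

If this reports the expected count in service mode while the status level stays constant, that would support the idea that winmm is reachable and the blockage is further down.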

mtkennerly / shawl

Possible issue with accessing audio inputs under Windows 11 Pro and Enterprise #50