libsdl-org / SDL

Simple Directmedia Layer
https://libsdl.org
zlib License

Audio Capture from source with 7 channels #4458

Closed hanseuljun closed 2 years ago

hanseuljun commented 3 years ago

Unfortunately, Azure Kinect, a device from Microsoft, has a 7-channel microphone array, and SDL does not support that.

Inside SDL_BuildAudioCVT() in audio/SDL_audiocvt.c, it seems that conversions between 1, 2, 4, and 8 channels are supported, but there is no support for 7, which appears to be why SDL does not support 7 channels.

Would there be a way I can still use SDL for this case with 7 channels? I don't need the conversion between audio channels as having 7 channels would be perfectly fine for me.
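To illustrate what "no conversion" would mean on the application side, here is a minimal sketch of pulling a single microphone out of an interleaved 7-channel capture buffer. It assumes signed 16-bit samples and frame-interleaved data; `MIC_CHANNELS` and `extract_channel` are hypothetical names for this example and are not part of SDL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIC_CHANNELS 7  /* channel count reported for Azure Kinect in this thread */

/* Copy one channel out of an interleaved buffer.
 * ASSUMPTION: samples are signed 16-bit and frame-interleaved, so frame i
 * occupies interleaved[i * MIC_CHANNELS] .. interleaved[i * MIC_CHANNELS + 6].
 * `out` must have room for `frames` samples. */
static void extract_channel(const int16_t *interleaved, size_t frames,
                            int channel, int16_t *out)
{
    assert(interleaved != NULL && out != NULL);
    assert(channel >= 0 && channel < MIC_CHANNELS);

    for (size_t i = 0; i < frames; ++i) {
        out[i] = interleaved[i * MIC_CHANNELS + channel];
    }
}
```

With this kind of access, the application never needs the channels to be mapped to speaker positions; it only needs the raw 7-channel buffer delivered intact.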

Or maybe I should try implementing the conversion to and from 7 channels? I'm no audio expert (very afraid of doing the math wrong here) but in case this really is what I should do, I can give it a try...

icculus commented 3 years ago

So 7 channels is kinda weird; the reason we handle 6 and 8 is because they are well-defined in space (left, right, front, back, etc), but if this is a special case just for this one microphone, then we need to figure out what each channel represents and how to map it to something meaningful.

(I don't think we've ever seriously tested with a stereo microphone, let alone a surround-sound input, fwiw.)

hanseuljun commented 3 years ago

That sounds like this case is too specific for SDL. I guess I should add another library/dependency for audio to my project...

slouken commented 3 years ago

It's not too specific, we just don't have the hardware for testing. If you know what the 7 channel mapping is, we can expose it. Are there docs for it?

slouken commented 3 years ago

@icculus, is there a reason we don't just pass back 7 channels? Why would we need to do any conversion if the audio device open call allows returning the hardware audio format?

icculus commented 3 years ago

SDL doesn't currently have a concept of 7 channels, so it's possible this could go wrong in 20 different ways.

Is there some documentation on what each of these 7 channels represents? Is it the same as a 7.1 config without the subwoofer channel? If that's the case, the conversion is simple.
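As a rough sketch of why that hypothetical case would be simple: if the 7 channels really were a 7.1 layout minus the subwoofer, converting to 8 channels would only require inserting a silent LFE sample into each frame. The channel order below (FL FR FC LFE BL BR SL SR for 8 channels) and the function name `pad_7_to_8` are assumptions for illustration, not SDL's actual converter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand 7-channel frames to 8-channel frames by inserting a silent LFE.
 * ASSUMPTION: input order is  FL FR FC     BL BR SL SR  (7.1 without LFE)
 *             output order is FL FR FC LFE BL BR SL SR
 * `dst` must have room for frames * 8 samples. */
static void pad_7_to_8(const int16_t *src, int16_t *dst, size_t frames)
{
    assert(src != NULL && dst != NULL);

    for (size_t i = 0; i < frames; ++i) {
        const int16_t *in = src + i * 7;
        int16_t *out = dst + i * 8;

        out[0] = in[0];  /* FL */
        out[1] = in[1];  /* FR */
        out[2] = in[2];  /* FC */
        out[3] = 0;      /* LFE: no source channel, so silence */
        out[4] = in[3];  /* BL */
        out[5] = in[4];  /* BR */
        out[6] = in[5];  /* SL */
        out[7] = in[6];  /* SR */
    }
}
```

This only holds if the device's channels actually correspond to speaker positions, which is exactly what the question is asking.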

hanseuljun commented 3 years ago

That would be more than wonderful! The 7-channel mapping of Azure Kinect is shown on the right side of the first figure in the article below: https://docs.microsoft.com/en-us/azure/Kinect-dk/hardware-specification

Mic #0: Center (towards the ceiling)
Mic #1: Front
Mic #2: Front-Right
Mic #3: Rear-Right
Mic #4: Rear
Mic #5: Rear-Left
Mic #6: Front-Left

The six microphones other than Mic #0 form a circle, spaced 60 degrees apart as seen from the center of the circle, which is where Mic #0 sits.

And, of course, if this is an odd case, exposing the channels without conversion would work wonderfully too. Also, I really appreciate SDL, and not adding this will not change my mind!
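The layout described above can be written down as a small lookup table, which is roughly the information a channel mapping would need. This is a sketch only: the `MicPosition` type, the `kinect_mics` table, and the convention of measuring azimuth clockwise from the front (as seen from above) are assumptions made for this example, inferred from the labels and the 60-degree spacing given in this comment.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *label;
    int on_ring;      /* 0 for the center mic, 1 for the six ring mics */
    int azimuth_deg;  /* ASSUMPTION: clockwise from front; unused for center */
} MicPosition;

/* Index in this table == channel index (Mic #0 .. Mic #6). */
static const MicPosition kinect_mics[7] = {
    { "Center",      0,   0 },
    { "Front",       1,   0 },
    { "Front-Right", 1,  60 },
    { "Rear-Right",  1, 120 },
    { "Rear",        1, 180 },
    { "Rear-Left",   1, 240 },
    { "Front-Left",  1, 300 },
};

/* Azimuth of a ring microphone: 60 degrees per step, starting at Mic #1. */
static int ring_azimuth_deg(int mic_index)
{
    assert(mic_index >= 1 && mic_index <= 6);
    return (mic_index - 1) * 60;
}
```

Note that this is a hexagonal ring plus a center element, so it does not line up with the usual speaker positions of a 7.1 layout.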

1bsyl commented 2 years ago

@hanseuljun Not sure if this would help, but you can try this PR https://github.com/libsdl-org/SDL/pull/4974

hanseuljun commented 2 years ago

@1bsyl Wow, this is a really pleasant surprise. Of course I will!!

slouken commented 2 years ago

Fixed in #4974, thanks!