kcat / openal-soft

OpenAL Soft is a software implementation of the OpenAL 3D audio API.

[Feature request] Spatial renderer for Microsoft Spatial Sound API #586

Open ThreeDeeJay opened 2 years ago

ThreeDeeJay commented 2 years ago

OpenAL Soft as a spatial sound option via Windows 10's built-in API could (in theory) allow:

kcat commented 2 years ago

You mean as some kind of "driver" for the Spatial Audio API, such that any app using that API will implicitly use OpenAL Soft's renderer under the hood? Is there any information about how to make one of those?

ThreeDeeJay commented 2 years ago

Exactly. And I'm not sure this is enough information to actually implement it, but it seems relevant:

Microsoft Spatial Sound and Audio Middleware
Many app and game developers use third party audio rendering engine solutions, which often include sophisticated authoring and auditioning tools. Microsoft has partnered with several of these solution providers to implement Microsoft Spatial Sound in their existing authoring environments. This will frequently mean the APIs discussed here are abstracted from the app’s view; they are wrapped as digital signal processing (DSP) plug-ins that the app can instantiate, and which the app’s audio implementer can use to mix to a Microsoft Spatial Sound channel bed, submix, or send individual voices to dynamic object instance plug-ins as desired. Consult with your audio middleware solution provider for their level of support for Microsoft Spatial Sound.

Render Spatial Sound Using Spatial Audio Objects
This article presents some simple examples that illustrate how to implement spatial sound using static spatial audio objects, dynamic spatial audio objects, and spatial audio objects that use Microsoft's Head Relative Transfer Function (HRTF). The implementation steps for all three of these techniques are very similar and this article provides a similarly structured code example for each technique. For complete end-to-end examples of real-world spatial audio implementations, see Microsoft Spatial Sound samples github repository. For an overview of Windows Sonic, Microsoft’s platform-level solution for spatial sound support on Xbox and Windows, see Spatial Sound.

What is Project Acoustics?
Project Acoustics is a wave acoustics engine for 3D interactive experiences. It models wave effects like occlusion, obstruction, portaling and reverberation effects in complex scenes without requiring manual zone markup or CPU intensive raytracing. It also includes game engine and audio middleware integration. Project Acoustics' philosophy is similar to static lighting: bake detailed physics offline to provide a physical baseline, and use a lightweight runtime with expressive design controls to meet your artistic goals for the acoustics of your virtual world.

These seem to focus on integrating the Microsoft Spatial Sound API into apps (usually via the Unity or Unreal engines) as opposed to implementing a new "Format" (like Windows Sonic/Atmos for Headphones/DTS Headphone:X), but they might offer some insight into the feasibility of an OpenAL Soft renderer. They also mention Microsoft partnering with developers, so it'd probably be a good idea to get in touch with them. I could open an issue on their GitHub repo, but I'm not experienced enough to know what exactly to ask for. Otherwise, if you feel it's not worth the effort, feel free to close this issue. 👍

kcat commented 2 years ago

Those seem to be talking about using the Spatial Audio APIs for output, not for creating something the system will use for Spatial Audio rendering.

Many app and game developers use third party audio rendering engine solutions, which often include sophisticated authoring and auditioning tools. [...] This will frequently mean the APIs discussed here are abstracted from the app’s view; they are wrapped as digital signal processing (DSP) plug-ins that the app can instantiate

This means things like Wwise or FMod. You can add them to a Unity or Unreal Engine project for audio support, or you can call their APIs directly. They would have plugins to output through to the Spatial Audio API, rather than mixing for plain WASAPI or DirectSound. In OpenAL Soft terms, this would be like adding a Spatial Audio API backend. It's not talking about making the system use those solutions as a spatial audio renderer next to Windows Sonic or Dolby Atmos.

ThreeDeeJay commented 2 years ago

Ahh I see, that's unfortunate. Guess the only way is to get help from Microsoft developers directly involved or at least knowledgeable about Microsoft Spatial Sound. I just hope they don't require some sort of special licensing or royalties for using their API. Anyhow, I don't expect this anytime soon but I'll keep an eye out if there's any relevant info or progress. In the meantime, perhaps the most practical option would be to somehow swap the impulse responses of the built-in HRTFs with a better one.

mirh commented 1 year ago

I don't think spatial audio sits separately from normal WASAPI. OK, never mind, ISAC is another thing completely. Aside from that, it's also already working with OpenAL anyway. OK, supposedly that may just be a virtual fallback.

ThreeDeeJay commented 1 year ago

Yeah, XAudio 2.9 is one of the only non-spatial APIs I know that seems to interface explicitly with MSSAPI, but only for virtual surround.

Starting with the Windows 10 1903 update, XAudio 2.9 automatically uses virtual surround sound, if certain conditions are satisfied. We recommend testing games that generate multi-channel sound on Windows 10 1903 (or newer) to verify that the game sounds as expected. [...] XAudio 2.9 will only use the user's selected spatial sound format if the process that is using the XAudio2 API is recognized as a game by the Windows Game Bar. During development, it is possible that the process is not yet recognized as a game by the Game Bar. To change this, use the Win+G keyboard short-cut to bring up the Game Bar while the game is running. Then click on the "Settings" icon and check the checkbox that says, "Remember that this is a game". https://docs.microsoft.com/en-us/windows/win32/xaudio2/xaudio2-redistributable

Which is a shame, because apparently XAudio 2 could be capable of 3D audio. I just found out that the HRTF files I mentioned in my previous message aren't just Sonic's but also the XAudio 2 HRTF XAPO's. If I rename "C:\Windows\System32\DefaultHrtfs.bin", the Spatial audio sample prints an error, and Managed XAudio2 Hrtf Test as well as the Spatial Sound tests here refuse to run.

That said, MSSAPI can also virtualize surround output from apps using other APIs. I just tested MPC-HC and its surround output can be spatialized by Sonic, and I think it uses WASAPI, so perhaps OpenAL and other APIs might be supported the same way, at least unofficially, or perhaps compatibility was eventually expanded to other APIs. I think MSSAPI manages to capture the app/game's surround sound output before it gets the chance to be downmixed, since MSSAPI requires stereo. But apparently that comes at the cost of not being able to virtualize apps/games that only output surround if they detect a surround configuration, in which case we'd be better off with some virtual surround software.

Anyhow, I emailed a few Microsoft/spatial audio developers asking for documentation and one might be able to help, so hopefully we can find some documentation that may make this FR feasible sometime in the future 🤞

hl4hck commented 1 year ago

@ThreeDeeJay Hello, I have a question regarding this old issue.

I would like to create an app similar to "DTS Headphone:X" or "Dolby Atmos for Headphones." Are there any related MSSAPI or Windows Driver samples available?

I have been searching for a few days, but I couldn't find any relevant samples. Do you have any hints on how to approach this?

ThreeDeeJay commented 1 year ago

@hl4hck There are plenty of samples to test the active spatial audio renderer like Windows Sonic, but unfortunately I still haven't found any documentation on how to actually develop a custom spatial renderer, and I didn't get a single response from any of the Microsoft Spatial Sound developers I emailed about it. 😔

mirh commented 1 year ago

I downloaded the DTS Unbound and Dolby Access packages, and right off the bat you can tell from the app manifests that they "identify" Headphone:X and Atmos for Headphones as a specific DLL that is registered as audioEncoder. After digging into the parameters, I could indeed find there's a media subtype named MFAudioFormat_Float_SpatialObjects (which was added around the time of other atrociously undocumented definitions).

After 9 hours of decompiler switch analysis in Ghidra, I'll grant that I couldn't really find anything else specific (maybe I should have focused on the Windows DLLs that implement Windows Sonic, which at least have their symbols available). But I'm very confident that the other part of the equation is implementing some kind of custom APO (DTSHPXV2Apo4xWinRTComponent.dll and Dolby Audio Processing for Microsoft Spatial Audio kinda drop you a hint). There's some scant reference to CAPx here and there, but I believe that's not a hard requirement.

So... I don't think HeSuVi is really that far from it already? The biggest difficulty should just be how to get it registered properly.. If it's actually even possible at all. Did somebody ever find any spatial format other than dolby or dts (or sonic)?

Because if you check the strings in some windows audio library, you can find that those are hardcoded/built-in out of the box:

SystemSettings_Audio_Speaker_Stereo
SystemSettings_Audio_Speaker_FiveOne
SystemSettings_Audio_Speaker_SevenOne
SystemSettings_Audio_Speaker_DolbyHeadphones
SystemSettings_Audio_Speaker_DolbyTheater
SystemSettings_Audio_Speaker_WindowsSonic
SystemSettings_Audio_Speaker_DTSHeadphone
SystemSettings_Audio_Speaker_DTSTwoSpeaker

But on the other hand, the snippets just afterwards are:

SystemSettings::DataModel::SpatialEndpointOptionHandler::SpatialEndpointOptionHandler
Updating spatial menu value
SystemSettings::DataModel::SpatialEndpointOptionHandler::Invoke
SystemSettings::DataModel::SpatialEndpointOptionHandler::GetProperty
SystemSettings::DataModel::SpatialEndpointOptionHandler::SetProperty
SystemSettings::DataModel::SpatialEndpointOptionHandler::SetSpatialTech
SystemSettings::DataModel::SpatialActionSettingWrapper::GetProperty

Yet I got my hands on a set of Razer Kraken V3 HyperSense cans yesterday, and to my dismay its THX Spatial Audio was a separate "driver operation mode" as opposed to a knob in the spatial audio window. And Razer is definitely no small indie company. By all means, it could just be that this is the more or less legacy way Razer still prefers to do things, but I'm extremely uneasy with the amount of mystery surrounding the topic.

ridingtheflow commented 12 months ago

Sounds like it's not an API that's supposed to be public. I assume it was made for explicit licensees like Dolby and DTS and wasn't really designed to be opened up to third parties.

That doesn't mean it can't still be used, but unfortunately it looks like it will be a lot of hassle - e.g. it might need a kludge to support a processing mode on some of the pre-existing hardcoded "formats"/"Surround Types" instead of properly adding its own custom format.

mirh commented 12 months ago

https://github.com/wine-mirror/wine/commit/7e64247a6ef3664a603bd9d2dd47f46d24713f9d this may help I guess? I don't think andrew can be bothered anymore with windows internals (as should I) but maybe @ivyl has a few clues..

ThreeDeeJay commented 11 months ago

That doesn't mean it can't still be used, but unfortunately it looks like it will be a lot of hassle - e.g. it might need a kludge to support a processing mode on some of the pre-existing hardcoded "formats"/"Surround Types" instead of properly adding its own custom format.

To be honest, I would not think twice before replacing Sonic, at the very least its awful HRTF.

Did somebody ever find any spatial format other than dolby or dts (or sonic)?

I've been informed about an upcoming spatial audio implementation currently being developed, but until they go public, I don't think I'm allowed to "spoil the surprise". I also sent them an email inquiring about the development details but still haven't heard back from them. 😔

ivyl commented 11 months ago

wine-mirror/wine@7e64247 this may help I guess? I don't think andrew can be bothered anymore with windows internals (as should I) but maybe @ivyl has a few clues..

Hi. This is a naive implementation of https://learn.microsoft.com/en-us/windows/win32/api/spatialaudioclient/

We don't implement the dynamic objects at all, always returning 0. Instead we only expose the part that handles static audio objects with the usual front/side/back/top/etc. positioning.

If you want to implement your own version, you would need to hook IMMDevice::Activate and replace the IID_ISpatialAudioClient. I'm not aware of any other way to register your own handler / mode for it.

I don't think anyone on the Wine projects knows anything more about this API and especially about how it works internally on Windows.
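
The interception idea described above can be modeled in a few lines. This is only a toy Python sketch of the pattern (wrap an activation function and substitute the result for one interface ID). It calls no Windows API, the string IDs are placeholder labels rather than real GUIDs, and every class and function name here is hypothetical. A real hook on IMMDevice::Activate would have to be written in native code.

```python
# Placeholder labels standing in for interface IDs; these are NOT real GUIDs.
SPATIAL_CLIENT_IID = "IID_ISpatialAudioClient"
AUDIO_CLIENT_IID = "IID_IAudioClient"


class SystemSpatialClient:
    """Stand-in for whatever the system would normally hand back."""
    renderer = "system"


class CustomSpatialClient:
    """Stand-in for a replacement renderer (hypothetical)."""
    renderer = "custom"


class PlainAudioClient:
    """Stand-in for an unrelated interface that must pass through untouched."""
    renderer = None


def original_activate(iid):
    """Toy version of an Activate call: create an interface from its ID."""
    interfaces = {
        SPATIAL_CLIENT_IID: SystemSpatialClient,
        AUDIO_CLIENT_IID: PlainAudioClient,
    }
    return interfaces[iid]()


def make_hook(original, replacements):
    """Wrap `original` so that selected IDs return a replacement object."""
    def hooked(iid):
        factory = replacements.get(iid)
        if factory is not None:
            # Intercepted: hand back our own implementation.
            return factory()
        # Everything else is forwarded to the original function unchanged.
        return original(iid)
    return hooked


activate = make_hook(original_activate, {SPATIAL_CLIENT_IID: CustomSpatialClient})

print(activate(SPATIAL_CLIENT_IID).renderer)
print(type(activate(AUDIO_CLIENT_IID)).__name__)
```

The point of the wrapper is that only the spatial client request is replaced, while every other interface request still reaches the original function.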

mirh commented 11 months ago

I see. It kinda feels like there's enough secrecy that prototyping something in Wine first could actually be the only sane way forward (also, now that I think of it, it's not like Proton shouldn't have to ship openal-soft anyway for the best experience on older games).

mirh commented 8 months ago

Is 0eeb3c29f412dd38666dc57e9bc117e3e1073d78 about deferring to the system built-in providers?

ThreeDeeJay commented 8 months ago

It seems related to using the 7.1.4 output mode https://github.com/kcat/openal-soft/commit/e9ad8571ba93dd6631a9c05a05a28ede95728d9e Maybe so that people with Dolby Atmos for Home Theater can get a 3D positional mix in games using OpenAL Soft? 🤔 Otherwise I don't see any benefit from using headphone spatializers other than OpenAL Soft's, unless for those who really like the Dolby Atmos for Headphones, DTS Headphone:X or god forbid Windows Sonic HRTFs 🤢

kcat commented 8 months ago

Somehow my entire response disappeared. Oh well.

It seems related to using the 7.1.4 output mode https://github.com/kcat/openal-soft/commit/e9ad8571ba93dd6631a9c05a05a28ede95728d9e Maybe so that people with Dolby Atmos for Home Theater can get a 3D positional mix in games using OpenAL Soft? 🤔

Essentially. As I understand it, Atmos/Auro3D/etc setups aren't normally presented to apps with the extra height speakers. A 7.1.4 setup would be seen as 7.1, 5.1.2 would be seen as 5.1, etc. You need to enable (paid/licensed?) software that lets WASAPI's spatial audio API mix the static and dynamic audio objects to all the available speakers, from what I understand.

While not ideal to rely on extra paid software (if free and open source alternatives can't be made), there is the benefit that it doesn't need decoders for every possible combination: 7.1.4, 5.1.4, 7.1.2, and 5.1.2 (and 7.1.1 and 5.1.1?). As it is, the 7.1.4 decoder is quite wonky and uneven and probably doesn't sound great, due to being a dome with no lower hemisphere, so trying to make decoders for even fewer speakers with less coverage will be more difficult. 7.1.4.4 would actually be better to feed the spatial API with since it's a better decode and the system can mix down the missing channels to best fit what the user has, but I'm not sure if that's guaranteed to be supported (I'm also not sure 7.1.4 will always work; I don't have such a system to test with myself, and I haven't gotten any feedback from anyone who has).
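
To make the "7.1.4 is seen as 7.1" point concrete, here is a small Python sketch that drops the height fields from a layout string. It assumes the usual dotted notation (ear-level speakers, then LFE, then optional height fields), and the function name `layout_without_heights` is invented for this example. It only manipulates the label and says nothing about how Windows actually reports a device.

```python
def layout_without_heights(layout: str) -> str:
    """Return the layout an app would see if the height speakers are hidden.

    `layout` uses dotted notation such as "7.1.4": ear-level speakers,
    LFE channels, then optional height fields.
    """
    fields = [int(part) for part in layout.split(".")]
    if not 2 <= len(fields) <= 4:
        raise ValueError(f"unexpected layout notation: {layout!r}")
    ear_level, lfe = fields[:2]
    # Keep only the first two fields; any height fields are discarded.
    return f"{ear_level}.{lfe}"


for name in ("7.1.4", "5.1.2", "7.1.4.4", "5.1"):
    print(name, "->", layout_without_heights(name))
```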

ThreeDeeJay commented 8 months ago

I see. So would the spatial audio API encode 7.1.4 into the 7.1 core like Dolby Digital Plus/TrueHD tracks with Atmos available on some Blu-ray discs, or would HDMI be capable of lossless discrete channels?

Audio
ID                          : 2
ID in the original source m : 4352 (0x1100)
Format                      : MLP FBA 16-ch
Format/Info                 : Meridian Lossless Packing FBA with 16-channel presentation
Commercial name             : Dolby TrueHD with Dolby Atmos
Codec ID                    : A_TRUEHD
Duration                    : 2 h 34 min
Bit rate mode               : Variable
Bit rate                    : 4 298 kb/s
Maximum bit rate            : 6 528 kb/s
Channel(s)                  : 8 channels
Channel layout              : L R C LFE Ls Rs Lb Rb
Sampling rate               : 48.0 kHz
Frame rate                  : 1 200.000 FPS (40 SPF)
Bit depth                   : 24 bits
Compression mode            : Lossless
Stream size                 : 4.64 GiB (13%)
Title                       : Surround 7.1
Language                    : English
Default                     : Yes
Forced                      : No
Original source medium      : Blu-ray
Number of dynamic objects   : 11
Bed channel count           : 1 channel
Bed channel configuration   : LFE

Oh and by the way, DTS:X for Home Theater (HDMI) supports 8.1.4.4 https://learn.microsoft.com/en-us/windows/win32/coreaudio/spatial-sound#microsoft-spatial-sound-runtime-resource-implications

support for up to 8.1.4.4 channels (8 channels around the listener – Left, Right, Center, Side Left, Side Right, Back Left, Back Right, and Back Center; 1 low frequency effects channel; 4 channels above the listener; 4 channels below the listener). https://hydrogenaud.io/index.php?topic=115236.0
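
As a quick check of the arithmetic in that quote, the following Python sketch splits a layout string into the four groups the quote names and totals them. The function name `channel_breakdown` and the dictionary keys are made up for illustration, and the field order (around, LFE, above, below) is taken from the quoted description.

```python
def channel_breakdown(layout: str) -> dict:
    """Split a dotted layout such as "8.1.4.4" into named channel groups."""
    names = ("around", "lfe", "above", "below")
    fields = [int(part) for part in layout.split(".")]
    if not 2 <= len(fields) <= len(names):
        raise ValueError(f"unexpected layout notation: {layout!r}")
    # Missing trailing fields (e.g. no "below" group) count as zero channels.
    fields += [0] * (len(names) - len(fields))
    breakdown = dict(zip(names, fields))
    breakdown["total"] = sum(fields)
    return breakdown


print(channel_breakdown("8.1.4.4"))
print(channel_breakdown("7.1.4"))
```

Under that reading, 8.1.4.4 adds up to 17 channels and 7.1.4 to 12.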

kcat commented 8 months ago

I see. So would the spatial audio API encode 7.1.4 into the 7.1 core like Dolby Digital Plus/TrueHD tracks with Atmos available on some Blu-ray discs, or would HDMI be capable of lossless discrete channels?

I don't know the technical details, but I believe the channels' samples are packed in a (compressed?) bitstream that's sent over HDMI, which the receiver can unpack and mix for the available output speakers. It's not encoded into a raw 7.1 stream for the receiver to decode, I don't believe.

mirh commented 8 months ago

You are 100% right (and indeed how else could it be?) https://help.ea.com/in/help/star-wars/star-wars-battlefront/troubleshoot-dolby-atmos-sound-in-star-wars-battlefront1/

junh1024 commented 1 week ago

3DJ & mirh, The way DA works on BD vs gaming is different. I think this is the way it works: