ValveSoftware / steam-audio

Steam Audio
https://valvesoftware.github.io/steam-audio/
Apache License 2.0

How does Steam Audio work in Ambisonic decode functions #267

Open WHUfreeway opened 1 year ago

WHUfreeway commented 1 year ago

I found that most articles about Ambisonic decoding do not convolve the Ambisonic sound source directly with the HRTF. Instead, they use an intermediate multi-speaker array: they first decode the Ambisonic signal to a multi-speaker layout and then apply the HRTF to that layout. I don't know how Steam Audio combines an Ambisonic sound source with an HRTF to produce binaural output, so I'm quite interested in the internal implementation of Steam Audio's Ambisonic decoding function. For example, when using the C API, the HRTF and the speaker layout are given at the same time.

In ambiqual 2022, they use a 26-point Lebedev Quadrature layout to decode 3rd-order Ambisonics:

3.4.3. Conditions

Ambisonic B-format content audio signals were encoded using the Opus 1.2 codec with channel mapping family 2 implementation [11] at a variety of bit rates to produce a range of conditions. These signals were then rendered to a binaural format for headphone presentation using a generic head related transfer function (HRTF). For FOA examples, a Neumann KU 100 binaural dummy head (SADIE subject 2) with a cube layout was used. For 3OA examples, the same KU 100 binaural dummy head using a 26-point Lebedev Quadrature layout was used with the angles presented in Table 2. The layout and post-processing procedure followed for both 1st and 3rd order HRTFs are described on the SADIE project website (https://www.york.ac.uk/sadie-project/GoogleVRSADIE.html). Head symmetry optimization (assuming the ears are reverse-identical filters) was applied by inverting the L/R HRTFs around the head to save on computation. Negligible differences were found between the original HRTFs and these symmetrical versions using this technique as outlined in [29].

Rendered audio content signals (i.e., test samples) were created for a range of conditions. The original uncompressed audio, 3OA, serves as both the “Reference” condition and the hidden reference for this MUSHRA test. Third-order Ambisonics (3OA) audio encoded with 512 and 256 kbps serve as conditions 3OA512 and 3OA256, respectively. First-order Ambisonics audio (FOA) encoded at 128 and 64 kbps serve as conditions FOA128 and FOA64, respectively. Finally, condition FOA32 was used as the hidden anchor for testing and represents first-order Ambisonic audio encoded at 32 kbps. Details of encoding schemes and bit rates used in Experiment 1 can be found in Table 3.

(Appl. Sci. 2020, 10, 3188, page 7 of 21)

lakulish commented 1 year ago

Steam Audio uses a 24-point spherical t-design to decode Ambisonic signals into a virtual speaker layout. The virtual speaker signals are then either panned to the user's speaker layout (if the binaural parameter is IPL_FALSE) or rendered with HRTF (if binaural is IPL_TRUE). We also pre-compute the speaker matrices / HRTFs, so that instead of applying, e.g., 24 HRTFs, we only apply as many HRTFs as there are Ambisonic channels. This is just an optimization, though.
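
For anyone wondering why that pre-computation gives the same result, here is a minimal sketch of the idea. It is not Steam Audio's code. Everything in it is an assumption made for illustration: the function names are hypothetical, the virtual layout is a 6-speaker octahedron rather than the 24-point t-design, the decoder is a simple first-order sampling decoder (ACN channel order, SN3D normalization assumed), the HRIRs are random placeholder filters, and only one ear is rendered. Because decoding and convolution are both linear, folding the decode matrix into the HRIRs ahead of time yields one filter per Ambisonic channel instead of one per virtual speaker.

```python
import random

# Hypothetical virtual speaker layout: 6 speakers on the coordinate axes.
# (Steam Audio is described as using a 24-point spherical t-design instead.)
SPEAKER_DIRS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def decode_matrix(dirs):
    """Illustrative first-order sampling decoder, one row per speaker.

    Columns follow ACN order (W, Y, Z, X); the gains assume SN3D input.
    """
    n = len(dirs)
    return [[1.0 / n, 3.0 * y / n, 3.0 * z / n, 3.0 * x / n] for (x, y, z) in dirs]


def convolve(signal, taps):
    """Plain time-domain convolution."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(taps):
            out[i + j] += s * h
    return out


def render_via_speakers(amb, matrix, hrirs):
    """Decode to virtual speakers, then apply one HRIR per speaker."""
    n_samples = len(amb[0])
    ear = [0.0] * (n_samples + len(hrirs[0]) - 1)
    for row, hrir in zip(matrix, hrirs):
        feed = [sum(g * ch[t] for g, ch in zip(row, amb)) for t in range(n_samples)]
        ear = [a + b for a, b in zip(ear, convolve(feed, hrir))]
    return ear


def combine_filters(matrix, hrirs):
    """Pre-compute one filter per Ambisonic channel (decode matrix folded into HRIRs)."""
    n_channels = len(matrix[0])
    n_taps = len(hrirs[0])
    return [
        [sum(matrix[s][c] * hrirs[s][k] for s in range(len(matrix))) for k in range(n_taps)]
        for c in range(n_channels)
    ]


def render_direct(amb, filters):
    """Apply the pre-computed per-channel filters directly to the Ambisonic signal."""
    ear = [0.0] * (len(amb[0]) + len(filters[0]) - 1)
    for channel, taps in zip(amb, filters):
        ear = [a + b for a, b in zip(ear, convolve(channel, taps))]
    return ear


rng = random.Random(0)
hrirs = [[rng.uniform(-1, 1) for _ in range(8)] for _ in SPEAKER_DIRS]  # placeholder HRIRs
amb = [[rng.uniform(-1, 1) for _ in range(32)] for _ in range(4)]  # 4-channel FOA test signal

matrix = decode_matrix(SPEAKER_DIRS)
filters = combine_filters(matrix, hrirs)
via_speakers = render_via_speakers(amb, matrix, hrirs)
direct = render_direct(amb, filters)
max_diff = max(abs(a - b) for a, b in zip(via_speakers, direct))
```

In this sketch the two paths agree up to floating-point rounding, while the direct path runs 4 convolutions per ear instead of 6. With a 24-speaker layout and higher orders, the same trade-off is between 24 convolutions and one per Ambisonic channel.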