ValveSoftware / steam-audio

Steam Audio
https://valvesoftware.github.io/steam-audio/
Apache License 2.0

[C API]fail when using iplAmbisonicsDecodeEffectApply #261

Closed WHUfreeway closed 11 months ago

WHUfreeway commented 1 year ago

Test version: Steam Audio 4.2.0, Windows x64

Code:

```
#include <algorithm>
#include <cmath>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
#include <phonon.h>
#include "utils.h"

std::vector<float> load_input_audio(const std::string filename)
{
    std::ifstream file(filename.c_str(), std::ios::binary);

    file.seekg(0, std::ios::end);
    auto filesize = file.tellg();
    auto numsamples = static_cast<int>(filesize / sizeof(float));

    std::vector<float> inputaudio(numsamples);
    file.seekg(0, std::ios::beg);
    file.read(reinterpret_cast<char*>(inputaudio.data()), filesize);

    return inputaudio;
}

void save_output_audio(const std::string filename, std::vector<float> outputaudio)
{
    std::ofstream file(filename.c_str(), std::ios::binary);
    file.write(reinterpret_cast<char*>(outputaudio.data()), outputaudio.size() * sizeof(float));
}

int main()
{
    auto inputaudio = load_input_audio("bird.raw");

    // ****************** fixed boilerplate
    IPLContextSettings contextSettings{};
    contextSettings.version = STEAMAUDIO_VERSION;

    IPLContext context{};
    iplContextCreate(&contextSettings, &context);
    // ****************** fixed boilerplate

    auto const samplingrate = 44100;
    auto const framesize = 512;
    IPLAudioSettings audioSettings{ samplingrate, framesize };

    IPLHRTFSettings hrtfSettings;
    //hrtfSettings.type = IPL_HRTFTYPE_DEFAULT;
    hrtfSettings.type = IPL_HRTFTYPE_SOFA;
    hrtfSettings.sofaFileName = "H20_48K_24bit_256tap_FIR_SOFA.sofa";

    //IPLHRTF hrtf{};
    IPLHRTF hrtf = nullptr;
    IPLerror hrtferr = iplHRTFCreate(context, &audioSettings, &hrtfSettings, &hrtf);

    IPLAmbisonicsEncodeEffectSettings effectSettings{};
    effectSettings.maxOrder = 3; // 2nd order Ambisonics (9 channels)

    IPLAmbisonicsEncodeEffect effect = nullptr;
    iplAmbisonicsEncodeEffectCreate(context, &audioSettings, &effectSettings, &effect);

    std::vector<float> outputaudioframe(9 * framesize);
    std::vector<float> outputaudio;

    auto numframes = static_cast<int>(inputaudio.size() / framesize);
    float* inData[] = { inputaudio.data() };
    IPLAudioBuffer inBuffer{ 1, audioSettings.frameSize, inData }; // { numChannels, numSamples, inData}

    IPLAudioBuffer outBuffer;
    iplAudioBufferAllocate(context, 9, audioSettings.frameSize, &outBuffer);

    for (auto i = 0; i < numframes; ++i)
    {
        IPLAmbisonicsEncodeEffectParams params{};
        params.direction = IPLVector3{ 1.0f, 0.0f, 0.0f };
        params.order = 2;

        IPLAudioEffectState ret = iplAmbisonicsEncodeEffectApply(effect, &params, &inBuffer, &outBuffer);

        iplAudioBufferInterleave(context, &outBuffer, outputaudioframe.data());
        std::copy(std::begin(outputaudioframe), std::end(outputaudioframe), std::back_inserter(outputaudio));

        inData[0] += audioSettings.frameSize;
    }

    save_output_audio("Ambisonicoutputaudio.raw", outputaudio);
    iplAudioBufferFree(context, &outBuffer);

    //************************************* decode
    std::vector<float> doutputaudioframe(2 * framesize);
    std::vector<float> doutputaudio;

    float* dinData[] = { outputaudio.data() };
    IPLAudioBuffer dinBuffer{ 9, audioSettings.frameSize, dinData }; // { numChannels, numSamples, inData}

    IPLAudioBuffer doutBuffer;
    iplAudioBufferAllocate(context, 2, audioSettings.frameSize, &doutBuffer);

    IPLAmbisonicsDecodeEffectSettings deffectSettings{};
    deffectSettings.maxOrder = 3;
    deffectSettings.hrtf = hrtf;

    IPLAmbisonicsDecodeEffect deffect = nullptr;
    IPLerror err = iplAmbisonicsDecodeEffectCreate(context, &audioSettings, &deffectSettings, &deffect);

    IPLCoordinateSpace3 listenerCoordinates = IPLCoordinateSpace3{
        IPLVector3{ 1.0f, 0.0f, 0.0f },
        IPLVector3{ 0.0f, 1.0f, 0.0f },
        IPLVector3{ 1.0f, 0.0f, 0.0f },
        IPLVector3{ 1.0f, 0.0f, 0.0f } }; // the listener's coordinate system

    for (auto i = 0; i < numframes; ++i)
    {
        IPLAmbisonicsDecodeEffectParams dparams{};
        dparams.order = 2;
        dparams.hrtf = hrtf;
        dparams.orientation = listenerCoordinates;
        dparams.binaural = IPL_TRUE;

        IPLAudioEffectState ret2 = iplAmbisonicsDecodeEffectApply(deffect, &dparams, &dinBuffer, &doutBuffer);

        iplAudioBufferInterleave(context, &doutBuffer, doutputaudioframe.data());
        std::copy(std::begin(doutputaudioframe), std::end(doutputaudioframe), std::back_inserter(doutputaudio));

        dinData[0] += audioSettings.frameSize;
    }

    iplAudioBufferFree(context, &doutBuffer);
    iplHRTFRelease(&hrtf);
    iplContextRelease(&context);

    for (auto i = 0; i < 30000; ++i)
    {
        if (std::isnan(outputaudio[i]))
        {
            std::cout << "value is NaN" << std::endl;
        }
    }

    save_output_audio("outputaudioA.raw", outputaudio);
    return 0;
}
```

[image] This is the intermediate file produced when generating the 2nd-order Ambisonics.

The error occurs when calling IPLerror err = iplAmbisonicsDecodeEffectCreate(context, &audioSettings, &deffectSettings, &deffect);

Exception thrown at 0x00007FFE0D4FEA07 (phonon.dll) in call_phonon.exe: 0xC0000005: Access violation reading location 0xFFFFFFFFFFFFFFFF.

This error has cost me a lot of time, so I would appreciate some help here.

WHUfreeway commented 1 year ago

I changed the sound source to a 48000 Hz sample rate with float32 bit depth, and the Ambisonics encode function returned correct sound, but the decode function still doesn't work.

lakulish commented 1 year ago

Looking at the code above, it looks like the input buffer passed to the Ambisonics decode effect (dinBuffer) points to the interleaved buffer containing the (interleaved) output of the Ambisonics encode effect. However, all Steam Audio effects require deinterleaved input. Can you try one of the following:

  1. Deinterleave the contents of outputaudio, using iplAudioBufferDeinterleave, before passing it to iplAmbisonicsDecodeEffectApply, or
  2. Combine the two loops: in the loop where you call iplAmbisonicsEncodeEffectApply, also call iplAmbisonicsDecodeEffectApply to obtain a single frame of output from the Ambisonics decode effect, then accumulate it into doutputaudio, and save both outputaudio and doutputaudio at the end of the loop.

Let us know if neither of the above helps, or if you have any other questions.
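
To illustrate the difference between the two layouts, here is a minimal sketch in plain C++ of what deinterleaving means. It is a stand-in written for this explanation, not the Steam Audio implementation: the function name deinterleave and its use of std::vector are assumptions, and the real code would call iplAudioBufferDeinterleave instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Splits an interleaved buffer (c0s0 c1s0 c0s1 c1s1 ...) into one contiguous
// array per channel (c0s0 c0s1 ... / c1s0 c1s1 ...).
// Hypothetical helper for illustration only; not part of Steam Audio.
std::vector<std::vector<float>> deinterleave(const std::vector<float>& interleaved,
                                             std::size_t numChannels)
{
    assert(numChannels > 0 && interleaved.size() % numChannels == 0);
    const std::size_t numSamples = interleaved.size() / numChannels;

    std::vector<std::vector<float>> channels(numChannels,
                                             std::vector<float>(numSamples));
    for (std::size_t s = 0; s < numSamples; ++s)
        for (std::size_t c = 0; c < numChannels; ++c)
            channels[c][s] = interleaved[s * numChannels + c];

    return channels;
}
```

In the code above, outputaudio holds 9-channel interleaved frames, so each frame would need this kind of per-channel split before being handed to the decode effect.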

WHUfreeway commented 1 year ago

Thank you! It truly solved my problem, but there are some new ones.

  1. I combined iplAmbisonicsEncodeEffectApply with iplAmbisonicsDecodeEffectApply. It generates an output, but the output audio may have a small problem: the left and right channels sound the same.
  2. I use IPLAudioEffectState ret3 = iplAmbisonicsDecodeEffectApply(deffect, &dparams, &outBuffer, &doutBuffer); to check whether the function ran correctly, and the return value is IPL_AUDIOEFFECTSTATE_TAILREMAINING(0). [image] [image]
My modified code is here:

```
#include <algorithm>
#include <cmath>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
#include <phonon.h>
#include "utils.h"

int main()
{
    auto inputaudio = load_input_audio("mono.raw");

    // ****************** fixed boilerplate
    IPLContextSettings contextSettings{};
    contextSettings.version = STEAMAUDIO_VERSION;

    IPLContext context{};
    iplContextCreate(&contextSettings, &context);
    // ****************** fixed boilerplate

    auto const samplingrate = 48000;
    auto const framesize = 512;
    IPLAudioSettings audioSettings{ samplingrate, framesize };

    IPLHRTFSettings hrtfSettings;
    hrtfSettings.type = IPL_HRTFTYPE_DEFAULT;
    /*
    hrtfSettings.type = IPL_HRTFTYPE_SOFA;
    hrtfSettings.sofaFileName = "H20_48K_24bit_256tap_FIR_SOFA.sofa";
    */

    //IPLHRTF hrtf{};
    IPLHRTF hrtf = nullptr;
    IPLerror hrtferr = iplHRTFCreate(context, &audioSettings, &hrtfSettings, &hrtf);

    IPLAmbisonicsEncodeEffectSettings effectSettings{};
    effectSettings.maxOrder = 3; // 2nd order Ambisonics (9 channels)

    IPLAmbisonicsEncodeEffect effect = nullptr;
    iplAmbisonicsEncodeEffectCreate(context, &audioSettings, &effectSettings, &effect);

    std::vector<float> outputaudioframe(2 * framesize);
    std::vector<float> outputaudio;

    auto numframes = static_cast<int>(inputaudio.size() / framesize);
    float* inData[] = { inputaudio.data() };
    IPLAudioBuffer inBuffer{ 1, audioSettings.frameSize, inData }; // { numChannels, numSamples, inData}

    IPLAudioBuffer outBuffer;
    iplAudioBufferAllocate(context, 16, audioSettings.frameSize, &outBuffer);

    std::vector<float> doutputaudioframe(2 * framesize);
    std::vector<float> doutputaudio;

    float* dinData[] = { outputaudio.data() };
    IPLAudioBuffer dinBuffer{ 9, audioSettings.frameSize, dinData }; // { numChannels, numSamples, inData}

    IPLAudioBuffer doutBuffer;
    iplAudioBufferAllocate(context, 2, audioSettings.frameSize, &doutBuffer);

    IPLAmbisonicsDecodeEffectSettings deffectSettings{};
    deffectSettings.maxOrder = 3;
    deffectSettings.hrtf = hrtf;

    IPLAmbisonicsDecodeEffect deffect = nullptr;
    IPLerror err = iplAmbisonicsDecodeEffectCreate(context, &audioSettings, &deffectSettings, &deffect);

    IPLCoordinateSpace3 listenerCoordinates = IPLCoordinateSpace3{
        IPLVector3{ 1.0f, 0.0f, 0.0f },
        IPLVector3{ 0.0f, 1.0f, 0.0f },
        IPLVector3{ 0.0f, 0.0f, 1.0f },
        IPLVector3{ 1.0f, 0.0f, 0.0f } }; // the listener's coordinate system

    for (auto i = 0; i < numframes; ++i)
    {
        IPLAmbisonicsEncodeEffectParams params{};
        params.direction = IPLVector3{ 5.0f, 5.0f, 5.0f };
        params.order = 3;

        IPLAudioEffectState ret = iplAmbisonicsEncodeEffectApply(effect, &params, &inBuffer, &outBuffer);

        IPLAmbisonicsDecodeEffectParams dparams{};
        dparams.order = 3;
        dparams.hrtf = hrtf;
        dparams.orientation.right = IPLVector3{ 1.0f, 0.0f, 0.0f };
        dparams.orientation.up = IPLVector3{ 0.0f, 1.0f, 0.0f };
        dparams.orientation.ahead = IPLVector3{ 0.0f, 0.0f, 1.0f };
        dparams.binaural = IPL_TRUE;

        IPLAudioEffectState ret3 = iplAmbisonicsDecodeEffectApply(deffect, &dparams, &outBuffer, &doutBuffer);

        iplAudioBufferInterleave(context, &doutBuffer, outputaudioframe.data());
        std::copy(std::begin(outputaudioframe), std::end(outputaudioframe), std::back_inserter(outputaudio));

        inData[0] += audioSettings.frameSize;
    }

    save_output_audio("Ambisonicoutputaudio001.raw", outputaudio);

    iplAudioBufferFree(context, &doutBuffer);
    iplHRTFRelease(&hrtf);
    iplContextRelease(&context);

    for (auto i = 0; i < 30000; ++i)
    {
        if (std::isnan(outputaudio[i]))
        {
            std::cout << "value is NaN" << std::endl;
        }
    }

    save_output_audio("outputaudioA.raw", outputaudio);
    return 0;
}
```

lakulish commented 1 year ago

I think the Ambisonics decode effect needs to be configured with the correct speaker layout, otherwise it defaults to panning to a single channel. Just add the line:

deffectSettings.speakerLayout.type = IPL_SPEAKERLAYOUTTYPE_STEREO;

before calling iplAmbisonicsDecodeEffectCreate. Let me know if the issue persists after this change.

WHUfreeway commented 1 year ago

Thanks for your suggestion. The output audio looks different, but iplAmbisonicsDecodeEffectApply still returns IPL_AUDIOEFFECTSTATE_TAILREMAINING. [image]

Here is the output audio: [image]

After -60 dB: [image]

I think it is audio that carries spatial information.

lakulish commented 1 year ago

The volume level of the output is definitely not correct. Can you try changing the following line:

params.direction = IPLVector3{ 5.0f, 5.0f, 5.0f };

to a normalized direction vector:

params.direction = IPLVector3{ 1.0f / sqrtf(3.0f), 1.0f / sqrtf(3.0f), 1.0f / sqrtf(3.0f) };

Let me know if this helps.
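
For an arbitrary direction, the normalization can be computed rather than written out by hand. The following is a minimal sketch in plain C++; Vec3 and normalized are hypothetical stand-ins introduced here for illustration (Vec3 mirrors the three float fields of IPLVector3), not Steam Audio functions.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for IPLVector3, so the sketch compiles without phonon.h.
struct Vec3 { float x, y, z; };

// Returns v scaled to unit length. A zero-length vector has no direction,
// so it is rejected by the assertion.
Vec3 normalized(Vec3 v)
{
    const float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    assert(len > 0.0f);
    return Vec3{ v.x / len, v.y / len, v.z / len };
}
```

Applied to { 5.0f, 5.0f, 5.0f }, this yields the same components as the line above, each approximately 0.577.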

As for the IPL_AUDIOEFFECTSTATE_TAILREMAINING return value, this is normal and not a cause for concern.

WHUfreeway commented 1 year ago

Thank you, the program seems to be working fine. I have one more question: does Steam Audio only accept float32 input audio, or does it support other bit depths such as PCM16?

lakulish commented 1 year ago

Steam Audio only accepts float32 audio buffers. Conversion from other sample formats (like int16) is left to the caller.
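
As a rough illustration of the caller-side conversion, here is a minimal sketch in plain C++ that maps signed 16-bit PCM samples to float32 in the range [-1.0, 1.0). The function name pcm16ToFloat is an assumption made for this example, and dividing by 32768 is one common convention rather than something Steam Audio prescribes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Converts signed 16-bit PCM samples to float32 by dividing by 32768,
// so -32768 maps to -1.0f and 32767 maps to just under 1.0f.
// Hypothetical helper for illustration only; not part of Steam Audio.
std::vector<float> pcm16ToFloat(const std::vector<std::int16_t>& pcm)
{
    std::vector<float> out;
    out.reserve(pcm.size());
    for (std::int16_t sample : pcm)
        out.push_back(static_cast<float>(sample) / 32768.0f);

    assert(out.size() == pcm.size());
    return out;
}
```

The resulting vector can then be used wherever the code above uses the float samples read from the .raw file.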

By the way, since normalizing the direction vector passed to iplAmbisonicsEncodeEffectApply caused the output volume level to change, we will treat that as a bug and fix it in the next release.

WHUfreeway commented 1 year ago

Sure. This issue occurred when using a custom HRTF; after setting float volume = 1.0f in IPLHRTFSettings, it works great. I think the default value may be wrongly set to 1000.0f or some similarly large number.
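
One possible explanation, which is an assumption and was not confirmed in this thread: the code above declares IPLHRTFSettings hrtfSettings; without an initializer, and in C++ that leaves the members of a plain struct with indeterminate values rather than a library default. The sketch below shows the safer pattern with a hypothetical stand-in struct. HrtfSettingsSketch and makeSettings are invented for illustration, and only the volume field name is taken from the comment above.

```cpp
#include <cassert>

// Hypothetical stand-in for a plain C settings struct such as IPLHRTFSettings.
// Only the volume field comes from the thread; the other fields are illustrative.
struct HrtfSettingsSketch {
    int type;
    const char* sofaFileName;
    float volume;
};

HrtfSettingsSketch makeSettings()
{
    HrtfSettingsSketch s{};      // value-initialization: every member starts at zero/null
    assert(s.volume == 0.0f);    // no garbage left over from the stack
    s.volume = 1.0f;             // set explicitly instead of relying on a default
    return s;
}
```

Writing the declaration with {} and then assigning every field that matters avoids depending on whatever happened to be in memory.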

lakulish commented 11 months ago

We have just released v4.3.0 (https://github.com/ValveSoftware/steam-audio/releases/tag/v4.3.0), which fixes the issue with vector length causing a volume difference in the Ambisonics encode effect.