kcat / openal-soft

OpenAL Soft is a software implementation of the OpenAL 3D audio API.

[Suggestion] Could we disable some extensions to produce a more cleanup and lightweight build? #424

Open taigacon opened 4 years ago

taigacon commented 4 years ago

Should we introduce some build options to disable/enable specific modules/extensions, such as ALSOFT_BUILD_DISABLE_EFFECTS or ALSOFT_BUILD_DISABLE_HRTF, to produce a cleaner and more lightweight build? This is sometimes necessary for embedded platforms, or for users who only want the core OpenAL functions.

kcat commented 4 years ago

I'm curious what savings there would be with disabling particular extensions/functionality. A stripped release build of current master on Linux is only 925KB (1.1MB for an unstripped release build), with 79KB being the built-in HRTF data set (which can be excluded with ALSOFT_EMBED_HRTF_DATA=FALSE). A MinSizeRel build would likely be even smaller. What size are you hoping to reach?

For a more lightweight build, it'd be necessary to identify which parts are "heavy" and worth removing. If disabling HRTF at build time only saves a few KB of compiled code, for example, that doesn't seem worth it to me. But if it's significantly more and you really need that space for a particular target... maybe. There would need to be a practical benefit to disabling something at build time.

taigacon commented 4 years ago

The stripped release build on Android with NDK-r20 is 1150KB (ALSOFT_EMBED_HRTF_DATA=true). After running "nm -S --size-sort -t d libopenal.so | tail -n 20", we can see:

00345128 00004424 t _Z13EnumerateHrtfPKc
00268016 00004852 t _ZN12_GLOBAL__N_118LoadConfigFromFileERNSt6__ndk113basic_istreamIcNS0_11char_traitsIcEEEE
00262832 00005184 t _Z12ReadALConfigv
00451452 00005568 t _ZN12_GLOBAL__N_111ReverbState7processEjN2al4spanIKNSt6__ndk15arrayIfLj1024EEELj4294967295EEENS2_IS5_Lj4294967295EEE
00254684 00005672 t _ZN12_GLOBAL__N_114alc_initconfigEv
00112328 00006940 t _ZN12_GLOBAL__N_111SetSourceivEP8ALsourceP10ALCcontextNS_10SourcePropEN2al4spanIKiLj4294967295EEE
00297980 00007048 t _ZN12_GLOBAL__N_121CalcPanningAndFiltersEP7ALvoicefffffRKNS_11GainTripletEN2al4spanIS3_Lj16EEERA16_P12ALeffectslotPK16ALvoicePropsBaseRK10ALlistenerPK9ALCdevice
01189616 00008192 b _ZN12_GLOBAL__N_110HannWindowE
01181408 00008192 b _ZN12_GLOBAL__N_110HannWindowE
00378668 00008976 t _Z15aluInitRendererP9ALCdevicei15HrtfRequestModeS1_
00305028 00009980 t _ZN10AmbDecConf4loadEPKc
00286192 00011336 t _Z10aluMixDataP9ALCdevicePvjj
00216692 00012840 t _ZL18UpdateDeviceParamsP9ALCdevicePKi
00150596 00015316 t _ZNSt6__ndk116allocator_traitsIN2al9allocatorI7ALvoiceLj16EEEE9constructIS3_JEEEvRS4_PT_DpOT0_
00166000 00015724 t _ZN7ALvoiceC2EOS_
00571096 00015820 r _ZL10reverblist
00352040 00017960 t _Z13GetLoadedHrtfRKNSt6__ndk112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEEPKcj
01078222 00080354 r _ZN12_GLOBAL__N_112hrtf_defaultE
00589904 00159744 r _ZN12_GLOBAL__N_111bsinc12_tabE
00749648 00327680 r _ZN12_GLOBAL__N_111bsinc24_tabE

B-format and HRTF take the most.

kcat commented 4 years ago

It looks like the bsinc tables (for the resampler) are the biggest offenders, totaling nearly 500K. Disabling that would cut the size almost in half, and make the cubic resampler the best option. Followed by the built-in HRTF, about 80K, which can already be disabled.

The rest is less than 20K each. GetLoadedHrtf can probably be shrunk by disabling support for older HRTF data formats, and reverblist could be removed (it's only needed for the ALSOFT_DEFAULT_REVERB env var to auto-apply a reverb when the app doesn't).

Interestingly, current master has the bsinc tables in the bss data section (which makes sense since they're computed at load time instead of pregenerated at build time), but it still takes up disk space. I always thought the bss data section didn't take up any disk space since it's anonymous memory pages. It just needs a size and base location that gets allocated at load time and symbols reference their place in it, initializing as appropriate.

taigacon commented 4 years ago

I am sorry for my earlier inaccurate description. I used the official 1.20.1 release, where the bsinc tables were still in the data section. I'll try the master branch later.

taigacon commented 4 years ago

The final size of the Android armeabi-v7a build is reduced to 681.92KB. But I think it is still necessary to be able to disable some functionality on specific platforms. For example, ALConfig (or even getenv and file IO functionality) is useless on Android/iOS, and things are even worse on Nintendo Switch, PS4 or WP because opening a file from a relative path or an incorrect absolute path is prohibited and could cause a runtime crash. 2D game developers may not care about effects, filters, and HRTF either.

Here are some analysis results for reference only:

$ nm -S --size-sort -t d libopenal.so | awk '{if($3=="t"||$3=="r"){print}}' | tail -n 20
00329912 00002884 t _ZN15DirectHrtfState5buildEPK9HrtfStoreN2al4spanIK12AngularPointLj4294967295EEEPA16_KfNS4_IS8_Lj4EEE
00446844 00002948 t _ZN12_GLOBAL__N_111ReverbState6updateEPK10ALCcontextPK12ALeffectslotPK11EffectProps12EffectTarget
00135328 00003292 t _ZN12_GLOBAL__N_111GetSourcedvEP8ALsourceP10ALCcontextNS_10SourcePropEN2al4spanIdLj4294967295EEE
00116804 00003740 t _ZN12_GLOBAL__N_111SetSourcefvEP8ALsourceP10ALCcontextNS_10SourcePropEN2al4spanIKfLj4294967295EEE
00402748 00004372 t _ZN12_GLOBAL__N_113OpenSLCapture4openEPKc
00332796 00004424 t _Z13EnumerateHrtfPKc
00388324 00004440 t _ZN5Voice3mixENS_5StateEP10ALCcontextj
00254336 00004852 t _ZN12_GLOBAL__N_118LoadConfigFromFileERNSt6__ndk113basic_istreamIcNS0_11char_traitsIcEEEE
00249152 00005184 t _Z12ReadALConfigv
00449792 00005908 t _ZN12_GLOBAL__N_111ReverbState7processEjN2al4spanIKNSt6__ndk15arrayIfLj1024EEELj4294967295EEENS2_IS5_Lj4294967295EEE
00239932 00005992 t _ZN12_GLOBAL__N_114alc_initconfigEv
00124172 00006684 t _ZN12_GLOBAL__N_111SetSourceivEP8ALsourceP10ALCcontextNS_10SourcePropEN2al4spanIKiLj4294967295EEE
00284332 00007168 t _ZN12_GLOBAL__N_121CalcPanningAndFiltersEP5VoicefffffRKNS_11GainTripletEN2al4spanIS3_Lj6EEERA6_P12ALeffectslotPK10VoicePropsRK10ALlistenerPK9ALCdevice
00370580 00009276 t _Z15aluInitRendererP9ALCdevicei15HrtfRequestModeS1_
00291500 00009980 t _ZN10AmbDecConf4loadEPKc
00272160 00011704 t _Z10aluMixDataP9ALCdevicePvjj
00201952 00012512 t _ZL18UpdateDeviceParamsP9ALCdevicePKi
00577072 00015820 r _ZL10reverblist
00339708 00022204 t _Z13GetLoadedHrtfRKNSt6__ndk112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEEPKcj
00597198 00080353 r _ZN12_GLOBAL__N_112hrtf_defaultE

$ nm -S --size-sort -t d libopenal.so | awk '{if($3=="t"||$3=="r"){print}}' | grep -i config | awk '{sum+=$2} END {print sum}'
19652

$ nm -S --size-sort -t d libopenal.so | awk '{if($3=="t"||$3=="r"){print}}' | grep -i effect | awk '{sum+=$2} END {print sum}'
42478

$ nm -S --size-sort -t d libopenal.so | awk '{if($3=="t"||$3=="r"){print}}' | grep -i voice | awk '{sum+=$2} END {print sum}'
20224
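
For anyone who wants to repeat the per-keyword totals without the awk/grep pipeline, here is a minimal C++ sketch of the same filtering. It assumes input lines in the "address size type name" layout that nm -S -t d prints; the helper names containsNoCase and sumSymbolSizes are made up for illustration and are not part of OpenAL Soft.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: case-insensitive substring test (mirrors `grep -i`).
static bool containsNoCase(const std::string &haystack, const std::string &needle)
{
    auto it = std::search(haystack.begin(), haystack.end(), needle.begin(), needle.end(),
        [](char a, char b) {
            return std::tolower(static_cast<unsigned char>(a)) ==
                   std::tolower(static_cast<unsigned char>(b));
        });
    return it != haystack.end();
}

// Hypothetical helper: sums the decimal size column of text ('t') and
// read-only data ('r') symbols whose mangled name contains the keyword.
std::uint64_t sumSymbolSizes(std::istream &in, const std::string &keyword)
{
    std::uint64_t total = 0;
    std::string line;
    while(std::getline(in, line))
    {
        std::istringstream fields(line);
        std::uint64_t address = 0, size = 0;
        std::string type, name;
        // Skip lines that do not have all four columns.
        if(!(fields >> address >> size >> type >> name))
            continue;
        if((type == "t" || type == "r") && containsNoCase(name, keyword))
            total += size;
    }
    return total;
}
```

Feeding it the nm output on standard input, for example sumSymbolSizes(std::cin, "config"), should give the same kind of total as the commands above, with the caveat that it matches the keyword against the symbol name only rather than the whole line.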

taigacon commented 4 years ago

The win32 build is 670KB for /MD and 1030KB for /MT with the latest VS2019. In my experience, the i/o stream and locale modules take 200+KB when the VC++ runtime is statically linked. Removing the use of i/o streams, or using a custom string stream implementation built on std::string and std::to_string, would help a lot.
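
To make the std::string plus std::to_string idea concrete, here is a rough sketch of what such a replacement could look like. The StringBuilder class is hypothetical and not taken from OpenAL Soft, and whether it actually shrinks a /MT build would have to be measured.

```cpp
#include <string>
#include <type_traits>

// Hypothetical minimal stand-in for std::ostringstream that only relies on
// std::string and std::to_string, so no <iostream>/<sstream> is pulled in.
class StringBuilder {
    std::string mBuffer;

public:
    StringBuilder& operator<<(const std::string &text) { mBuffer += text; return *this; }
    StringBuilder& operator<<(const char *text) { mBuffer += text; return *this; }
    StringBuilder& operator<<(char ch) { mBuffer += ch; return *this; }

    // Any other arithmetic type is formatted through std::to_string.
    // Note that floating-point values get std::to_string's fixed notation.
    template<typename T, typename = std::enable_if_t<std::is_arithmetic<T>::value>>
    StringBuilder& operator<<(T value) { mBuffer += std::to_string(value); return *this; }

    const std::string& str() const noexcept { return mBuffer; }
};
```

Usage would look much like a string stream: StringBuilder sb; sb << "freq=" << 44100; and then sb.str() to get the result. It only covers output formatting; parsing input (which the config loader needs) would require a separate replacement.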

kcat commented 4 years ago

> for example, ALConfig (or even getenv and file IO functionality) is useless on Android/iOS

It's not necessarily useless on Android. It is possible to get access to a shell interface and run apps from there, with env vars. In some cases, the env vars may even be useful for compatibility (an app can launch from a shell script or some other trampoline, which sets __ALSOFT_SUSPEND_CONTEXT or __ALSOFT_HALF_ANGLE_CONES or something before running the app proper).

> even worse on Nintendo Switch, PS4 or WP because opening a file from a relative path or an incorrect absolute path is prohibited and could cause a runtime crash.

Can't say I've heard of it causing a crash. The lib will gracefully handle not being able to open files, and continue on without it. The only thing that may need a bit of work is if /proc/cpuinfo can't be opened, as another method to test for Neon support would be needed instead of assuming no support (ARM doesn't have the cpuid instruction like x86, to test available features).

> Removing the use of i/o streams, or using a custom string stream implementation built on std::string and std::to_string, would help a lot.

I've never been a fan of i/o streams. But it does provide a standard overridable interface to handle custom/non-file sources. Using a custom stream type to avoid a bloated implementation being static-linked would be nice, if there's any way to help with that.

taigacon commented 4 years ago

> Can't say I've heard of it causing a crash. The lib will gracefully handle not being able to open files, and continue on without it. The only thing that may need a bit of work is if /proc/cpuinfo can't be opened, as another method to test for Neon support would be needed instead of assuming no support (ARM doesn't have the cpuid instruction like x86, to test available features).

Due to NDA I can't say much more, but the C runtime on some console platforms is just a glue layer over their own SDK or system library. For example, the Nintendo SDK forces developers to use absolute paths only, and filesystem paths look like 'rom:/dir/file.jpg'. The cwd of the C runtime is "" when the library is loaded. So using a relative path, or a wrong absolute path like '/proc/cpuinfo', results in an invalid filesystem path and leads to a crash (yes, by design).

I think handling this is the porter's responsibility. However, if we provide a way to disable the module, things become simpler.

I've gotten carried away. I still think that providing some build-time configuration would be great for tightfisted people like me :)

taigacon commented 4 years ago

> I've never been a fan of i/o streams. But it does provide a standard overridable interface to handle custom/non-file sources. Using a custom stream type to avoid a bloated implementation being static-linked would be nice, if there's any way to help with that.

I can try that.

kcat commented 4 years ago

Well, I can add a build option to avoid any disk i/o, along with the related functionality that only has use with disk input. Though I still need to find a way to detect when an ARM device supports Neon (using /proc/cpuinfo isn't something I like either, but I've yet to find another method that's any better supported).

taigacon commented 4 years ago

According to https://github.com/magnumripper/JohnTheRipper/issues/1998, __ARM_NEON is defined when the compiler is already targeting NEON. We should check that first. After that we could follow OpenSSL's runtime detection approach, since it is the best solution in general.
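
A rough sketch of that "compile-time macro first, runtime check second" order is below. It does not reproduce OpenSSL's detection code. The runtime fallback here is just the /proc/cpuinfo "Features" scan already discussed in this thread, the function names are hypothetical, and treating the AArch64 "asimd" flag as NEON is my assumption.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: scans /proc/cpuinfo-style text for a "Features" line
// that lists "neon" (32-bit ARM) or "asimd" (assumed AArch64 equivalent).
bool cpuinfoListsNeon(const std::string &cpuinfoText)
{
    std::istringstream lines(cpuinfoText);
    std::string line;
    while(std::getline(lines, line))
    {
        if(line.compare(0, 8, "Features") != 0)
            continue;
        const auto colon = line.find(':');
        if(colon == std::string::npos)
            continue;
        // Split the flag list on whitespace and look for an exact match.
        std::istringstream flags(line.substr(colon + 1));
        std::string flag;
        while(flags >> flag)
        {
            if(flag == "neon" || flag == "asimd")
                return true;
        }
    }
    return false;
}

// Hypothetical entry point: trust the compiler when it already targets NEON,
// otherwise fall back to the runtime text scan.
bool hasNeon(const std::string &cpuinfoText)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    (void)cpuinfoText;
    return true;
#else
    return cpuinfoListsNeon(cpuinfoText);
#endif
}
```

On a platform where /proc/cpuinfo can't be read, the caller would pass an empty string, and the result then depends entirely on the compile-time macro.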

VioletGiraffe commented 3 years ago

I second this, I've spent a lot of time hacking the OpenAL source (literally), trying to get rid of extra unwanted processing (including, but not limited to, HRTF).

BTW, recent versions of OpenAL have VERY serious device-specific problems on Android. Are you interested? A version from about 3 years ago had no such problems, but I can no longer use it since it crashes when compiled for ARM64 (and Google Play no longer allows releasing 32-bit armeabi-v7a executables).

kcat commented 3 years ago

> I second this, I've spent a lot of time hacking the OpenAL source (literally), trying to get rid of extra unwanted processing (including, but not limited to, HRTF).

If you turn off HRTF with either the config file (for your system) or with the extension (as an app option), it should have no processing impact. Same with the output limiter. Most extensions are just exposing options the renderer internally has, and removing them won't have much, if any, performance impact compared to what the extensions allow. At least without rewriting various parts of the library which will increase maintenance cost if not also change the audio quality. What else are you wanting to disable?

> BTW, recent versions of OpenAL have VERY serious device-specific problems on Android. Are you interested? A version from about 3 years ago had no such problems, but I can no longer use it since it crashes when compiled for ARM64 (and Google Play no longer allows releasing 32-bit armeabi-v7a executables).

Yes, please. If the library crashes or is otherwise unusable, it's something that would need fixing. Please create a new issue for it.

VioletGiraffe commented 3 years ago

My personal concern is not about performance, and I now realize it's probably not about extensions either. The problem I encountered is likely in the core codebase: OpenAL is altering the stereo sound I generate and I can't find a way to prevent that. For example, when I generate a perfectly out-of-phase stereo sine wave, OpenAL turns it into silence (in both channels). Can you tell me what feature is responsible for this and how I can turn it off?

I traced the issue down to this line in mixer.c:
DryBuffer[OutPos][c] += value*DrySend[c]
Since my signal is perfectly out of phase, this amounts to 0 += LeftChannelSampleValue += RightChannelSampleValue, which is always zero by definition. How is this even supposed to work?

Sorry for somewhat off-the-topic questions.

kcat commented 3 years ago

> My personal concern is not about performance, and I now realize it's probably not about extensions either. The problem I encountered is likely in the core codebase: OpenAL is altering the stereo sound I generate and I can't find a way to prevent that. For example, when I generate a perfectly out-of-phase stereo sine wave, OpenAL turns it into silence (in both channels). Can you tell me what feature is responsible for this and how I can turn it off?

Depending on your output setup, it's probably the spatialization. HRTF filters the sound to make it seem as though each input buffer channel comes from a point in 3D space, so two out-of-phase signals near each other will interfere with each other and attenuate, just like in real life. With surround sound speakers, each input buffer channel is mixed such that it uses all speakers to focus the sound in the intended direction (simple amplitude panning between the two nearest speakers doesn't work well with surround sound), so there could be similar interference there.

You can avoid spatialization by using the AL_SOFT_direct_channels(_remix) extension, which puts the left input buffer channel directly on the left output channel if it exists, and the same for the right channel, without doing any position panning. Wherever the left and right speakers are around the user is where the left and right buffer channels will play from, with no regard for the output setup (someone using headphones won't get the same effect as someone using speakers, but if you have a premixed binaural sound you want to play over headphones, that's how you'd do it).

Alternatively, you can use the AL_EXT_STEREO_ANGLES extension to widen the distance between the left and right buffer channels, which would help reduce interference while keeping spatialization.

> DryBuffer[OutPos][c] += value*DrySend[c]
> Since my signal is perfectly out of phase, this amounts to 0 += LeftChannelSampleValue += RightChannelSampleValue, which is always zero by definition. How is this even supposed to work?

For normal panning, each input buffer channel has its own set of DrySend values. An exact pan of a stereo sound would have, for the left input buffer channel:

DrySend = {1, 0, 0, 0, ...}

and for the right input buffer channel:

DrySend = {0, 1, 0, 0, ...}

With c being the output channel index, that results in the left input buffer mix being:

DryBuffer[OutPos][0] += LeftChannelSampleValue*1
DryBuffer[OutPos][1] += LeftChannelSampleValue*0

and the right input buffer mix being:

DryBuffer[OutPos][0] += RightChannelSampleValue*0
DryBuffer[OutPos][1] += RightChannelSampleValue*1

Modern versions of OpenAL Soft are a little more involved than this (the DryBuffer uses B-Format as an intermediary instead of discrete channels), but they essentially get the same results at the end.
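
As a toy illustration of the per-channel gains described above (this is a simplified sketch, not the actual mixer code; StereoFrame and mixChannel are names invented for the example, and only two output channels are modelled):

```cpp
#include <array>
#include <cstddef>
#include <vector>

using StereoFrame = std::array<float, 2>;

// Accumulates one input buffer channel into the dry buffer using that
// channel's own per-output gains, i.e. the same shape as
// DryBuffer[OutPos][c] += value*DrySend[c].
void mixChannel(std::vector<StereoFrame> &dryBuffer, const std::vector<float> &input,
    const std::array<float, 2> &drySend)
{
    for(std::size_t outPos = 0; outPos < input.size() && outPos < dryBuffer.size(); ++outPos)
    {
        for(std::size_t c = 0; c < drySend.size(); ++c)
            dryBuffer[outPos][c] += input[outPos] * drySend[c];
    }
}
```

Mixing the left channel with gains {1, 0} and an inverted right channel with gains {0, 1} leaves each signal intact on its own output. Mixing both with the same gains, for example {0.5, 0.5}, makes the out-of-phase samples sum to zero on both outputs, which is the cancellation described in the question.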

VioletGiraffe commented 3 years ago

Thank you very much for the detailed answer, I really appreciate it. Disabling HRTF was the first thing I did and it didn't help. It sounds like AL_SOFT_direct_channels is exactly what I need, going to look for how to enable it. Hopefully, with it I will be able to use the release version of OpenAL as-is, without having to hack parts out of mixer.c.