libsdl-org / SDL_mixer

An audio mixer that supports various file formats for Simple Directmedia Layer.
zlib License

[Proposal] Support Logarithmic Scale Volume Control #433

Open complexlogic opened 2 years ago

complexlogic commented 2 years ago

Thank you for this great library. I've integrated it into my project over the last few days and it's working very well. However, I have a suggestion for improvement in regard to the volume controls. SDL currently implements audio volume control as a simple linear attenuation of the input signal. The problem with this is that it's not an accurate model of how hearing works.

Due to the characteristics of the human ear, there is a logarithmic relationship between the amplitude of an audio signal and its perceived loudness, which is why loudness is typically measured in decibels. The linear scale results in an extremely uneven distribution of loudness. This can be demonstrated with a simple SDL_mixer test program: play a chunk sound effect at MIX_MAX_VOLUME (128), then change the channel volume to half (64), recompile, and play again. You will notice that there is only a very slight reduction in the loudness, despite the volume being halved. Indeed, halving the amplitude of an audio signal reduces its level by only about 6 dB. The difference is even more apparent when comparing the top 25% and bottom 25% of the scale. The difference between 128 and 96 is only 2.5 dB, but the difference between 32 and 1 is more than 30 dB.
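The dB figures above follow from the standard amplitude-ratio formula, 20 * log10(ratio). A minimal sketch for checking them is below; the helper name level_difference_db is made up for illustration and is not part of SDL_mixer.

```cpp
#include <cmath>

// Level difference in dB between two linear amplitude factors.
// Both arguments must be greater than zero.
// (Hypothetical helper for illustration, not an SDL_mixer function.)
double level_difference_db(double louder, double quieter) {
    return 20.0 * std::log10(louder / quieter);
}
```

For example, level_difference_db(128, 64) is about 6.02 and level_difference_db(128, 96) is about 2.50, matching the halving and top-quarter cases.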

My Implementation

I would like for the users of my application to be able to specify channel volume on a limited scale that is linear with respect to loudness instead of signal amplitude. I designed a wrapper function for Mix_Volume() to convert the requested application-level log volume to the internal SDL linear attenuation factor volume.

There are a couple factors that go into designing a log scale volume control. The first is the dynamic range in dB, which is the volume difference between the loudest signal and the quietest. Since we have 128 volume steps in SDL, the most we can attenuate the signal is 1/128, which is about -42 dB. I rounded this to -40 dB.
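For reference, the dynamic range figure comes from applying the amplitude-ratio formula to the smallest nonzero attenuation factor, 1/128:

```latex
20 \log_{10}\!\left(\frac{1}{128}\right) = -20 \log_{10}(128) \approx -42.1\ \text{dB}
```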

Next is the number of volume steps. Ideally, I would like to have a volume scale of 0-100. Unfortunately, with a limited resolution of only 128 steps in SDL, a 0-100 scale cannot produce a unique SDL volume for each step due to rounding, so I decided to settle for a simpler 0-10 scale. In this scale, a value of 10 corresponds to 0 dB (full, unattenuated signal), while 0 corresponds to -40 dB. However, I decided to set 0 to a volume of 0, since most users will interpret 0 as meaning complete silence. Each step in the scale corresponds to a 4 dB adjustment, plus or minus a bit due to rounding. I created a spreadsheet with the calculations which is screenshotted below:

[screenshot: screen29]
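The per-step values can also be regenerated with a short runtime sketch. This assumes the parameters described above (a 0-10 scale, a -40 dB range, step 0 forced to silence); the constant kMixMaxVolume is a stand-in for MIX_MAX_VOLUME so the snippet compiles without the SDL headers, and all names are illustrative.

```cpp
#include <array>
#include <cmath>

constexpr int kSteps = 10;          // top of the user-facing scale
constexpr double kRangeDb = -40.0;  // attenuation at the bottom of the scale
constexpr int kMixMaxVolume = 128;  // stand-in for MIX_MAX_VOLUME (assumption)

// Builds the step -> linear SDL volume table for a dB-linear scale.
std::array<int, kSteps + 1> make_volume_table() {
    std::array<int, kSteps + 1> table{};
    table[0] = 0;  // step 0 is complete silence rather than -40 dB
    for (int n = 1; n <= kSteps; ++n) {
        // Attenuation for this step: -36 dB at step 1 up to 0 dB at step 10.
        double db = kRangeDb * (1.0 - static_cast<double>(n) / kSteps);
        double gain = std::pow(10.0, db / 20.0);  // dB -> linear amplitude factor
        table[n] = static_cast<int>(std::lround(gain * kMixMaxVolume));
    }
    return table;
}
```

With these parameters the table comes out as 0, 2, 3, 5, 8, 13, 20, 32, 51, 81, 128, so every step maps to a distinct SDL volume.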

Here is how I chose to implement this volume control in C++. Since the log and power functions can be computationally expensive, I pre-calculate the SDL volume values at compile time for each step in the scale using a constexpr lambda function, and store the results in a static array. Then, determining the SDL volume at runtime is simply accessing the array at the requested volume step.

I chose to use constexpr lambda strictly for convenience purposes, so I could alter the macro values and recompile without having to manually type the array values from the spreadsheet each time. But this function could be easily adapted to C by doing that instead of the lambda.

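#include <array>
#include <cmath>

#define MAX_VOLUME 10   // top of the application-level scale
#define RANGE_DB -40    // attenuation in dB at the bottom of the scale

// Sets the channel volume from a 0..MAX_VOLUME step on a dB-linear scale.
// Returns 0 on success, 1 if the requested step is out of range.
int Sound::set_volume(int channel, int volume)
{
    if (volume > MAX_VOLUME || volume < 0)
        return 1;

    // Note: std::pow and std::log10 only become constexpr in standard C++26,
    // so compile-time evaluation here relies on a compiler (such as GCC) that
    // folds these calls as an extension.
    constexpr auto generate_array = []() {
        double a = std::pow(10.0, (double) RANGE_DB / 20.0);   // linear gain at RANGE_DB
        double b = std::log10(1.0 / a) / (double) MAX_VOLUME;  // log10 of gain per step
        std::array<int, MAX_VOLUME + 1> arr{};
        arr[0] = 0;                        // step 0 is silence, not RANGE_DB
        arr[MAX_VOLUME] = MIX_MAX_VOLUME;  // top step is the unattenuated signal
        for (int n = 1; n < MAX_VOLUME; n++) {
            arr[n] = (int) std::round(a * std::pow(10.0, (double) n * b) * (double) MIX_MAX_VOLUME);
        }
        return arr;
    };
    static constexpr std::array<int, MAX_VOLUME + 1> volume_array = generate_array();

    Mix_Volume(channel, volume_array[volume]);
    return 0;
}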

I don't support the volume querying here because I don't personally need it, but it would be trivial to implement.
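As a rough sketch of what querying could look like, the linear SDL volume can be mapped back to the nearest step. This version hard-codes the table values for the 0-10, -40 dB scale, which is also how a plain C port could avoid the lambda. The function name step_from_sdl_volume is hypothetical.

```cpp
#include <array>
#include <cstdlib>

// Precomputed SDL volumes for steps 0..10 at a -40 dB range
// (values assume MIX_MAX_VOLUME is 128).
constexpr std::array<int, 11> kVolumeTable = {0, 2, 3, 5, 8, 13, 20, 32, 51, 81, 128};

// Returns the step whose table entry is closest to the given linear volume.
// (Hypothetical helper for illustration, not an SDL_mixer function.)
int step_from_sdl_volume(int sdl_volume) {
    int best = 0;
    for (int n = 1; n < static_cast<int>(kVolumeTable.size()); ++n) {
        if (std::abs(kVolumeTable[n] - sdl_volume) <
            std::abs(kVolumeTable[best] - sdl_volume)) {
            best = n;
        }
    }
    return best;
}
```

Since Mix_Volume() returns the current volume without changing it when passed -1, a query could then be written as step_from_sdl_volume(Mix_Volume(channel, -1)).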

Proposal

My proposal is to add a logarithmic volume control function with a name such as Mix_VolumeLog() or Mix_VolumeDB(), which would be implemented similarly to the above. The fundamentals of psychoacoustics are not well understood by many people, so I believe it would provide a very useful abstraction for implementing application-level volume control. Since it uses SDL's current volume controls underneath, it would not break any existing functionality nor the ABI.

slouken commented 2 years ago

This seems like a good idea. Instead of 10 steps, I would probably implement a 0-100 scale so people can think about it as a percentage volume knob.

complexlogic commented 2 years ago

I agree with you that a 0-100 scale would be more desirable than 0-10. The problem is that SDL's existing volume control only allows 128 steps, which isn't enough resolution for a log-based scale with 100 steps, because much of the signal gets lost to rounding at the bottom of the scale. I redid the calculations for 0-100 and screenshotted them below.

[screenshot: screen30]

You can see in the third column the volume goes flat for significant stretches - in other words, there is no difference between a volume setting and the next. This breaks the linearity with respect to loudness that I desired in the first place.
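The flat stretches can be checked by counting how many distinct SDL volumes a given scale produces. The sketch below assumes the same mapping as before (step 0 is silence, the top step is 0 dB, and the steps are evenly spaced in dB across the range); the function name and parameters are illustrative.

```cpp
#include <cmath>
#include <set>

// Counts the distinct linear volumes produced by a dB-linear scale with
// `steps` steps above zero, a bottom attenuation of `range_db`, and a linear
// maximum of `max_volume`. (Hypothetical helper for illustration.)
int count_distinct_volumes(int steps, double range_db, int max_volume) {
    std::set<int> seen;
    seen.insert(0);  // step 0 is silence
    for (int n = 1; n <= steps; ++n) {
        double db = range_db * (1.0 - static_cast<double>(n) / steps);
        double gain = std::pow(10.0, db / 20.0);
        seen.insert(static_cast<int>(std::lround(gain * max_volume)));
    }
    return static_cast<int>(seen.size());
}
```

A scale with no flat stretches yields steps + 1 distinct values. count_distinct_volumes(10, -40.0, 128) returns 11, while the 0-100 scale at the same maximum returns fewer than 101.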

To have a unique volume for each step on a 0-100 log scale we would need to increase MIX_MAX_VOLUME to allow for a finer resolution. I presume that is not allowable currently because it could break existing functionality, but perhaps something to consider for SDL3.
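To get a feel for how much finer the resolution would need to be, one option is a brute-force search for the smallest linear maximum at which every step gets its own value. This is only a sketch under the same assumptions as above (-40 dB range, step 0 as silence); the names are hypothetical, and the result says nothing about what SDL3 would actually adopt.

```cpp
#include <cmath>

// True if every step of the dB-linear scale maps to a strictly larger
// linear volume than the step before it.
bool steps_are_unique(int steps, double range_db, int max_volume) {
    long previous = 0;  // step 0 is silence
    for (int n = 1; n <= steps; ++n) {
        double db = range_db * (1.0 - static_cast<double>(n) / steps);
        long v = std::lround(std::pow(10.0, db / 20.0) * max_volume);
        if (v <= previous) {
            return false;  // rounding collapsed two adjacent steps
        }
        previous = v;
    }
    return true;
}

// Scans upward from `start` and returns the first maximum that gives unique
// steps, or -1 if none is found up to `limit`.
int smallest_unique_max_volume(int steps, double range_db, int start, int limit) {
    for (int m = start; m <= limit; ++m) {
        if (steps_are_unique(steps, range_db, m)) {
            return m;
        }
    }
    return -1;
}
```

Calling smallest_unique_max_volume(100, -40.0, 128, 65536) gives a lower bound on the maximum needed for a 0-100 scale over a 40 dB range; a wider dynamic range would require a larger value still.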