FDH2 / UxPlay

AirPlay Unix mirroring server
GNU General Public License v3.0

Update audio_renderer_gstreamer.c change the volume conversion algorithm #250

Closed soul916 closed 6 months ago

soul916 commented 6 months ago

The AirPlay volume is a float value that goes from –30 to 0. Its relative position in that range seems to match the percentage shown on the volume bar, but the volume sounds wrong when that proportion is passed to GStreamer directly, so its square is used instead.
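
A minimal sketch of the conversion described here, assuming the slider position is (30 + volume)/30 and the value handed to GStreamer is its square. The function name is hypothetical and this is not a copy of the submitted patch.

```c
/* Hypothetical helper: map an AirPlay volume in [-30, 0] to a
 * GStreamer linear volume by squaring the slider fraction. */
double airplay_to_linear_squared(double volume)
{
    if (volume <= -30.0) return 0.0;       /* bottom of the range: silent */
    if (volume >= 0.0)   return 1.0;       /* top of the range: full volume */
    double frac = (30.0 + volume) / 30.0;  /* slider position, 0..1 */
    return frac * frac;
}
```

With this mapping the midpoint -15 gives 0.25 and the first button step -28.125 gives 0.00390625.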

thiccaxe commented 6 months ago

Why is squaring needed? Does simply taking the absolute value not work?

soul916 commented 6 months ago

> Why is squaring needed? Does simply taking the absolute value not work?

Because without squaring, the sound is louder than normal at low volume.

rogerbinns commented 6 months ago

It sounds like the airplay volume is in decibels while gstreamer is linear. The correct formula is linear = 10 ** (0.1 * decibels) where ** is to the power.

fduncanh commented 6 months ago

@soul916 @rogerbinns @thiccaxe Thanks for looking into this!

https://gstreamer.freedesktop.org/documentation/audio/gststreamvolume.html?gi-language=c

looks like something using gst_stream_volume_convert_volume is the right way to fix things?

Enumerations GstStreamVolumeFormat

Different representations of a stream volume. gst_stream_volume_convert_volume allows to convert between the different representations.

Formulas to convert from a linear to a cubic or dB volume are cbrt(val) and 20 * log10 (val).

Members

GST_STREAM_VOLUME_FORMAT_LINEAR (0) – Linear scale factor, 1.0 = 100%

GST_STREAM_VOLUME_FORMAT_CUBIC (1) – Cubic volume scale

GST_STREAM_VOLUME_FORMAT_DB (2) – Logarithmic volume scale (dB, amplitude not power)

soul916 commented 6 months ago

> It sounds like the airplay volume is in decibels while gstreamer is linear. The correct formula is linear = 10 ** (0.1 * decibels) where ** is to the power.

The volume value of airplay is indeed like decibels. Before using the current method, I have tried to use exponential functions with bases 2, 10 and e, but I didn't find a comfortable conversion algorithm.

soul916 commented 6 months ago

> @soul916 @rogerbinns @thiccaxe Thanks for looking into this!
>
> https://gstreamer.freedesktop.org/documentation/audio/gststreamvolume.html?gi-language=c
>
> looks like something using gst_stream_volume_convert_volume is the right way to fix things?
>
> Enumerations GstStreamVolumeFormat
>
> Different representations of a stream volume. gst_stream_volume_convert_volume allows to convert between the different representations.
>
> Formulas to convert from a linear to a cubic or dB volume are cbrt(val) and 20 * log10 (val).
>
> Members
>
> GST_STREAM_VOLUME_FORMAT_LINEAR (0) – Linear scale factor, 1.0 = 100%
>
> GST_STREAM_VOLUME_FORMAT_CUBIC (1) – Cubic volume scale
>
> GST_STREAM_VOLUME_FORMAT_DB (2) – Logarithmic volume scale (dB, amplitude not power)

I read the relevant code before making this change, and I don't think the AirPlay volume value matches any of the volume types in GStreamer.

fduncanh commented 6 months ago

Let's check what other implementations such as pyatv or shairport-sync know about the AirPlay volume format.

EDIT: this is what seems to be known: values are decibels, range [-30:0], -144 when muted. https://openairplay.github.io/airplay-spec/audio/volume_control.html

Audio volume can be changed using a SET_PARAMETER request. The volume is a float value representing the audio attenuation in dB. A value of –144 means the audio is muted. Then it goes from –30 to 0.
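
As a rough illustration of handling the value described in this excerpt, the sketch below parses a parameter body and classifies it. The body layout ("volume: <float>") is an assumption based on the unofficial spec, and the type and function names are hypothetical.

```c
#include <stdio.h>

/* Hypothetical classification of a parsed AirPlay volume. */
typedef enum { VOL_INVALID, VOL_MUTED, VOL_IN_RANGE } vol_kind;

/* Parse a body such as "volume: -11.123877\r\n" (assumed format) and
 * report whether it is the mute value or lies inside [-30, 0]. */
vol_kind parse_airplay_volume(const char *body, float *out)
{
    float v;
    if (body == NULL || sscanf(body, "volume: %f", &v) != 1)
        return VOL_INVALID;
    *out = v;
    if (v == -144.0f) return VOL_MUTED;              /* documented mute value */
    if (v >= -30.0f && v <= 0.0f) return VOL_IN_RANGE;
    return VOL_INVALID;                              /* outside the described range */
}
```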

Gstreamer conversion formula is: "Formula to convert from a linear to a dB volume is 20 * log10 (val)."

If we go with this, what changes are needed?

I guess it is vol(gstreamer-linear) = 10^(x/20), which for x in [-30, 0] gives the range [10^(-3/2) : 1], or about [0.03 : 1].

Applications can use this interface to get or set the current stream volume. For this the "volume" GObject property can be used or the helper functions gst_stream_volume_set_volume and gst_stream_volume_get_volume. This volume is always a linear factor, i.e. 0.0 is muted 1.0 is 100%. For showing the volume in a GUI it might make sense to convert it to a different format by using gst_stream_volume_convert_volume. Volume sliders should usually use a cubic volume. Separate from the volume the stream can also be muted by the "mute" GObject property or gst_stream_volume_set_mute and gst_stream_volume_get_mute. Elements that provide some kind of stream volume should implement the "volume" and "mute" GObject properties and handle setting and getting of them properly. The volume property is defined to be a linear volume factor.

https://gstreamer.freedesktop.org/documentation/audio/gststreamvolume.html?gi-language=c

fduncanh commented 6 months ago

Yes, the current volume formula (I think by antimof) doesn't seem correct. @soul916 Can you test this one to see if you like it? This implements what seems to be the "correct (?)" formula:

void audio_renderer_set_volume(float volume) {
    float avol = powf(10, volume/20);
    g_object_set(renderer->volume, "volume", avol, NULL);
}

thiccaxe commented 6 months ago

Whichever formula works, we might want a check at the endpoints, say 0 or -30 (to completely mute the playback), as there could be some weird round-off artifacts.

fduncanh commented 6 months ago

AirPlay "vol" varies between -30 dB and 0 as one slides the client's volume (easy to check in the uxplay.cpp volume callback). I think the formula 10**(v/20) is still good at v = -30.00001 or 0.00001, although I don't think those values can occur.

EDIT anyway, @thiccaxe is right:

void audio_renderer_set_volume(float volume) {
    float avol;
    if (volume < -30) {
        avol = 0;
    } else if (volume > 0) {
        avol = 1;
    } else {
        avol = powf(10, volume/20);
    }
    g_object_set(renderer->volume, "volume", avol, NULL);
}

I'm not sure where to find a "mute" on the iOS client to test mute (as opposed to stopping the playing)

soul916 commented 6 months ago

@fduncanh I will try your method later. Checking 10**(-30/20) with a calculator gives about 0.0316, and 3% volume will produce audible sound, which means this algorithm cannot mute.

The airplay "vol" variable will increase or decrease by 1.875 each time when the iphone volume button is used. Adjusting it from -30 to 0 with the volume button will produce the following volume sequence: -30.000000 -28.125000 -26.250000 -24.375000 -22.500000 -20.625000 -18.750000 -16.875000 -15.000000 -13.125000 -11.250000 -9.375000 -7.500000 -5.625000 -3.750000 -1.875000 0.000000
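
The sequence above is 17 values spaced 30/16 = 1.875 apart. A small sketch that reproduces it, with a hypothetical function name and the assumption that the steps are exactly equal:

```c
/* Volume reported after `step` button presses up from the bottom of
 * the range, assuming 16 equal steps of 30/16 = 1.875. */
double airplay_button_volume(int step)
{
    if (step < 0)  step = 0;    /* clamp to the observed range */
    if (step > 16) step = 16;
    return -30.0 + 1.875 * step;
}
```

Step 0 gives -30, step 8 gives -15, and step 16 gives 0, matching the first, middle, and last values listed.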

fduncanh commented 6 months ago

@soul916 I checked on an iPad with the slider volume bar. The volume values vary continuously between -30 and 0 (true floats, not -(15/8) x 0,1,2,...,16 as you found on iPhone).

The GStreamer doc says

Volume sliders should usually use a cubic volume.

On an Apple TV, -30 is mute, so the "Unofficial AirPlay protocol" is indeed incorrect in calling the units decibels.

Maybe I'll try to see if a cheap decibel meter can show what the true apple TV is doing

Cubic would be avol= (1 + volume/30); avol = avol*avol*avol

Did you try cube rather than square?
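
For reference, the cubic variant written out as a function. This is only a sketch of the expression given above, with clamping at the ends of the range added as an assumption; the function name is hypothetical.

```c
/* Hypothetical helper: cube of the slider fraction (1 + volume/30). */
double airplay_to_linear_cubic(double volume)
{
    if (volume <= -30.0) return 0.0;   /* assumed clamp at the bottom */
    if (volume >= 0.0)   return 1.0;   /* assumed clamp at the top */
    double avol = 1.0 + volume / 30.0;
    return avol * avol * avol;
}
```

At the midpoint -15 this gives 0.125, and at the first button step -28.125 it gives about 0.000244.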

soul916 commented 6 months ago

@fduncanh I have just tried the cubic. As expected, at -28.125 no sound can be heard, and at low volume the sound is quieter than normal.

fduncanh commented 6 months ago

@soul916 I got hold of a sound-level meter, and using an Apple TV as server, it seems that the iPad volume slider does have a range of -30 dB (left) to 0 (right), with -15 dB in the middle, relative to max volume.

The gstreamer formula mentions amplitude (gain) decibels, not power decibels, so I think the correct form is probably

avol = powf(10, volume/10);

(not volume/20) (the corrected formula is the square of powf(10, volume/20)).

what's your opinion of this formula (can you try it and listen)?

soul916 commented 6 months ago

@fduncanh The following are the values output by the two algorithms to GSTREAMER, with powf(10, volume/10) on the left and the code I submitted on the right. I calculated the following results with PYTHON.

powf(10, volume/10)       submitted code
0.001                     0.0
0.001539926526059492      0.00390625
0.0023713737056616554     0.015625
0.003651741272548377      0.03515625
0.005623413251903491      0.0625
0.008659643233600654      0.09765625
0.01333521432163324       0.140625
0.02053525026457146       0.19140625
0.03162277660168379       0.25
0.04869675251658631       0.31640625
0.07498942093324558       0.390625
0.11547819846894582       0.47265625
0.1778279410038923        0.5625
0.27384196342643613       0.66015625
0.4216965034285822        0.765625
0.6493816315762113        0.87890625
1.0                       1.0

I think the exponential algorithm based on 10 is unreasonable:

  1. Input -30 cannot get output 0.
  2. At low volume the output rises slowly: it only reaches 1% at the 5th level, and it may be necessary to go to about the 6th level to hear any sound.
  3. The volume of the last three levels suddenly increased from 42% to 100%.

rogerbinns commented 6 months ago

@soul916 that is not how it works. Decibels are a logarithmic/exponential scale. Each 3db increase doubles the energy, and linear = powf(10, decibels/10) is the correct formula to use.

It is possible to manipulate content so that it is perceived as louder - for example this is done by advertisers and CDs. Replay gain can be used to calibrate your content - Apple calls it sound check.

soul916 commented 6 months ago

@rogerbinns I don't think the volume value of airplay is a decibel value, and the volume value of Apple device looks consistent with the proportion of the volume bar.

It is recommended to use real equipment to try listening. According to the linear = powf(10, decibels/10) algorithm, you can only get about 1% volume at the 6th level, which should be small for real equipment.

rogerbinns commented 6 months ago

@soul916 > I don't think the volume value of airplay is a decibel value

You are the only one :-) There is an unofficial airplay spec, and it says decibels too.

soul916 commented 6 months ago

@rogerbinns This volume problem has been bothering me for a long time. I know the submitted algorithm has no theoretical basis and doesn't look normal, but it is the best method I can find at present. I've read many documents I can find and tried all kinds of algorithms I can find before I submit, including those mentioned earlier. After so many actual trials, I think the volume value of airplay is not the normal decibel value, because these methods can't get normal linear volume adjustment result.

Another piece of evidence is the Bluetooth volume adjustment from iOS devices. According to the BlueZ system log, the Bluetooth volume steps do not form an arithmetic progression once converted to decibels.

bluez.c:1429: Updating A2DP volume: 127 [0.00 dB]
bluez.c:1429: Updating A2DP volume: 119 [-0.93 dB]
bluez.c:1429: Updating A2DP volume: 111 [-1.94 dB]
bluez.c:1429: Updating A2DP volume: 103 [-3.02 dB]
bluez.c:1429: Updating A2DP volume: 95 [-4.18 dB]
bluez.c:1429: Updating A2DP volume: 87 [-5.45 dB]
bluez.c:1429: Updating A2DP volume: 79 [-6.84 dB]
bluez.c:1429: Updating A2DP volume: 71 [-8.38 dB]
bluez.c:1429: Updating A2DP volume: 63 [-10.11 dB]
bluez.c:1429: Updating A2DP volume: 55 [-12.07 dB]
bluez.c:1429: Updating A2DP volume: 47 [-14.34 dB]
bluez.c:1429: Updating A2DP volume: 39 [-17.03 dB]
bluez.c:1429: Updating A2DP volume: 31 [-20.34 dB]
bluez.c:1429: Updating A2DP volume: 23 [-24.65 dB]
bluez.c:1429: Updating A2DP volume: 15 [-30.81 dB]
bluez.c:1429: Updating A2DP volume: 7 [-41.81 dB]
bluez.c:1429: Updating A2DP volume: 0 [-96.00 dB]

thiccaxe commented 6 months ago

What do other airplay servers use?

Likely the only way to get to the bottom of this is measuring real output from an airplay device, such as a gen 3 tv...

soul916 commented 6 months ago

I found that because Apple's volume control method or parameters are not standard, even hardware that supports AirPlay through official cooperation with Apple seems to have volume problems, for example:

https://support1.bluesound.com/hc/en-us/community/posts/360038197514-C658-Airplay-volume

fduncanh commented 6 months ago

@soul916 makes a good case

I have a cheap sound meter, and an AppleTV 3 seems roughly consistent with a 30 dB slider range, but I am not sure how to test this precisely (use pink noise?).

I posted a question on the shairport-sync site; maybe someone there knows more:

https://github.com/mikebrady/shairport-sync/discussions/1773

rogerbinns commented 6 months ago

The bluesound posting is not that Apple is doing something other than decibels, but rather that the number of steps is too few to be useful for equipment that supports a wider range of gain levels. uxplay can't fix how many different steps airplay clients have.

@soul916 try the following approach:

Then use this algorithm:

// apply gain in decibel domain
volume += gain;
avol = powf (10, volume / 10);

// apply loudness in linear domain
avol = powf(avol, 1 - (loudness / 10));

// clamp
if (avol > 1) avol = 1;
if (avol < 0) avol = 0;

That will let uxplay be easily set to louder or quieter relative to other content, where gain controls actual volume while loudness controls perceived volume.

soul916 commented 6 months ago

@fduncanh I hope an expert can give a reasonable explanation. @rogerbinns I didn't fully understand your method. I tried values of gain and loudness in the range -10 to 10, and in every combination an input of -30 could not produce 0. The output also seems to rise too slowly at low volume levels, and in many cases several legal input values all map to an output of 1.0. I would also like to know the recommended range of gain.

BTW I tried using AirPlay from an iPhone to send audio to a MacBook. Both the MacBook and the iPhone can adjust the volume at the same time: adjusting the AirPlay volume on the iPhone does not affect the macOS volume setting, but it does change the AirPlay output volume.

soul916 commented 6 months ago

@rogerbinns @fduncanh I think I understand why the AirPlay volume value is not a decibel value. Please refer to the Bluetooth volumes posted earlier. BlueZ's volume system uses dBFS, where -96 dB means 0% volume, and the first volume level, -41.81 dB, converts to 0.81%. If the AirPlay volume is treated like dBFS, with -30 corresponding to 0%, then a volume of 0.81% should be at about -13.06. This means that if the AirPlay volume is regarded as a variation of decibels, the volume rises very slowly in the low range, which is completely different from the behavior of a normal device volume control. Compared with dBFS, I think the AirPlay volume should be a proportional value, not a decibel value, because in a decibel system there should be a large gap between zero and the first volume level.

rogerbinns commented 6 months ago

-30 should be treated as mute. The unofficial airplay spec says -144 is used as mute, but I couldn't get even iPhone with mute switch to send that. Other than that using the slider (fine grained) and volume buttons (coarse grained) are entirely consistent with dbfs. And that is what uxplay should do by default. You can argue that airplay could do a better job, use a different range etc, and any client update could end up with that happening.

I do accept that some folks will want a different "curve" applied. My gain parameter above shifted the baseline, so a gain of -10 would mean everything is ~8 times "quieter", while the loudness spread the discrete steps so that 1.875db would be made wider or narrower. It is also important to clarify that the airplay volume is not the actual volume the person ultimately hears - it should be treated the same as the level on a line input cable to whatever is actually producing the audio.

@soul916 why don't you work out the precise formula / parameters you want for the -30 to 0 range to work out to, show that it is applicable to more than just your setup, and how to document it? That way it could be made an option with guidance on when to use it. (BTW I'm not a maintainer on this project, so this is merely a suggestion)

fduncanh commented 6 months ago

The shairport-sync maintainer Mike Brady replied to my question there; here is his comment:

mikebrady (Maintainer), 12 hours ago:

Thanks for the message. There is another slight wrinkle in this -- the range is -30.0 to 0.0 alright, but additionally the value -144.0 means "mute".

The -30.0 to 0.0 range does suggest some kind of logarithmic range, but AFAIK there is no documentary evidence to support this.

Shairport Sync controls hardware or software mixers that have decibel-denominated settings -- it doesn't even try to control mixers without dB-denominated settings.

There is a wide range of mixer attenuation ranges out there, from about 30 dB on cheap USB DACs to well over 100 dB on others. In reality, a range of 70 to 80 dB seems necessary. Shairport Sync allows you to specify the range you wish to use -- it can also augment a limited hardware mixer range with an additional software mixer. (It can also be set to ignore volume control information completely.)

Shairport Sync has three optional transfer functions between the AirPlay range and the dB ranges of the mixers:

  1. The default transfer function is piecewise linear between AirPlay volume and dB mixer volume but with slight changes at the low and high ends of the range to make the volume control a bit more "natural" and responsive.
  2. The "flat" transfer function is linear between AirPlay and the dB range of the mixer.
  3. The (new) "dasl-tapered" transfer function has the property that moving the AirPlay volume slider through half of its range approximately "halves" the volume. Thus, going from 0.0 to -15.0 reduces the volume by 10dB; from -15.0 to -22.5 drops it by a further 10 dB, and so on.

There is quite a bit of information on the Internet about the properties of "ideal" volume controls.

A new volume control profile called dasl-tapered has been added in which halving the volume control setting halves the output level. For example, moving the volume slider from full to half reduces the output level by 10dB, which roughly corresponds with a perceived halving of the audio volume level. Moving the volume slider from half to a quarter reduces the output level by a further 10dB. The tapering rate is slightly modified at the lower end of the range if the device's attenuation range is restricted (less than about 55dB).

To activate the dasl-tapered profile, set the volume_control_profile to "dasl_tapered" in the configuration file and restart Shairport Sync.

Many thanks to David Leibovic, aka dasl-, for this.

fduncanh commented 6 months ago

since the current volume formula seems also to be an "ad-hoc" implementation by someone in the code's past, and a clear "correct" answer does not seem to be emerging ...

... and we are open-source, I'd be agreeable to adding something like a -vol = n option to choose between different formulas labeled 1, 2, 3, ..., maybe adapted to different clients (and documented)

soul916 commented 6 months ago

> shairport-sync maintainer

After reading the shairport-sync maintainer's answer, I feel that the volume problem is far more complicated than I expected. I hope that UxPlay can add a good volume control method in the next version. If I can choose from multiple methods, I will check my current algorithm again and submit it, hoping it can be accepted.

rogerbinns commented 6 months ago

@soul916 the dasl code is here and should be fairly easy to apply to uxplay. One really nice property is that it has a minimum and maximum db value which nicely lets you spread the volume over whatever range of outputs you want. I'd be all for this being the only algorithm in uxplay with default range of -30 to 0 and you'd be able to use -90 to 0 to get greater fidelity in the quieter regions.

soul916 commented 6 months ago

@rogerbinns Thanks for the tip.

fduncanh commented 6 months ago

so there would be an option like "uxplay -vol mindb:maxdb" with default value 30:0 ?

(including a minus sign as in -vol -30:0 would be troublesome because "-" is used to recognize start of options, but could be handled specially when -vol is seen, but I would prefer not to deal with it) -vol "-30:0" might work.

rogerbinns commented 6 months ago

Fortunately uxplay is doing its own argument parsing in parse_arguments so you just need a branch like

double minimum_volume = -30;
double maximum_volume = 0;

// ... inside the option loop of parse_arguments ...
} else if (arg == "-minvol") {
    if (i < argc - 1) {
        minimum_volume = strtod(argv[i+1], NULL);
        i++;
        continue;
    }
    // ... issue an error about argument to minvol expected ...
} else if (arg == "-maxvol") {
    if (i < argc - 1) {
        maximum_volume = strtod(argv[i+1], NULL);
        i++;
        continue;
    }
    // ... issue an error about argument to maxvol expected ...
} else if ....

More error checking could be done instead of strtod but the above would work fine. And probably less code than trying to tease apart two colon separated floats.

-vol "-30:0" might work

It wouldn't because the shell would strip off the double quotes.
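
For comparison with the two-option approach, here is a sketch of what parsing a single colon-separated range might look like. The function name, return convention, and the rule that the range must be increasing are all assumptions made for illustration; this is not code from uxplay.

```c
#include <stdlib.h>

/* Hypothetical parser for a "low:high" or "low" decibel range.
 * Returns 0 on success and -1 on malformed input; *high keeps its
 * previous value when only "low" is given. */
int parse_db_range(const char *s, double *low, double *high)
{
    char *end;
    double lo = strtod(s, &end);
    if (end == s) return -1;             /* no number at the start */
    if (*end == '\0') { *low = lo; return 0; }
    if (*end != ':') return -1;          /* unexpected separator */

    const char *rest = end + 1;
    double hi = strtod(rest, &end);
    if (end == rest || *end != '\0') return -1;  /* missing or trailing text */
    if (lo >= hi) return -1;             /* assumed rule: range must increase */
    *low = lo;
    *high = hi;
    return 0;
}
```

The extra checks on `end` are the "more error checking" that a bare strtod call with a NULL end pointer skips.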

fduncanh commented 6 months ago

Some more info.

  1. I agree with the discrete sequence (push the button) in an iPad -30.000000 -28.125000 -26.250000 -24.375000 -22.500000 -20.625000 -18.750000 -16.875000 -15.000000 -13.125000 -11.250000 -9.375000 -7.500000 -5.625000 -3.750000 -1.875000 0.000000. 16 steps of 15/8 = 30 (The slider seems to allow all values in range [-30, 0] continuously.)

  2. While -30 is indeed "mute", the other values are consistent with SPA (sound pressure amplitude) decibels, which is not power, but amplitude, so the conversion from AirPlay "volume" to GStreamer linear "avol" is indeed

  1. in SPA decibels 3 dB doubles the power, 6dB doubles the sound-pressure amplitude, 10dB doubles the human-perceived loudness.

Current code: I see in GStreamer docs that avol should be a "gdouble" not a float. This divides the range -28dB: 0 into 28 linear steps from 0 to 1. I don't see why this was done.

void audio_renderer_set_volume(float volume) {
    float avol;
    if (fabs(volume) < 28) {
        avol = floorf(((28-fabs(volume))/28)*10)/10;
        g_object_set(renderer->volume, "volume", avol, NULL);
    }
}

The original RPiPlay code (uses OMX, not GStreamer) doesn't have this formula (which must be due to antimof) and is (with a mysterious factor of 2)

static void audio_renderer_rpi_set_volume(audio_renderer_t *renderer, float volume) {
    audio_renderer_rpi_t *r = (audio_renderer_rpi_t *)renderer;
    OMX_AUDIO_CONFIG_VOLUMETYPE audio_volume;
    memset(&audio_volume, 0, sizeof(audio_volume));
    audio_volume.nSize = sizeof(OMX_AUDIO_CONFIG_VOLUMETYPE);
    audio_volume.nVersion.nVersion = OMX_VERSION;

    audio_volume.bLinear = OMX_FALSE;
    audio_volume.nPortIndex = 100;
    // Factor 100 for dB -> mB (millibel)
    // It's not clear where the additional factor of 2 comes from,
    // but without it, volume is too high.
    audio_volume.sVolume.nValue = volume * 200.0;

    if (OMX_SetConfig(ilclient_get_handle(r->audio_renderer), OMX_IndexConfigAudioVolume,
                      &audio_volume) != OMX_ErrorNone) {
        logger_log(renderer->logger, LOGGER_DEBUG, "Could not set audio volume");
    }
}

* Looking at the OMX documentation below, RPiPlay (OMX) doubles the Apple volume to get amplitude dB (the mysterious factor of 2), essentially squaring the attenuation factor. Maybe this is scaling {-30dB : 0} to {-60dB : 0}?

The OMX documentation says:

4.1.39.1 Parameter Definitions The parameters for OMX_AUDIO_CONFIG_VOLUMETYPE are defined as follows.

fduncanh commented 6 months ago

@soul916

The uxplay branch "volume" : https://github.com/FDH2/UxPlay/tree/volume

EDIT now in main branch "master"

Contains a "correct" implementation of Airplay volume in the range -30db : 0 (with exactly -30 being mute), plus an option -db low[:high] that scales -30:0 into low:0 or low:high ("flat" decibel rescaling).

I didn't implement the two modifications of this from shairport-sync. Please see if using "-db x" for some value of x satisfies your needs.

fduncanh commented 6 months ago

@soul916 I now added the "dasl"-style volume-control tapering as an option "-taper" (it can be combined with the -db l[:h] decibel-range scaling).

It's now in the master branch of UxPlay.

EDIT the implementation is now in the set_volume callback in uxplay.cpp, no longer in renderers/audio_renderer_gstreamer.c

Hope it fills your needs.