signalwire / freeswitch

FreeSWITCH is a Software Defined Telecom Stack enabling the digital transformation from proprietary telecom switches to a versatile software implementation that runs on any commodity hardware. From a Raspberry PI to a multi-core server, FreeSWITCH can unlock the telecommunications potential of any device.
https://freeswitch.com/#getting-started
Other
3.64k stars 1.44k forks source link

RTP Media Timeout is broken for calls on hold #1947

Open shaunjstokes opened 1 year ago

shaunjstokes commented 1 year ago

If both 'media_hold_timeout_audio' and 'media_timeout_audio' are set, and 'media_timeout_audio' has a value less than 'media_hold_timeout_audio', then 'media_timeout_audio' is used instead of 'media_hold_timeout_audio' while the call is on hold.

This is a bug that's been introduced in 1.10.8 (and master), as a result of the following changes. src/switch_core_media.c

        if (engine->type == SWITCH_MEDIA_TYPE_AUDIO) {
            /* the values are in milliseconds, not in seconds as the deprecated rtp_timeout_sec */
            engine->max_missed_packets = (engine->read_impl.samples_per_second * engine->media_timeout / 1000) / engine->read_impl.samples_per_packet;

            switch_rtp_set_max_missed_packets(engine->rtp_session, engine->max_missed_packets);
            if (!engine->media_hold_timeout) {
                engine->media_hold_timeout = engine->media_timeout * 10;
            }

            engine->max_missed_hold_packets = (engine->read_impl.samples_per_second * engine->media_hold_timeout / 1000) / engine->read_impl.samples_per_packet;
        }

In this commit: https://github.com/signalwire/freeswitch/commit/beffab1d680104f2c469d0947a2a8b0e09d9c343

Reversing the above changes resolves the issue.

@dragos-oancea @andywolk can you please comment on the above, would you like me to create a pull request to roll this back?

dragos-oancea commented 1 year ago

Those changes are validated by an unit-test in the same commit (test_rtp_media_timeout) . I don't see a bug in the code snippet you shown, it's just us setting some vars. If you have a fix for the bug you've seen, propose it. We're supposed to go forward, not backwards.

shaunjstokes commented 1 year ago

The bug is simple to reproduce. 1) Dialplan action to export the following. media_timeout_audio=300000 media_hold_timeout_audio=1800000 2) Place a call from a local extension on FS to an external bridged destination. 3) Place the call on the local extension on hold via the device (i.e. Polycom VVX). 4) Verify that the device is not transmitting RTP to FS (using Wireshark or similar). 5) Observe that RTP Media Timeout is triggered by FS after 300000ms when it should be triggered after 1800000ms.

If we remove the above changes and repeat the tests as follows, there are no further issues.

Test call on hold. 1) Dialplan action to export the following. media_timeout_audio=300000 media_hold_timeout_audio=1800000 2) Place a call from a local extension on FS to an external bridged destination. 3) Place the call on the local extension on hold via the device (i.e. Polycom VVX). 4) Verify that the device is not transmitting RTP to FS (using Wireshark or similar). 5) Observe that RTP Media Timeout is triggered by FS after 1800000ms.

Test call with-out hold. 1) Dialplan action to export the following. media_timeout_audio=300000 media_hold_timeout_audio=1800000 2) Place a call from a local extension on FS to an external bridged destination. 3) Block RTP traffic between the local extension device and FS. 4) Verify that the device is not transmitting RTP to FS (using Wireshark or similar). 5) Observe that RTP Media Timeout is triggered by FS after 300000ms.

I'm not sure why that change causes FS to not use the correct media timeout variables, but removing it fixes the problem.

shaunjstokes commented 1 year ago

@dragos-oancea what's the purpose of that change, as far as I can tell media timeout works correctly with-out it? If it's to set 'media_hold_timeout' when not set, then perhaps this would be a better approach?

if (!engine->media_hold_timeout && engine->media_timeout) {
    engine->media_hold_timeout = engine->media_timeout * 10;
}
dragos-oancea commented 1 year ago

@shaunjstokes , can you please try https://github.com/signalwire/freeswitch/pull/1949 ?

shaunjstokes commented 1 year ago

@dragos-oancea unfortunately the problem is the same following that patch.

admin-toneca commented 6 months ago

This bug also reproduces in 1.10.9.