signalwire / freeswitch

FreeSWITCH is a Software Defined Telecom Stack enabling the digital transformation from proprietary telecom switches to a versatile software implementation that runs on any commodity hardware. From a Raspberry PI to a multi-core server, FreeSWITCH can unlock the telecommunications potential of any device.
https://freeswitch.com/#getting-started

FIFO Queue Calls flip flop between members after rejoining the queue #1855

Open mbrooks opened 1 year ago

mbrooks commented 1 year ago

When using mod_fifo, calls flip-flop between members after a member rejoins the queue, even when lag is set to 0.

To Reproduce

Created dialplan/public/cool_fifo.xml:

<include>
        <extension name="Queue_Call_In">
          <condition field="destination_number" expression="^7011$">
            <action application="set" data="fifo_music=$${hold_music}"/>
            <action application="answer"/>
            <action application="fifo" data="myq in"/>
          </condition>
        </extension>
</include>

Modified autoload_configs/fifo.conf.xml:

<configuration name="fifo.conf" description="FIFO Configuration">
  <settings>
    <param name="delete-all-outbound-member-on-startup" value="true"/>
  </settings>
  <fifos>
    <fifo name="myq" importance="1" outbound_per_cycle="3">
        <member timeout="12" simo="1" lag="0">{member_wait=nowait}user/1003@$${domain}</member>
        <member timeout="12" simo="1" lag="0">{member_wait=nowait}user/1004@$${domain}</member>
        <member timeout="12" simo="1" lag="0">{member_wait=nowait}user/1005@$${domain}</member>
    </fifo>
  </fifos>
</configuration>

Steps to reproduce the behavior:

  1. Customer 1 calls into the queue
  2. Member 1 (1004) answers the call
  3. Customer 2 calls into the queue
  4. Member 1 (1004) hangs up the call with Customer 1
  5. Calls then ring members alternately rather than ringing all available members at the same time.

Expected behavior

When lag is set to 0, we expect all available members to be called each time a new call cycle starts. Instead, FreeSWITCH falls back to something closer to round robin and alternates between ringing the newly available members and the members it was calling before. This leaves us in a flip-flop state with regard to when phones are called.

Package version or git hash

Other notes

I’ve patched FreeSWITCH to remove the extra 1 second from the lag + 1 delay, and calls now ring all members as expected, so this is most likely the cause. That said, there is probably a reason the + 1 was added here, and I don't know what would break if this sanity check is removed.

My Patch:

diff --git a/src/mod/applications/mod_fifo/mod_fifo.c b/src/mod/applications/mod_fifo/mod_fifo.c
index 724d1dd24f..c332766270 100644
--- a/src/mod/applications/mod_fifo/mod_fifo.c
+++ b/src/mod/applications/mod_fifo/mod_fifo.c
@@ -1670,7 +1670,7 @@ static void *SWITCH_THREAD_FUNC outbound_ringall_thread_run(switch_thread_t *thr
                                        char *sql = switch_mprintf("update fifo_outbound set ring_count=ring_count-1, "
                                                                                           "outbound_fail_count=outbound_fail_count+1, "
                                                                                           "outbound_fail_total_count = outbound_fail_total_count+1, "
-                                                                                          "next_avail=%ld + lag + 1 where uuid='%q' and ring_count > 0",
+                                                                                          "next_avail=%ld + lag where uuid='%q' and ring_count > 0",
                                                                                           (long) switch_epoch_time_now(NULL) + node->retry_delay, h->uuid);
                                        fifo_execute_sql_queued(&sql, SWITCH_TRUE, SWITCH_TRUE);
                                }
@@ -1850,7 +1850,7 @@ static void *SWITCH_THREAD_FUNC outbound_enterprise_thread_run(switch_thread_t *

        if (status != SWITCH_STATUS_SUCCESS) {
                sql = switch_mprintf("update fifo_outbound set ring_count=ring_count-1, "
-                                                        "outbound_fail_count=outbound_fail_count+1, next_avail=%ld + lag + 1 where uuid='%q'",
+                                                        "outbound_fail_count=outbound_fail_count+1, next_avail=%ld + lag where uuid='%q'",
                                                         (long) switch_epoch_time_now(NULL) + (node ? node->retry_delay : 0), h->uuid);
                fifo_execute_sql_queued(&sql, SWITCH_TRUE, SWITCH_TRUE);

@@ -2400,7 +2400,7 @@ static void dec_use_count(switch_core_session_t *session, const char *type)
                fifo_execute_sql_queued(&sql, SWITCH_TRUE, SWITCH_FALSE);

                del_bridge_call(outbound_id);
-               sql = switch_mprintf("update fifo_outbound set use_count=use_count-1, stop_time=%ld, next_avail=%ld + lag + 1 where use_count > 0 and uuid='%q'",
+               sql = switch_mprintf("update fifo_outbound set use_count=use_count-1, stop_time=%ld, next_avail=%ld + lag where use_count > 0 and uuid='%q'",
                                                         now, now, outbound_id);
                fifo_execute_sql_queued(&sql, SWITCH_TRUE, SWITCH_TRUE);
                fifo_dec_use_count(outbound_id);
@@ -3450,7 +3450,7 @@ SWITCH_STANDARD_APP(fifo_function)

                                        sql = switch_mprintf("update fifo_outbound set stop_time=%ld, use_count=use_count-1, "
                                                                                 "outbound_call_total_count=outbound_call_total_count+1, "
-                                                                                "outbound_call_count=outbound_call_count+1, next_avail=%ld + lag + 1 where uuid='%q' and use_count > 0",
+                                                                                "outbound_call_count=outbound_call_count+1, next_avail=%ld + lag where uuid='%q' and use_count > 0",
                                                                                 now, now, outbound_id);

                                        fifo_execute_sql_queued(&sql, SWITCH_TRUE, SWITCH_TRUE);
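
For readers weighing the patch above, here is a minimal sketch in C of the arithmetic it changes. The helper names are hypothetical and not part of mod_fifo, and the eligibility rule (a member can be rung once the current epoch second reaches next_avail) is an assumption about how mod_fifo selects members, not something confirmed in this report.

```c
#include <stdbool.h>

/* Hypothetical helpers for illustration only; none of these exist in mod_fifo. */

/* Stock SQL expression from the diff: next_avail = now + lag + 1 */
long next_avail_stock(long now, long lag)
{
    return now + lag + 1;
}

/* Patched SQL expression from the diff: next_avail = now + lag */
long next_avail_patched(long now, long lag)
{
    return now + lag;
}

/*
 * ASSUMPTION: a member may be rung once the current epoch second
 * has reached the stored next_avail value.
 */
bool member_is_eligible(long now, long next_avail)
{
    return now >= next_avail;
}
```

With lag set to 0 and a call ending at epoch second 1000, the stock expression yields a next_avail of 1001, so under the assumed rule the member is skipped by any ring cycle that starts within that same second. The patched expression yields 1000, and the member is eligible immediately. In the two ring-failure updates the diff also adds node->retry_delay to the timestamp, which this sketch leaves out.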

netpro25 commented 7 months ago

@mbrooks Did the patch end up working for you in the long run? We are having a similar issue and it looks like this issue is not being addressed.