OpenSIPS / opensips

OpenSIPS is a GPL implementation of a multi-functionality SIP Server that targets to deliver a high-level technical solution (performance, security and quality) to be used in professional SIP server platforms.
https://opensips.org
Other
1.26k stars 577 forks source link

[BUG] CANCEL does not terminate transaction in b2b mode with clusterer #3010

Closed denyspozniak closed 1 year ago

denyspozniak commented 1 year ago

OpenSIPS version you are running

# opensips -V
version: opensips 3.3.3 (x86_64/linux)
flags: STATS: On, DISABLE_NAGLE, USE_MCAST, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, HP_MALLOC, DBG_MALLOC, FAST_LOCK-ADAPTIVE_WAIT
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535
poll method support: poll, epoll, sigio_rt, select.
main.c compiled on 12:26:27 Dec 20 2022 with gcc 4.8.5

Describe the bug

We have two cluster nodes (node1 and node2) with sharing anycast IP (anycast + b2b + clusterer). If the CANCEL message arrives not on the node on which the transaction was created, then it looks like the request itself is replicated to the owner of the transaction, but does not fall into the b2b context on it. When the b2b is off, everything works as it should.

To Reproduce

loadmodule "b2b_entities.so"
modparam("b2b_entities", "db_mode", 0)
modparam("b2b_entities", "cluster_id", 1)

loadmodule "b2b_logic.so"
modparam("b2b_logic", "db_mode", 0)
modparam("b2b_logic", "script_req_route", "b2b_logic_request")
modparam("b2b_logic", "script_reply_route", "b2b_logic_reply")

route[B2B_IN] {
    if ( is_method("INVITE") && !has_totag() ) {

        b2b_server_new("server1");
        b2b_client_new("client1", "sip:100@192.168.56.106:5080");
        b2b_init_request("prepaid");
        exit;
    }
}

route[b2b_logic_request] {
    b2b_pass_request();
    }

route[b2b_logic_reply] {
    b2b_handle_reply();
}

Relevant System Logs Attached are two syslog files:

opensips_node1_tm_owner.txt opensips_node2_rcv_CANCEL.txt

node2:

Feb 13 10:18:58 localhost.localdomain /usr/sbin/opensips[1465]: DBG:core:parse_msg: SIP Request:
Feb 13 10:18:58 localhost.localdomain /usr/sbin/opensips[1465]: DBG:core:parse_msg:  method:  <CANCEL>
Feb 13 10:18:58 localhost.localdomain /usr/sbin/opensips[1465]: DBG:b2b_entities:b2b_prescript_f: No dialog found for cancel
Feb 13 10:18:58 localhost.localdomain /usr/sbin/opensips[1465]: DBG:tm:matching_3261: RFC3261 transaction matching failed
Feb 13 10:18:58 localhost.localdomain /usr/sbin/opensips[1465]: DBG:tm:t_lookupOriginalT: no CANCEL matching found!
Feb 13 10:18:58 localhost.localdomain /usr/sbin/opensips[1465]: DBG:clusterer:match_node: matched node 1
Feb 13 10:18:58 localhost.localdomain /usr/sbin/opensips[1465]: DBG:proto_bin:proto_bin_send: sending via fd 5...
Feb 13 10:18:58 localhost.localdomain /usr/sbin/opensips[1465]: DBG:core:tsend_stream_async: Async successful write from first try on 0x7fc1b14b4868
Feb 13 10:18:58 localhost.localdomain /usr/sbin/opensips[1465]: DBG:proto_bin:proto_bin_send: after write: c= 0x7fc1b14b4868 n/len=539/539 fd=5
Feb 13 10:18:58 localhost.localdomain /usr/sbin/opensips[1465]: DBG:clusterer:msg_send_retry: sent bin packet to node [1]
Feb 13 10:18:58 localhost.localdomain /usr/sbin/opensips[1465]: DBG:core:destroy_avp_list: destroying list 0x7fc1b14b5f20
Feb 13 10:18:58 localhost.localdomain /usr/sbin/opensips[1465]: DBG:core:receive_msg: cleaning up

node1:

Feb 13 10:18:57 localhost.localdomain /usr/sbin/opensips[1768]: DBG:tm:tm_repl_cancel: Got CANCEL with branch id=z9hG4bKcd852d71f8199ee9
Feb 13 10:18:57 localhost.localdomain /usr/sbin/opensips[1768]: DBG:tm:t_lookupOriginalT: searching on hash entry 2771
Feb 13 10:18:57 localhost.localdomain /usr/sbin/opensips[1768]: DBG:tm:matching_3261: RFC3261 transaction matched, tid=cd852d71f8199ee9
Feb 13 10:18:57 localhost.localdomain /usr/sbin/opensips[1768]: DBG:tm:t_lookupOriginalT: canceled transaction found (0x7fbf44c658d8)!
Feb 13 10:18:57 localhost.localdomain /usr/sbin/opensips[1768]: DBG:tm:t_lookupOriginalT: REF_UNSAFE:[0x7fbf44c658d8] after is 2
Feb 13 10:18:57 localhost.localdomain /usr/sbin/opensips[1768]: DBG:tm:run_any_trans_callbacks: trans=0x7fbf44c658d8, callback type 2048, id 0 entered
Feb 13 10:18:57 localhost.localdomain /usr/sbin/opensips[1768]: DBG:dialog:init_dlg_term_reason: Setting DLG term reason to [Cancelled]
Feb 13 10:18:57 localhost.localdomain /usr/sbin/opensips[1768]: DBG:tm:t_lookupOriginalT: t_lookupOriginalT completed
Feb 13 10:18:57 localhost.localdomain /usr/sbin/opensips[1768]: DBG:core:parse_msg: SIP Request:
Feb 13 10:18:57 localhost.localdomain /usr/sbin/opensips[1768]: DBG:core:parse_msg:  method:  <CANCEL>
Feb 13 10:18:57 localhost.localdomain /usr/sbin/opensips[1768]: DBG:core:parse_msg:  uri:     <sip:100@192.168.100.100;transport=udp>
Feb 13 10:18:58 localhost.localdomain /usr/sbin/opensips[1768]: DBG:tm:tm_repl_cancel: successfully handled auto-CANCEL for 0x7fbf44c658d8
....
Feb 13 10:19:16 localhost.localdomain /usr/sbin/opensips[1768]: DBG:tm:cancel_branch: sending cancel...
Feb 13 10:19:16 localhost.localdomain /usr/sbin/opensips[1768]: DBG:tm:set_timer: relative timeout is 500000
Feb 13 10:19:16 localhost.localdomain /usr/sbin/opensips[1768]: DBG:tm:insert_timer_unsafe: [4]: 0x7fbf44c68fe0 (1778400000)
Feb 13 10:19:16 localhost.localdomain /usr/sbin/opensips[1768]: DBG:tm:insert_timer_unsafe: [0]: 0x7fbf44c69010 (1783)
Feb 13 10:19:16 localhost.localdomain /usr/sbin/opensips[1768]: DBG:tm:final_response_handler: Cancel sent out, sending 408 (0x7fbf44c68cf8)
Feb 13 10:19:16 localhost.localdomain /usr/sbin/opensips[1768]: DBG:tm:t_should_relay_response: T_code=180, new_code=408
Feb 13 10:19:16 localhost.localdomain /usr/sbin/opensips[1768]: DBG:tm:t_pick_branch: picked branch 0, code 408 (prio=800)
Feb 13 10:19:16 localhost.localdomain /usr/sbin/opensips[1768]: DBG:tm:is_3263_failure: dns-failover test: branch=0, last_recv=408, flags=2
Feb 13 10:19:16 localhost.localdomain /usr/sbin/opensips[1768]: DBG:tm:local_reply: branch=0, save=0, winner=0
Feb 13 10:19:16 localhost.localdomain /usr/sbin/opensips[1768]: DBG:tm:local_reply: local transaction completed
Feb 13 10:19:16 localhost.localdomain /usr/sbin/opensips[1768]: DBG:tm:run_any_trans_callbacks: trans=0x7fbf44c68cf8, callback type 256, id 0 entered
Feb 13 10:19:16 localhost.localdomain /usr/sbin/opensips[1768]: DBG:b2b_entities:b2b_tm_cback: tm [0x7fbf44c68cf8] notification cb for [408] reply
Feb 13 10:19:16 localhost.localdomain /usr/sbin/opensips[1768]: DBG:b2b_entities:b2b_parse_key: hash_index = [171]  - local_index= [3824181]
Feb 13 10:19:16 localhost.localdomain /usr/sbin/opensips[1768]: DBG:b2b_entities:b2b_tm_cback: Received reply [408] for dialog [0x7fbf44c5bde0], method [INVITE]

OS/environment information

denyspozniak commented 1 year ago

The presence of the Trying-2 indicates that the request did not fall into context b2b (this screenshot does not match the debug log from the message above):

image

github-actions[bot] commented 1 year ago

Any updates here? No progress has been made in the last 15 days, marking as stale. Will close this issue if no further updates are made in the next 30 days.

github-actions[bot] commented 1 year ago

Marking as closed due to lack of progress for more than 30 days. If this issue is still relevant, please re-open it with additional details.