pjsip / pjproject

PJSIP project
http://www.pjsip.org
GNU General Public License v2.0
2.02k stars 770 forks source link

Invite session ref count add assertion failure #2443

Closed kharwell closed 4 years ago

kharwell commented 4 years ago

It appears at least in part, the fix for issue 2240 that went into 2.10 has caused a regression.

This will causes a crash for us every time we run one of Asterisk's testsuite tests. The test sends an INVITE to a non-responsive address:port. After timing out a failover occurs, and the test attempts to send an INVITE to a responsive address:port. However, after configuring Asterisk to use pjproject 2.10 the following crash is observed when executing that test:

#1  0x00007fac273d6801 in __GI_abort () at abort.c:79
#2  0x00007fac273c639a in __assert_fail_base (fmt=0x7fac2754d7d8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7fac2a124209 "inv && inv->ref_cnt", file=file@entry=0x7fac2a1241ef "../src/pjsip-ua/sip_inv.c", line=line@entry=203, function=function@entry=0x7fac2a1250c0 <__PRETTY_FUNCTION__.8603> "pjsip_inv_add_ref") at assert.c:92
#3  0x00007fac273c6412 in __GI___assert_fail (assertion=0x7fac2a124209 "inv && inv->ref_cnt", file=0x7fac2a1241ef "../src/pjsip-ua/sip_inv.c", line=203, function=0x7fac2a1250c0 <__PRETTY_FUNCTION__.8603> "pjsip_inv_add_ref") at assert.c:101
#4  0x00007fac2a012984 in pjsip_inv_add_ref (inv=0x55b521d1b1c8) at ../src/pjsip-ua/sip_inv.c:203
#5  0x00007fac2a012d38 in inv_set_state (inv=0x55b521d1b1c8, state=PJSIP_INV_STATE_EARLY, e=0x7faba4a90a40) at ../src/pjsip-ua/sip_inv.c:321
#6  0x00007fac2a01ab7c in inv_on_state_calling (inv=0x55b521d1b1c8, e=0x7faba4a90a40) at ../src/pjsip-ua/sip_inv.c:4228
#7  0x00007fac2a013751 in mod_inv_on_tsx_state (tsx=0x55b521cdca08, e=0x7faba4a90a40) at ../src/pjsip-ua/sip_inv.c:726
#8  0x00007fac2a06c618 in pjsip_dlg_on_tsx_state (dlg=0x55b521c98628, tsx=0x55b521cdca08, e=0x7faba4a90a40) at ../src/pjsip/sip_dialog.c:2129
#9  0x00007fac2a06d07d in mod_ua_on_tsx_state (tsx=0x55b521cdca08, e=0x7faba4a90a40) at ../src/pjsip/sip_ua_layer.c:178
#10 0x00007fac2a0633fa in tsx_set_state (tsx=0x55b521cdca08, state=PJSIP_TSX_STATE_PROCEEDING, event_src_type=PJSIP_EVENT_RX_MSG, event_src=0x7fab744744d8, flag=0) at ../src/pjsip/sip_transaction.c:1272
#11 0x00007fac2a066b4c in tsx_on_state_proceeding_uac (tsx=0x55b521cdca08, event=0x7faba4a90b30) at ../src/pjsip/sip_transaction.c:2975
#12 0x00007fac2a06608c in tsx_on_state_calling (tsx=0x55b521cdca08, event=0x7faba4a90b30) at ../src/pjsip/sip_transaction.c:2599
#13 0x00007fac2a0646b5 in pjsip_tsx_recv_msg (tsx=0x55b521cdca08, rdata=0x7fab744744d8) at ../src/pjsip/sip_transaction.c:1832
#14 0x00007fac2a0627ce in mod_tsx_layer_on_rx_response (rdata=0x7fab744744d8) at ../src/pjsip/sip_transaction.c:893
#15 0x00007fac2a0434ce in pjsip_endpt_process_rx_data (endpt=0x55b52058f3c8, rdata=0x7fab744744d8, p=0x7fabb285ab80 <param>, p_handled=0x7faba4a90c6c) at ../src/pjsip/sip_endpoint.c:938
#16 0x00007fabb262c65e in distribute (data=0x7fab744744d8) at res_pjsip/pjsip_distributor.c:955
#17 0x000055b51eb5b158 in ast_taskprocessor_execute (tps=0x55b520733e20) at taskprocessor.c:1237
#18 0x000055b51eb69389 in execute_tasks (data=0x55b520733e20) at threadpool.c:1354
#19 0x000055b51eb5b158 in ast_taskprocessor_execute (tps=0x55b520528310) at taskprocessor.c:1237
#20 0x000055b51eb66cb9 in threadpool_execute (pool=0x55b52035c570) at threadpool.c:367
#21 0x000055b51eb68ab9 in worker_active (worker=0x7fab7c001450) at threadpool.c:1137
#22 0x000055b51eb68829 in worker_start (arg=0x7fab7c001450) at threadpool.c:1056
#23 0x000055b51eb72b2e in dummy_start (data=0x7fab7c001690) at utils.c:1249
#24 0x00007fac282c66db in start_thread (arg=0x7faba4a91700) at pthread_create.c:463
#25 0x00007fac274b788f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

When I revert this patch the test passes successfully, and no crash occurs.

Here's some more info that might be helpful (from the gdb core file):

$2 = (pjsip_inv_session *) 0x7f4cd804d588
(gdb) p inv->ref_cnt
$3 = (pj_atomic_t *) 0x0
(gdb) p *inv
$4 = {obj_name = "inv0x7f4cd80138d8", '\000' <repeats 14 times>, pool = 0x7f4cd8013830, pool_prov = 0x0, pool_active = 0x0, state = PJSIP_INV_STATE_EARLY, 
  cancelling = 0, pending_cancel = 0, pending_bye = 0x0, cause = PJSIP_SC_REQUEST_TIMEOUT, cause_text = {ptr = 0x7f4cd8019b48 "Request TimeoutU\210\325\004\330L\177", 
    slen = 15}, notify = 1, cb_called = 74, dlg = 0x7f4cd80138d8, role = PJSIP_ROLE_UAC, options = 3, neg = 0x7f4cd804d850, sdp_neg_flags = 1, 
  invite_tsx = 0x55f454ccb688, invite_req = 0x0, last_answer = 0x0, last_ack = 0x0, last_ack_cseq = 0, mod_data = {0x0 <repeats 18 times>, 0x7f4cd80697f0, 
    0x0 <repeats 13 times>}, timer = 0x7f4cd804d7d0, following_fork = 0, ref_cnt = 0x0, updated_sdp_answer = 0}

Also I'll attach the backtrace, and Asterisk debug log for the test run that, while containing a bunch of Asterisk related stuff, also contains the full SIP trace of the "calls", and I also enabled the pjproject debugging so log data from there should also be included. Jump to the end of the file to see things just before the crash.

sauwming commented 4 years ago

Could you please try the attached patch and check if it resolves the issue?

sip_inv_early_add_ref.txt

kharwell commented 4 years ago

I tested the patch, and unfortunately it seems to still fail. Looks like still in the same way:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f74cb61c859 in __GI_abort () at abort.c:79
#2  0x00007f74cb61c729 in __assert_fail_base (fmt=0x7f74cb7b2588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x7f74cc382b09 "inv && inv->ref_cnt", file=0x7f74cc382aef "../src/pjsip-ua/sip_inv.c", line=203, function=<optimized out>) at assert.c:92
#3  0x00007f74cb62df36 in __GI___assert_fail (assertion=0x7f74cc382b09 "inv && inv->ref_cnt", file=0x7f74cc382aef "../src/pjsip-ua/sip_inv.c", line=203, function=0x7f74cc383980 <__PRETTY_FUNCTION__.41> "pjsip_inv_add_ref") at assert.c:101
#4  0x00007f74cc270656 in pjsip_inv_add_ref (inv=0x564a74c4e8e8) at ../src/pjsip-ua/sip_inv.c:203
#5  0x00007f74cc270a12 in inv_set_state (inv=0x564a74c4e8e8, state=PJSIP_INV_STATE_EARLY, e=0x7f74799d6a40) at ../src/pjsip-ua/sip_inv.c:321
#6  0x00007f74cc27890e in inv_on_state_calling (inv=0x564a74c4e8e8, e=0x7f74799d6a40) at ../src/pjsip-ua/sip_inv.c:4232
#7  0x00007f74cc27144a in mod_inv_on_tsx_state (tsx=0x7f7448007728, e=0x7f74799d6a40) at ../src/pjsip-ua/sip_inv.c:726
#8  0x00007f74cc2ca783 in pjsip_dlg_on_tsx_state (dlg=0x564a74c4e158, tsx=0x7f7448007728, e=0x7f74799d6a40) at ../src/pjsip/sip_dialog.c:2129
#9  0x00007f74cc2cb202 in mod_ua_on_tsx_state (tsx=0x7f7448007728, e=0x7f74799d6a40) at ../src/pjsip/sip_ua_layer.c:178
#10 0x00007f74cc2c14bf in tsx_set_state (tsx=0x7f7448007728, state=PJSIP_TSX_STATE_PROCEEDING, event_src_type=PJSIP_EVENT_RX_MSG, event_src=0x7f74482d5ac8, flag=0) at ../src/pjsip/sip_transaction.c:1272
#11 0x00007f74cc2c4c30 in tsx_on_state_proceeding_uac (tsx=0x7f7448007728, event=0x7f74799d6b30) at ../src/pjsip/sip_transaction.c:2975
#12 0x00007f74cc2c4167 in tsx_on_state_calling (tsx=0x7f7448007728, event=0x7f74799d6b30) at ../src/pjsip/sip_transaction.c:2599
#13 0x00007f74cc2c2795 in pjsip_tsx_recv_msg (tsx=0x7f7448007728, rdata=0x7f74482d5ac8) at ../src/pjsip/sip_transaction.c:1832
#14 0x00007f74cc2c0863 in mod_tsx_layer_on_rx_response (rdata=0x7f74482d5ac8) at ../src/pjsip/sip_transaction.c:893
#15 0x00007f74cc2a16a5 in pjsip_endpt_process_rx_data (endpt=0x564a73f1cfe8, rdata=0x7f74482d5ac8, p=0x7f7480ab36a0 <param>, p_handled=0x7f74799d6c6c) at ../src/pjsip/sip_endpoint.c:938
#16 0x00007f7480a7d00c in distribute (data=0x7f74482d5ac8) at res_pjsip/pjsip_distributor.c:955
#17 0x0000564a730ae060 in ast_taskprocessor_execute (tps=0x564a73b15b90) at taskprocessor.c:1237
#18 0x0000564a730bc6ea in execute_tasks (data=0x564a73b15b90) at threadpool.c:1354
#19 0x0000564a730ae060 in ast_taskprocessor_execute (tps=0x564a73f19150) at taskprocessor.c:1237
#20 0x0000564a730b9f24 in threadpool_execute (pool=0x564a73f18e00) at threadpool.c:367
#21 0x0000564a730bbddc in worker_active (worker=0x7f7450001160) at threadpool.c:1137
#22 0x0000564a730bbb3f in worker_start (arg=0x7f7450001160) at threadpool.c:1056
#23 0x0000564a730c60b0 in dummy_start (data=0x7f7450001100) at utils.c:1249
#24 0x00007f74cbadc609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#25 0x00007f74cb719103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
sauwming commented 4 years ago

Please try again with this new patch here: sip_inv_patch.txt

kharwell commented 4 years ago

I have tested the patch, and we are no longer experiencing the crash. I'd say it seems to resolve the issue.

Thanks much for the fix!