Open FatimaJaffal opened 12 months ago
Following up:
In core-asterisk-running-2023-11-13T13-51-36Z-brief.txt
dump there are a lot of threads that seem to have a corruption in the stackframe:
thread apply all bt
to get all the backtraces.#11 0x0000557d57f2fcf0 in __ao2_lock (user_data=0x0, lock_how=AO2_LOCK_REQ_MUTEX, file=0x9408ed9800007fad <error: Cannot access memory at address 0x9408ed9800007fad>, func=0x5805983800000000 <error: Cannot access memory at address 0x5805983800000000>, line=32684, var=0x0) at astobj2.c:253
cat bts | grep -B13 corrupt | grep Thread | awk '{print $6}' | sed "s/)//" | sort -n
for x in $(cat lwps); do grep $x threads; done
Here are the results, almost all those threads are at exactly the same place:
0x7faf6d769640 1581969 do_monitor started at [30114] chan_sip.c restart_monitor()
0x7fab6a5a1640 4061002 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab55c2b640 4084886 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab5a22f640 4085611 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7facb2fe2640 4085748 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab5be81640 4085887 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab4a812640 4085926 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab5f531640 4085930 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab431d6640 4085969 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7facb0ac6640 4086050 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab5c845640 4086106 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7facb1dd1640 4086369 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab683f0640 4086639 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab5e41a640 4087371 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab2b1d6640 4087659 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab5e708640 4087862 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab54870640 4087945 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7facb148a640 4087953 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab40824640 4087998 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab58c36640 4088003 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab5a2ac640 4088088 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab5a59a640 4088134 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab58e2a640 4088293 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fae5305f640 4088335 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7faf6d4f8640 4088510 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab43e88640 4088525 bridge_channel_depart_thread started at [ 1848] bridge.c bridge_impart_internal()
0x7fab57495640 4088853 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab60855640 4089200 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab6148a640 4089368 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab531bc640 4089549 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab60d37640 4090008 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab62a83640 4090085 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab6075b640 4090240 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab6a812640 4090362 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab4891e640 4090398 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab5f437640 4090470 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab6967e640 4090597 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab4aa06640 4090599 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab4b9a6640 4090708 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7facb0cba640 4090737 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab541e6640 4090833 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab6af65640 4090838 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab5db50640 4090951 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7faf6c170640 4091179 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab42e6b640 4091261 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab4288f640 4091293 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab4b7b2640 4091316 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab5f149640 4091386 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab6a795640 4091498 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab50741640 4091507 bridge_channel_depart_thread started at [ 1848] bridge.c bridge_impart_internal()
0x7fab5c93f640 4091546 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab567e3640 4091601 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab6b541640 4091602 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab6b34d640 4091654 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab55f96640 4091809 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab5b922640 4091820 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab4916b640 4091822 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab2bc17640 4091832 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab6b5be640 4091910 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab5c74b640 4092017 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab63929640 4092092 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab4935f640 4092105 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fae508ed640 4092157 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab6a69b640 4092197 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab58a42640 4092235 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7facb37b2640 4092292 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab6996c640 4092321 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab5b5b7640 4092358 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab5ae64640 4092363 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab59500640 4092445 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab6a236640 4092474 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab2b63b640 4092503 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab6bd11640 4092571 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7fab68085640 4092589 ari_channel_thread started at [ 1929] ari/resource_channels.c ast_ari_channels_create()
0x7facb1872640 4092652 bridge_channel_depart_thread started at [ 1848] bridge.c bridge_impart_internal()
Thread 1041 (Thread 0x7facb1872640 (LWP 4092652) "asterisk"):
#0 0x00007faf71e91059 in __futex_abstimed_wait_common (cancel=true, private=0, abstime=0x7faf71e91059 <__GI___futex_abstimed_wait_cancelable64+41>, clockid=<optimized out>, expected=0, futex_word=0x557d59e626fc) at ./nptl/futex-internal.c:83
#1 __GI___futex_abstimed_wait_cancelable64 (futex_word=0x557d59e626fc, expected=0, clockid=<optimized out>, abstime=0x7faf71e91059 <__GI___futex_abstimed_wait_cancelable64+41>, private=0) at ./nptl/futex-internal.c:139
#2 0x0000557d581a8b50 in ?? ()
#3 0x0000557d581a7ec1 in ?? ()
#4 0x0000557d59e626f0 in ?? ()
#5 0x0000557d581a8b50 in ?? ()
#6 0x00000452e009cdc8 in ?? ()
#7 0x0000557d581a7e59 in ast_stream_topology_changed_external ()
#8 0x00000002b1871be0 in ?? ()
#9 0x0000000059e5c498 in ?? ()
#10 0x00007facb1871c00 in ?? ()
#11 0x0000557d57f2fcf0 in __ao2_lock (user_data=0x0, lock_how=AO2_LOCK_REQ_MUTEX, file=0x9408ed9800007fad <error: Cannot access memory at address 0x9408ed9800007fad>, func=0x5805983800000000 <error: Cannot access memory at address 0x5805983800000000>, line=32684, var=0x0) at astobj2.c:253
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
Thread 1033 (Thread 0x7fab68085640 (LWP 4092589) "asterisk"):
#0 0x00007faf71e91059 in __futex_abstimed_wait_common (cancel=true, private=0, abstime=0x7faf71e91059 <__GI___futex_abstimed_wait_cancelable64+41>, clockid=<optimized out>, expected=0, futex_word=0x557d59e626fc) at ./nptl/futex-internal.c:83
#1 __GI___futex_abstimed_wait_cancelable64 (futex_word=0x557d59e626fc, expected=0, clockid=<optimized out>, abstime=0x7faf71e91059 <__GI___futex_abstimed_wait_cancelable64+41>, private=0) at ./nptl/futex-internal.c:139
#2 0x0000557d581a8b50 in ?? ()
#3 0x0000557d581a7ec1 in ?? ()
#4 0x0000557d59e626f0 in ?? ()
#5 0x0000557d581a8b50 in ?? ()
#6 0x00000452e016d508 in ?? ()
#7 0x0000557d581a7e59 in ast_stream_topology_changed_external ()
#8 0x00000002680845c0 in ?? ()
#9 0x0000000059e5c498 in ?? ()
#10 0x00007fab680845e0 in ?? ()
#11 0x0000557d57f2fcf0 in __ao2_lock (user_data=0x0, lock_how=AO2_LOCK_REQ_MUTEX, file=0x9408ed9800007fad <error: Cannot access memory at address 0x9408ed9800007fad>, func=0x5805983800000000 <error: Cannot access memory at address 0x5805983800000000>, line=32684, var=0x0) at astobj2.c:253
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
Thread 1031 (Thread 0x7fab6bd11640 (LWP 4092571) "asterisk"):
#0 0x00007faf71e91059 in __futex_abstimed_wait_common (cancel=true, private=0, abstime=0x7faf71e91059 <__GI___futex_abstimed_wait_cancelable64+41>, clockid=<optimized out>, expected=0, futex_word=0x557d59e626fc) at ./nptl/futex-internal.c:83
#1 __GI___futex_abstimed_wait_cancelable64 (futex_word=0x557d59e626fc, expected=0, clockid=<optimized out>, abstime=0x7faf71e91059 <__GI___futex_abstimed_wait_cancelable64+41>, private=0) at ./nptl/futex-internal.c:139
#2 0x0000557d581a8b50 in ?? ()
#3 0x0000557d581a7ec1 in ?? ()
#4 0x0000557d59e626f0 in ?? ()
#5 0x0000557d581a8b50 in ?? ()
#6 0x000004529800fbb8 in ?? ()
#7 0x0000557d581a7e59 in ast_stream_topology_changed_external ()
#8 0x000000026bd105c0 in ?? ()
#9 0x0000000059e5c498 in ?? ()
#10 0x00007fab6bd105e0 in ?? ()
#11 0x0000557d57f2fcf0 in __ao2_lock (user_data=0x0, lock_how=AO2_LOCK_REQ_MUTEX, file=0x9408ed9800007fad <error: Cannot access memory at address 0x9408ed9800007fad>, func=0x5805983800000000 <error: Cannot access memory at address 0x5805983800000000>, line=32684, var=0x0) at astobj2.c:253
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
Following up with our latest findings:
We tried various steps to debug and fix the issue:
Downgraded Asterisk to 18.12.1: Result: The issue persisted. Coredump files: core-asterisk-running-2023-12-26T12-15-26Z-full.txt core-asterisk-running-2023-12-26T12-15-26Z-brief.txt
Reverted OPUS Configurations: Removed the OPUS configurations applied before the issue (see below). Result: The issue was resolved.
OPUS configs:
codecs.conf
[opus]
type=opus
signal=voice
fec=yes
max_bandwidth=narrow
max_playback_rate=8000
cbr=yes
bitrate=8000
complexity=4
packet_loss=0
modules.conf
load=res_format_attr_opus
Conclusion: Enabling FEC for OPUS seems to be causing the issue.
Severity
Major
Versions
18.17.1, 20.5.0
Components/Modules
chan_sip,bridge,channel,ari
Operating Environment
EC2 Ubuntu 22.04.2 LTS jammy Asterisk built from sources
Frequency of Occurrence
Occasional
Issue Description
After a few days of running Asterisk on the server it stops responding to SIP signals. This start happening occasionally and we're not facing unprecedented load.
We noticed the following:
We generated core dumps on different servers when the issue happened:
core-asterisk-running-2023-11-05T20-11-47Z-brief.txt core-asterisk-running-2023-11-05T20-11-47Z-full.txt
core-asterisk-running-2023-11-07T12-13-34Z-full.txt core-asterisk-running-2023-11-07T12-13-34Z-brief.txt
core-asterisk-running-2023-11-13T13-51-36Z-brief.txt core-asterisk-running-2023-11-13T13-51-36Z-full.txt
We tried to attach the core dump (
core-asterisk-running-2023-11-07T12-13-34Z
) to gdb (See output log below) and noticed the following:System details:
We are using Asterisk with ARI Nodejs library to create and handle channels/bridges.
Asterisk config files: modules.txt ari.txt codecs.txt extensions.txt sip.txt
Call Flow:
application
.channel#create
endpoint to create a new channel and add it to the bridge.Relevant log output
Asterisk Issue Guidelines