AllStarLink / app_rpt

Refactoring and upgrade of AllStarLink's app_rpt, etc.
8 stars 6 forks source link

core restart now #59

Closed tsawyer closed 1 year ago

tsawyer commented 1 year ago

core restart now doesn't always work. There is a CLI prompt but no response to any commands. This seems to happen mostly on the ASL3 server after it runs for a while. This server is the only one with echolink.

tsawyer commented 1 year ago

Asterisk CLI rpt fun 25338 *33009999 an echolink connect causes this issue. This might be fixed with the echolink fix.

InterLinked1 commented 1 year ago

Without looking at the issue at all, this is probably due to either a deadlock or channels not being able to be terminated / shutdown. core restart now will wait for all channels to get hung up and destroyed. If this doesn't happen, it will block a restart / shutdown forever.

This is still definitely a major bug but likely with one of the channel drivers.

InterLinked1 commented 1 year ago

This is indeed confirmed to all be because of deadlocks, possibly in Echolink:

[2023-01-14 08:57:26.912] WARNING[787109]: channel.c:1084 __ast_queue_frame: Exceptionally long voice queue length queuing to voter/25338
ASL3*CLI> core show locks

=======================================================================
=== 20.0.1
=== Currently Held Locks
=======================================================================
===
=== <pending> <lock#> (<file>): <lock type> <line num> <function> <lock name> <lock addr> (times locked)
===
=== Thread ID: 0x7fa4fa0cf700 LWP:787077 (do_devstate_changes  started at [  621] devicestate.c ast_device_state_engine_init())
=== ---> Lock #0 (astobj2_container.c): MUTEX 310 internal_ao2_traverse self 0x564a6364b640 (1)
        asterisk __ast_pthread_mutex_lock()
        asterisk __ao2_lock()
        asterisk <unknown>()
        asterisk __ao2_callback_data()
        asterisk ast_channel_callback()
        asterisk ast_channel_get_by_name_prefix()
        asterisk ast_parse_device_state()
        asterisk <unknown>()
        asterisk <unknown>()
        asterisk <unknown>()
        asterisk <unknown>()
        [0x7fa4fcba8ea7] libpthread.so.0 :0 __libpthread_freeres()
        [0x7fa4fc7c3a2f] libc.so.6 :0 clone()
=== ---> Waiting for Lock #1 (channel.c): MUTEX 1290 ast_channel_by_name_cb chan 0x7fa4c01148c0 (1)
        asterisk __ast_pthread_mutex_lock()
        asterisk __ao2_lock()
        asterisk <unknown>()
        asterisk <unknown>()
        asterisk __ao2_callback_data()
        asterisk ast_channel_callback()
        asterisk ast_channel_get_by_name_prefix()
        asterisk ast_parse_device_state()
        asterisk <unknown>()
        asterisk <unknown>()
        asterisk <unknown>()
        asterisk <unknown>()
        [0x7fa4fcba8ea7] libpthread.so.0 :0 __libpthread_freeres()
        [0x7fa4fc7c3a2f] libc.so.6 :0 clone()
=== --- ---> Locked Here: channel.c line 920 (__ast_channel_alloc_ap)
=== -------------------------------------------------------------------
===
=== Thread ID: 0x7fa4bba15700 LWP:787089 (do_monitor           started at [12025] chan_dahdi.c restart_monitor())
=== ---> Waiting for Lock #0 (chan_dahdi.c): MUTEX 11816 do_monitor &iflock 0x7fa4f862b780 (1)
        asterisk __ast_pthread_mutex_lock()
        [0x7fa4f85cbcda] chan_dahdi.so chan_dahdi.c:11817 do_monitor()
        asterisk <unknown>()
        [0x7fa4fcba8ea7] libpthread.so.0 :0 __libpthread_freeres()
        [0x7fa4fc7c3a2f] libc.so.6 :0 clone()
=== --- ---> Locked Here: chan_dahdi.c line 13757 (dahdi_request)
=== -------------------------------------------------------------------
===
=== Thread ID: 0x7fa4bb1b4700 LWP:787106 (el_reader            started at [ 3023] chan_echolink.c store_config())
=== ---> Lock #0 (chan_echolink.c): MUTEX 2564 el_reader &instp->lock 0x564a6371d7f0 (1)
        asterisk __ast_pthread_mutex_lock()
        [0x7fa4f850b9d0] chan_echolink.so chan_echolink.c:2565 el_reader()
        asterisk <unknown>()
        [0x7fa4fcba8ea7] libpthread.so.0 :0 __libpthread_freeres()
        [0x7fa4fc7c3a2f] libc.so.6 :0 clone()
=== ---> Lock #1 (channel.c): MUTEX 920 __ast_channel_alloc_ap tmp 0x7fa4e800dc60 (1)
        asterisk __ast_pthread_mutex_lock()
        asterisk __ao2_lock()
        asterisk <unknown>()
        asterisk __ast_channel_alloc()
        [0x7fa4f8508238] chan_echolink.so chan_echolink.c:1668 el_new()
        [0x7fa4f850aac8] chan_echolink.so chan_echolink.c:2400 do_new_call()
        [0x7fa4f850bf82] chan_echolink.so chan_echolink.c:2638 el_reader()
        asterisk <unknown>()
        [0x7fa4fcba8ea7] libpthread.so.0 :0 __libpthread_freeres()
        [0x7fa4fc7c3a2f] libc.so.6 :0 clone()
=== ---> Waiting for Lock #2 (channel.c): MUTEX 922 __ast_channel_alloc_ap channels 0x564a6364b640 (1)
        asterisk __ast_pthread_mutex_lock()
        asterisk __ao2_lock()
        asterisk __ao2_iterator_next()
        asterisk ast_channel_iterator_next()
        asterisk <unknown>()
        asterisk ast_cli_command_full()
        asterisk ast_cli_command_multiple_full()
        asterisk <unknown>()
        asterisk <unknown>()
        [0x7fa4fcba8ea7] libpthread.so.0 :0 __libpthread_freeres()
        [0x7fa4fc7c3a2f] libc.so.6 :0 clone()
=== --- ---> Locked Here: astobj2_container.c line 310 (internal_ao2_traverse)
=== -------------------------------------------------------------------
===
=== Thread ID: 0x7fa4ba742700 LWP:787112 (rpt                  started at [ 5895] app_rpt.c rpt_master())
=== ---> Lock #0 (channel.c): MUTEX 920 __ast_channel_alloc_ap tmp 0x7fa4c01148c0 (1)
        asterisk __ast_pthread_mutex_lock()
        asterisk __ao2_lock()
        asterisk <unknown>()
        asterisk __ast_channel_alloc()
        [0x7fa4f8508238] chan_echolink.so chan_echolink.c:1668 el_new()
        [0x7fa4f8508613] chan_echolink.so chan_echolink.c:1725 el_request()
        asterisk <unknown>()
        asterisk ast_request()
        [0x7fa4f91a1f75] app_rpt.so rpt_link.c:664 connect_link()
        [0x7fa4f9199281] app_rpt.so rpt_functions.c:137 function_ilink()
        [0x7fa4f9169c43] app_rpt.so app_rpt.c:1493 collect_function_digits()
        [0x7fa4f916e7e3] app_rpt.so app_rpt.c:2601 local_dtmf_helper()
        [0x7fa4f9179026] app_rpt.so app_rpt.c:4398 rpt()
        asterisk <unknown>()
        [0x7fa4fcba8ea7] libpthread.so.0 :0 __libpthread_freeres()
        [0x7fa4fc7c3a2f] libc.so.6 :0 clone()
=== ---> Lock #1 (chan_dahdi.c): MUTEX 13757 dahdi_request &iflock 0x7fa4f862b780 (1)
        asterisk __ast_pthread_mutex_lock()
        [0x7fa4f85d2acd] chan_dahdi.so chan_dahdi.c:13758 dahdi_request()
        asterisk <unknown>()
        asterisk ast_request()
        [0x7fa4f91a219b] app_rpt.so rpt_link.c:688 connect_link()
        [0x7fa4f9199281] app_rpt.so rpt_functions.c:137 function_ilink()
        [0x7fa4f9169c43] app_rpt.so app_rpt.c:1493 collect_function_digits()
        [0x7fa4f916e7e3] app_rpt.so app_rpt.c:2601 local_dtmf_helper()
        [0x7fa4f9179026] app_rpt.so app_rpt.c:4398 rpt()
        asterisk <unknown>()
        [0x7fa4fcba8ea7] libpthread.so.0 :0 __libpthread_freeres()
        [0x7fa4fc7c3a2f] libc.so.6 :0 clone()
=== ---> Lock #2 (channel.c): MUTEX 920 __ast_channel_alloc_ap tmp 0x7fa4c012ffd0 (1)
        asterisk __ast_pthread_mutex_lock()
        asterisk __ao2_lock()
        asterisk <unknown>()
        asterisk __ast_channel_alloc()
        [0x7fa4f85c8314] chan_dahdi.so chan_dahdi.c:9408 dahdi_new()
        [0x7fa4f85d2e89] chan_dahdi.so chan_dahdi.c:13847 dahdi_request()
        asterisk <unknown>()
        asterisk ast_request()
        [0x7fa4f91a219b] app_rpt.so rpt_link.c:688 connect_link()
        [0x7fa4f9199281] app_rpt.so rpt_functions.c:137 function_ilink()
        [0x7fa4f9169c43] app_rpt.so app_rpt.c:1493 collect_function_digits()
        [0x7fa4f916e7e3] app_rpt.so app_rpt.c:2601 local_dtmf_helper()
        [0x7fa4f9179026] app_rpt.so app_rpt.c:4398 rpt()
        asterisk <unknown>()
        [0x7fa4fcba8ea7] libpthread.so.0 :0 __libpthread_freeres()
        [0x7fa4fc7c3a2f] libc.so.6 :0 clone()
=== ---> Waiting for Lock #3 (channel.c): MUTEX 922 __ast_channel_alloc_ap channels 0x564a6364b640 (1)
        asterisk __ast_pthread_mutex_lock()
        asterisk __ao2_lock()
        asterisk __ao2_iterator_next()
        asterisk ast_channel_iterator_next()
        asterisk <unknown>()
        asterisk ast_cli_command_full()
        asterisk ast_cli_command_multiple_full()
        asterisk <unknown>()
        asterisk <unknown>()
        [0x7fa4fcba8ea7] libpthread.so.0 :0 __libpthread_freeres()
        [0x7fa4fc7c3a2f] libc.so.6 :0 clone()
=== --- ---> Locked Here: astobj2_container.c line 310 (internal_ao2_traverse)
=== -------------------------------------------------------------------
===
=== Thread ID: 0x7fa4fa62e700 LWP:791474 (netconsole           started at [ 1569] asterisk.c listener())
=== ---> Waiting for Lock #0 (astobj2_container.c): MUTEX 589 __ao2_iterator_next iter->c 0x564a6364b640 (1)
        asterisk __ast_pthread_mutex_lock()
        asterisk __ao2_lock()
        asterisk __ao2_iterator_next()
        asterisk ast_channel_iterator_next()
        asterisk <unknown>()
        asterisk ast_cli_command_full()
        asterisk ast_cli_command_multiple_full()
        asterisk <unknown>()
        asterisk <unknown>()
        [0x7fa4fcba8ea7] libpthread.so.0 :0 __libpthread_freeres()
        [0x7fa4fc7c3a2f] libc.so.6 :0 clone()
=== --- ---> Locked Here: astobj2_container.c line 310 (internal_ao2_traverse)
=== -------------------------------------------------------------------
===
=======================================================================