Closed MayamaTakeshi closed 6 months ago
Of more than a thousand test executions, only the ones for mrcp_and_audio.js (which is the only one that writes/reads TCP data) got crashes:
takeshi@takeshi-desktop:~/src/git/MayamaTakeshi/sip-lab$ cat res.txt | grep 0 | wc -l
1175
takeshi@takeshi-desktop:~/src/git/MayamaTakeshi/sip-lab$ cat res.txt | grep 0 -v | wc -l
20
takeshi@takeshi-desktop:~/src/git/MayamaTakeshi/sip-lab$ cat res.txt | grep -v 0 | sort
samples/mrcp_and_audio.js: 134
samples/mrcp_and_audio.js: 134
samples/mrcp_and_audio.js: 134
samples/mrcp_and_audio.js: 134
samples/mrcp_and_audio.js: 134
samples/mrcp_and_audio.js: 139
samples/mrcp_and_audio.js: 139
samples/mrcp_and_audio.js: 139
samples/options.js: 1
samples/options.js: 1
samples/options.js: 1
samples/options.js: 1
samples/options.js: 1
samples/register_subscribe.js: 1
samples/register_subscribe.js: 1
samples/register_subscribe.js: 1
samples/register_subscribe.js: 1
samples/register_subscribe.js: 1
samples/register_subscribe.js: 1
samples/two_audio_media.js: 1
So indeed there is an implementation bug.
Testing client and sides separately:
Testing speech synth client against mrcp_server get crashes:
07:42:22.671
07:42:22.671 wait (b.js:127) got event:
07:42:22.671 {
event: 'mrcp_msg',
call_id: 0,
msg: 'MRCP/2.0 142 SPEAK-COMPLETE 1 COMPLETE\r\n' +
'channel-identifier: dc7318a7-d51a-46f0-af98-aa893c351428@speechsynth\r\n' +
'Completion-Cause: 000 normal\r\n' +
'\r\n'
}
07:42:22.671
07:42:22.671 Trying match against expected_events[0]:
07:42:22.672 partial_match({
event: 'mrcp_msg',
call_id: 0,
msg: mrcp_msg({
type: 'event',
version: '2.0',
request_id: 1,
event_name: 'SPEAK-COMPLETE',
request_state: 'COMPLETE'
})
})
malloc_consolidate(): invalid chunk size
./r2.sh: line 10: 13312 Aborted (core dumped) node b.js
Testing speech synth server with calls generated by mrcp_client also get crashes:
channel-identifier: 32AECB234338@speechsynth
speech-language: dtmf
content-type: text/plain
content-length: 16
1234567890abcd*#
on_data_read msg_size=155
on_data_read msg={"event": "mrcp_msg", "call_id": 10}
MRCP/2.0 155 SPEAK 1
channel-identifier: 32AECB234338@speechsynth
speech-language: dtmf
content-type: text/plain
content-length: 16
1234567890abcd*#
dispach_event called with evt={"event": "mrcp_msg", "call_id": 10}
MRCP/2.0 155 SPEAK 1
channel-identifier: 32AECB234338@speechsynth
speech-language: dtmf
content-type: text/plain
content-length: 16
1234567890abcd*#
09:22:18.639
09:22:18.639 wait (c.js:101) got event:
09:22:18.639 {
event: 'mrcp_msg',
call_id: 10,
msg: 'MRCP/2.0 155 SPEAK 1\r\n' +
'channel-identifier: 32AECB234338@speechsynth\r\n' +
'speech-language: dtmf\r\n' +
'content-type: text/plain\r\n' +
'content-length: 16\r\n' +
'\r\n' +
'1234567890abcd*#'
}
09:22:18.640
09:22:18.640 Trying match against expected_events[0]:
Segmentation fault (core dumped)
We are seeing crashes also in node when running the mrcp test:
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f41de4a547e in __GI___libc_free (mem=0x400000000000) at ./malloc/malloc.c:3368
3368 ./malloc/malloc.c: No such file or directory.
[Current thread is 1 (Thread 0x7f41dea497c0 (LWP 53432))]
(gdb) bt
#0 0x00007f41de4a547e in __GI___libc_free (mem=0x400000000000) at ./malloc/malloc.c:3368
#1 0x00007f41deaa9aae in free (ptr=<optimized out>) at ../include/rtld-malloc.h:50
#2 __GI__dl_deallocate_tls (tcb=tcb@entry=0x7f41d5832640, dealloc_tcb=dealloc_tcb@entry=false) at ../elf/dl-tls.c:646
#3 0x00007f41de4914c5 in __nptl_free_stacks (limit=41943040) at ./nptl/nptl-stack.c:78
#4 0x00007f41de491625 in queue_stack (stack=0x7f41dcbfc640) at ./nptl/nptl-stack.c:106
#5 __GI___nptl_deallocate_stack (pd=pd@entry=0x7f41dcbfc640) at ./nptl/nptl-stack.c:123
#6 0x00007f41de4917fd in __GI___nptl_free_tcb (pd=0x7f41dcbfc640) at ./nptl/nptl_free_tcb.c:41
#7 0x00007f41de49661c in __pthread_clockjoin_ex (threadid=139920853157440, thread_return=thread_return@entry=0x0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, block=block@entry=true)
at ./nptl/pthread_join_common.c:128
#8 0x00007f41de496543 in ___pthread_join (threadid=<optimized out>, thread_return=thread_return@entry=0x0) at ./nptl/pthread_join.c:24
#9 0x000000000165b25e in uv_thread_join (tid=<optimized out>) at ../deps/uv/src/unix/thread.c:273
#10 0x0000000000bd8b49 in node::WorkerThreadsTaskRunner::Shutdown() ()
#11 0x0000000000bd8ef8 in node::NodePlatform::Shutdown() ()
#12 0x0000000000ab196a in node::DefaultProcessExitHandler(node::Environment*, int) ()
#13 0x0000000000af1733 in node::Environment::Exit(int) ()
#14 0x0000000000da5fa0 in v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) ()
#15 0x0000000000da74df in v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) ()
#16 0x00000000016e9af9 in Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_BuiltinExit ()
#17 0x000000000166da50 in Builtins_InterpreterEntryTrampoline ()
#18 0x0000060a671415b9 in ?? ()
#19 0x000020282f1860d9 in ?? ()
#20 0x0000000600000000 in ?? ()
#21 0x0000060a67141689 in ?? ()
#22 0x000020282f1827d9 in ?? ()
#23 0x0000000000000000 in ?? ()
(gdb)
But this cannot be reported as they say (in issue-34174):
If find node_modules/ -name \*.node finds any files, chances are good the bug stems from one of those. That first stack trace is most likely caused by an earlier memory corruption and native modules are usually the source of those.
I tried to use send_key as part of MrcpEndpoint struct and allocate memory for the data from g_pool to make sure it doesn't happen because they are release too soon. But the crashes continue to happen.
I also tried a 1 second delay after answer before sending the first MRCP message but the crash persists.
We should isolate the activesock code in a main.c file and see this reproducible without node.