drachtio / drachtio-freeswitch-modules

A collection of open-sourced freeswitch modules that I use in various drachtio applications
MIT License

[mod_google_transcribe]: Freeswitch crash while ending google transcription #17

Open vikash-plivo opened 4 years ago

vikash-plivo commented 4 years ago

The code base is the latest, but the line numbers will not match exactly. gdb backtrace:

(gdb) bt

0 writesDone (this=0x0) at google_glue.cpp:143

1 google_speech_session_cleanup (session=0x7f8758a4c5b8, channelIsClosing=1) at google_glue.cpp:398

2 0x00007f8779581f8c in capture_callback (bug=0x7f86880d1068, user_data=0x7f8688005c70, type=SWITCH_ABC_TYPE_CLOSE) at mod_google_transcribe.c:65

3 0x00007f87811cebbd in switch_core_media_bug_close (bug=0x7f8664432b48, destroy=SWITCH_FALSE) at src/switch_core_media_bug.c:1263

4 0x00007f87811ceeb1 in switch_core_media_bug_remove_all_function (session=0x7f8758a4c5b8, function=0x0) at src/switch_core_media_bug.c:1231

5 0x00007f87811eb1a9 in switch_core_session_hangup_state (session=0x7f8758a4c5b8, force=SWITCH_TRUE) at src/switch_core_state_machine.c:839

6 0x00007f87811ecccd in switch_core_session_run (session=0x7f8758a4c5b8) at src/switch_core_state_machine.c:616

7 0x00007f87811e6fce in switch_core_session_thread (thread=, obj=0x7f8758a4c5b8) at src/switch_core_session.c:1709

8 0x00007f87811e2a1d in switch_core_session_thread_pool_worker (thread=0x7f86873dcdc0, obj=0x80) at src/switch_core_session.c:1772

9 0x00007f87816e28e0 in dummy_worker (opaque=0x7f86873dcdc0) at threadproc/unix/thread.c:151

10 0x00007f87806ad064 in start_thread (arg=0x7f8664433700) at pthread_create.c:309

11 0x00007f877fd8462d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

(gdb) bt full

0 writesDone (this=0x0) at google_glue.cpp:143

No locals.

1 google_speech_session_cleanup (session=0x7f8758a4c5b8, channelIsClosing=1) at google_glue.cpp:398

    cb = 0x7f8688005c70
    streamer = 0x0
    st = SWITCH_STATUS_SUCCESS
    channel = 0x7f8758c23250
    time_elapsed = 0
    var = <optimized out>
    final_voice_duration = 0x0
    __func__ = "google_speech_session_cleanup"
    bug = 0x7f86880d1068
    speech_executions = 0
    voice_duration = <optimized out>

2 0x00007f8779581f8c in capture_callback (bug=0x7f86880d1068, user_data=0x7f8688005c70, type=SWITCH_ABC_TYPE_CLOSE) at mod_google_transcribe.c:65

    session = 0x7f8758a4c5b8

3 0x00007f87811cebbd in switch_core_media_bug_close (bug=0x7f8664432b48, destroy=SWITCH_FALSE) at src/switch_core_media_bug.c:1263

    bp = 0x7f86880d1068
    __func__ = "switch_core_media_bug_close"

4 0x00007f87811ceeb1 in switch_core_media_bug_remove_all_function (session=0x7f8758a4c5b8, function=0x0) at src/switch_core_media_bug.c:1231

    bp = 0x7f86880d1068
    status = 2282557544
    __func__ = "switch_core_media_bug_remove_all_function"

5 0x00007f87811eb1a9 in switch_core_session_hangup_state (session=0x7f8758a4c5b8, force=SWITCH_TRUE) at src/switch_core_state_machine.c:839

    cause = SWITCH_CAUSE_NORMAL_UNSPECIFIED
    cause_q850 = SWITCH_CAUSE_NORMAL_UNSPECIFIED
    proceed = 31
    global_proceed = 31
    midstate = CS_HANGUP
    endpoint_interface = 0x0
    driver_state_handler = 0x7f87790ad080 <sofia_event_handlers>
    hook_var = 0x1 <error: Cannot access memory at address 0x1>
    use_session = 0
    __func__ = "switch_core_session_hangup_state"
    __PRETTY_FUNCTION__ = "switch_core_session_hangup_state"

6 0x00007f87811ecccd in switch_core_session_run (session=0x7f8758a4c5b8) at src/switch_core_state_machine.c:616

    ptr = 0x0
    midstate = CS_HANGUP
    endstate = CS_NEW
    endpoint_interface = 0x0
    driver_state_handler = 0x7f87790ad080 <sofia_event_handlers>
    __PRETTY_FUNCTION__ = "switch_core_session_run"
    __func__ = "switch_core_session_run"

7 0x00007f87811e6fce in switch_core_session_thread (thread=, obj=0x7f8758a4c5b8) at src/switch_core_session.c:1709

    session = 0x7f8758a4c5b8
    event = 0x4c4b40
    event_str = 0x0

    val = <optimized out>
    __func__ = "switch_core_session_thread"
    __PRETTY_FUNCTION__ = "switch_core_session_thread"

8 0x00007f87811e2a1d in switch_core_session_thread_pool_worker (thread=0x7f86873dcdc0, obj=0x80) at src/switch_core_session.c:1772

    td = 0x7f875921a760
    pop = 0x7f875921a760
    check_status = 1495377760
    pool = 0x7f86873dcb68
    __func__ = "switch_core_session_thread_pool_worker"

9 0x00007f87816e28e0 in dummy_worker (opaque=0x7f86873dcdc0) at threadproc/unix/thread.c:151

    thread = 0x7f86873dcdc0

10 0x00007f87806ad064 in start_thread (arg=0x7f8664433700) at pthread_create.c:309

    __res = <optimized out>
    pd = 0x7f8664433700
    now = <optimized out>
    unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140215184471808, 817260727955083411, 0, 140217799025520, 20, 140215184471808, -838875922313260909, -839782135128010605},
          mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
    not_first_call = <optimized out>
    pagesize_m1 = <optimized out>
    sp = <optimized out>
    freesize = <optimized out>
    __PRETTY_FUNCTION__ = "start_thread"

11 0x00007f877fd8462d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

No locals.

davehorton commented 4 years ago

You're right... the line numbers are way off. How is it that you think this is the latest code?

The thing is, you are crashing because the streamer pointer is null (0x0). But in the latest code, at line 325, we check to make sure we have a non-null streamer.

vikash-plivo commented 4 years ago

Yes, the check for a non-null streamer is present in my code, but it seems that writesDone is somehow called twice.
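Frame 0 of the first backtrace shows `writesDone` entered with `this=0x0`, and frame 1 lists `streamer = 0x0`. One way to make a cleanup routine tolerate being reached twice is to take the pointer out of the shared state under a lock before using it. The following is only a minimal sketch of that pattern: `FakeStreamer`, `SessionData`, and `cleanup` are hypothetical stand-ins and not the module's actual types or functions.

```cpp
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for the module's gRPC streamer wrapper.
struct FakeStreamer {
    int writes_done_calls = 0;
    void writesDone() { ++writes_done_calls; }
};

// Hypothetical per-call state shared between the media callback and other threads.
struct SessionData {
    std::mutex mutex;
    std::unique_ptr<FakeStreamer> streamer;
};

// Returns true if this call performed the cleanup, false if nothing was left to do.
bool cleanup(SessionData& data) {
    std::unique_ptr<FakeStreamer> local;
    {
        std::lock_guard<std::mutex> lock(data.mutex);
        local = std::move(data.streamer);  // leaves data.streamer null
    }
    if (!local) {
        return false;  // a previous call already took the streamer
    }
    local->writesDone();
    return true;  // the streamer is destroyed when local goes out of scope
}
```

With this shape, a second call to `cleanup` for the same call state returns `false` instead of dereferencing a null pointer.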

davehorton commented 4 years ago

So it sounds to me like you have made changes to the code base, since the line numbers don't line up. If you can recreate on the latest version of my code, without changes, I'll be happy to troubleshoot further.

vikash-plivo commented 4 years ago

Sure, I will rebuild it with your latest code. I just added some code for supporting timeout. I will raise the PR once I test it.

davehorton commented 4 years ago

It will probably be best if you build freeswitch with symbols and without optimization. Include these on your ./configure command for freeswitch: CPPFLAGS='-g -O0' CXXFLAGS='-g -O0'

vikash-plivo commented 4 years ago

I tested with your latest code. The backtrace follows:

(gdb) bt full

0 0x00007f0033663430 in ?? ()

No symbol table info available.

1 0x00007f0159d18ee1 in grpc_read_thread (thread=, obj=0x7f00800956c0) at google_glue.cpp:239

    __func__ = "grpc_read_thread"
    status = {static OK = @0x7f015f92fa70, static CANCELLED = @0x7f015f92fa50, code_ = grpc::OUT_OF_RANGE,
      error_message_ = "Audio Timeout Error: Long duration elapsed without audio. Audio should be sent close to real time.", binary_error_details_ = ""}
    cb = 0x7f00800956c0
    streamer = 0x7f00800bde30
    response = {<google::protobuf::Message> = {<google::protobuf::MessageLite> = {
          _vptr.MessageLite = 0x7f016250d510 <vtable for google::cloud::speech::v1::StreamingRecognizeResponse+16>}, <No data fields>}, static kIndexInFileMessages = 12,
      static SPEECH_EVENT_UNSPECIFIED = google::cloud::speech::v1::StreamingRecognizeResponse_SpeechEventType_SPEECH_EVENT_UNSPECIFIED,
      static END_OF_SINGLE_UTTERANCE = google::cloud::speech::v1::StreamingRecognizeResponse_SpeechEventType_END_OF_SINGLE_UTTERANCE,
      static SpeechEventType_MIN = google::cloud::speech::v1::StreamingRecognizeResponse_SpeechEventType_SPEECH_EVENT_UNSPECIFIED,
      static SpeechEventType_MAX = google::cloud::speech::v1::StreamingRecognizeResponse_SpeechEventType_END_OF_SINGLE_UTTERANCE, static SpeechEventType_ARRAYSIZE = 2,
      static kResultsFieldNumber = 2, static kErrorFieldNumber = 1, static kSpeechEventTypeFieldNumber = 4,
      _internal_metadata_ = {<google::protobuf::internal::InternalMetadataWithArenaBase<google::protobuf::UnknownFieldSet, google::protobuf::internal::InternalMetadataWithArena>> = {
          ptr_ = 0x0, static kPtrTagMask = <optimized out>, static kPtrValueMask = <optimized out>}, <No data fields>},
      results_ = {<google::protobuf::internal::RepeatedPtrFieldBase> = {static kInitialSize = 0, arena_ = 0x0, current_size_ = 0, total_size_ = 0, static kRepHeaderSize = 8,
          rep_ = 0x0}, <No data fields>}, error_ = 0x7f0124aaf980, speech_event_type_ = 0, _cached_size_ = {size_ = {<std::__atomic_base<int>> = {_M_i = 0}, <No data fields>}}}

2 0x00007f016215f0d0 in dummy_worker (opaque=0x7f0080095c88) at threadproc/unix/thread.c:151

    thread = 0x7f0080095c88

3 0x00007f016114f064 in start_thread (arg=0x7f0047e1f700) at pthread_create.c:309

    __res = <optimized out>
    pd = 0x7f0047e1f700
    now = <optimized out>
    unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139639182718720, 8546960599011921914, 0, 139643147808368, 20, 139639182718720, -8602886141846384646, -8602397245233983494},
          mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
    not_first_call = <optimized out>
    pagesize_m1 = <optimized out>
    sp = <optimized out>
    freesize = <optimized out>
    __PRETTY_FUNCTION__ = "start_thread"

4 0x00007f016082662d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

No locals.

davehorton commented 4 years ago

It sounds like part of recreating is that I need to recreate this error:

"Audio Timeout Error: Long duration elapsed without audio. Audio should be sent close to real time."

How are you doing that? This should only happen if for some reason Freeswitch is not receiving RTP. Is that happening in your test?

vikash-plivo commented 4 years ago

I am introducing a 500 ms delay on the network interface using the tc qdisc command. When I stop the delay, freeswitch crashes.

davehorton commented 4 years ago

Can you please give me the exact command sequence, and the timing with which you are orchestrating it, so I can recreate it?

vikash-plivo commented 4 years ago

I am running around 50 calls with google transcribe on a server and run the following command on that machine:

    tc qdisc add dev eth0 root netem delay 500ms

This command introduces a 500 ms delay. After running for almost 30 minutes, I delete the rule with the following command:

    tc qdisc del dev eth0 root

It crashes around this time; sometimes it crashes within a few seconds as well.

davehorton commented 4 years ago

do you happen to know if it also crashes in the same setup with a single call? Or do I need to generate load?

vikash-plivo commented 4 years ago

it happens with some load

davehorton commented 4 years ago

what would be useful would be to print out some variables from that stack trace, assuming you can load it in gdb:

    info locals
    p cb
    p streamer

vikash-plivo commented 4 years ago

(gdb) frame 1

1 0x00007f0159d18ee1 in grpc_read_thread (thread=, obj=0x7f00800956c0) at google_glue.cpp:239

239 cb->responseHandler(cb->session, "no_audio");

(gdb) info locals

    __func__ = "grpc_read_thread"
    status = {static OK = @0x7f015f92fa70, static CANCELLED = @0x7f015f92fa50, code_ = grpc::OUT_OF_RANGE,
      error_message_ = "Audio Timeout Error: Long duration elapsed without audio. Audio should be sent close to real time.", binary_error_details_ = ""}
    cb = 0x7f00800956c0
    streamer = 0x7f00800bde30
    response = {<google::protobuf::Message> = {<google::protobuf::MessageLite> = {
          _vptr.MessageLite = 0x7f016250d510 <vtable for google::cloud::speech::v1::StreamingRecognizeResponse+16>}, <No data fields>}, static kIndexInFileMessages = 12,
      static SPEECH_EVENT_UNSPECIFIED = google::cloud::speech::v1::StreamingRecognizeResponse_SpeechEventType_SPEECH_EVENT_UNSPECIFIED,
      static END_OF_SINGLE_UTTERANCE = google::cloud::speech::v1::StreamingRecognizeResponse_SpeechEventType_END_OF_SINGLE_UTTERANCE,
      static SpeechEventType_MIN = google::cloud::speech::v1::StreamingRecognizeResponse_SpeechEventType_SPEECH_EVENT_UNSPECIFIED,
      static SpeechEventType_MAX = google::cloud::speech::v1::StreamingRecognizeResponse_SpeechEventType_END_OF_SINGLE_UTTERANCE, static SpeechEventType_ARRAYSIZE = 2,
      static kResultsFieldNumber = 2, static kErrorFieldNumber = 1, static kSpeechEventTypeFieldNumber = 4,
      _internal_metadata_ = {<google::protobuf::internal::InternalMetadataWithArenaBase<google::protobuf::UnknownFieldSet, google::protobuf::internal::InternalMetadataWithArena>> = {
          ptr_ = 0x0, static kPtrTagMask = <optimized out>, static kPtrValueMask = <optimized out>}, <No data fields>},
      results_ = {<google::protobuf::internal::RepeatedPtrFieldBase> = {static kInitialSize = 0, arena_ = 0x0, current_size_ = 0, total_size_ = 0, static kRepHeaderSize = 8,
          rep_ = 0x0}, <No data fields>}, error_ = 0x7f0124aaf980, speech_event_type_ = 0, _cached_size_ = {size_ = {<std::__atomic_base<int>> = {_M_i = 0}, <No data fields>}}}

(gdb) p *cb

    $2 = {mutex = 0x7f0033663430, session = 0x6161646633666434, base = 0x31312d353631302d <error: Cannot access memory at address 0x31312d353631302d>,
      resampler = 0x2d663162382d6165, streamer = 0x3432383830623139, responseHandler = 0x7f0033663430, thread = 0x6161646633666434, end_of_utterance = 909193261}

(gdb) p *streamer
    warning: RTTI symbol not found for class 'std::_Sp_counted_ptr<grpc::Channel*, (__gnu_cxx::_Lock_policy)2>'
    warning: RTTI symbol not found for class 'std::_Sp_counted_ptr<grpc::Channel*, (__gnu_cxx::_Lock_policy)2>'
    warning: RTTI symbol not found for class 'std::_Sp_counted_ptr<grpc::SecureChannelCredentials*, (__gnu_cxx::_Lock_policy)2>'
    warning: RTTI symbol not found for class 'std::_Sp_counted_ptr<grpc::SecureChannelCredentials*, (__gnu_cxx::_Lock_policy)2>'
    warning: RTTI symbol not found for class 'std::_Sp_counted_ptr<grpc::Channel*, (__gnu_cxx::_Lock_policy)2>'
    warning: RTTI symbol not found for class 'std::_Sp_counted_ptr<grpc::Channel*, (__gnu_cxx::_Lock_policy)2>'
    $3 = {m_session = 0x7f0135730618,
      m_context = {initial_metadata_received_ = true, wait_for_ready_ = false, wait_for_ready_explicitly_set_ = false, idempotent_ = false, cacheable_ = false,
        channel_ = std::shared_ptr (count 3, weak 1) 0x7f008009fe30,
        mu_ = {<std::__mutex_base> = {_M_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
              __size = '\000' <repeats 39 times>, __align = 0}}, <No data fields>},
        call_ = 0x7f0080042820, call_canceled_ = false, deadline_ = {tv_sec = 9223372036854775807, tv_nsec = 0, clock_type = GPR_CLOCK_REALTIME}, authority_ = "",
        creds_ = std::shared_ptr (empty) 0x0, auth_context_ = std::shared_ptr (empty) 0x0, census_context_ = 0x0,
        send_initial_metadata_ = std::multimap with 0 elements,
        recv_initial_metadata_ = {filled_ = false, arr_ = {count = 3, capacity = 3, metadata = 0x7f003dc0ae70}, map_ = std::multimap with 0 elements},
        trailing_metadata_ = {filled_ = false, arr_ = {count = 1, capacity = 1, metadata = 0x7f01258d6650}, map_ = std::multimap with 0 elements},
        propagate_from_call_ = 0x0, propagation_options_ = {propagate_ = 65535}, compression_algorithm_ = GRPC_COMPRESS_NONE, initial_metadata_corked_ = false,
        debug_error_string_ = "{\"created\":\"@1573134615.919894267\",\"description\":\"Error received from peer\",\"file\":\"src/core/lib/surface/call.cc\",\"file_line\":1036,\"grpc_message\":\"Audio Timeout Error: Long duration elapsed without au"...,
        rpc_info_ = {ctx_ = 0x7f00800bde38, type_ = grpc::experimental::ClientRpcInfo::BIDI_STREAMING, method_ = 0x7f01621c57a0 "/google.cloud.speech.v1.Speech/StreamingRecognize",
          channel_ = 0x7f008009fe30, interceptors_ = std::vector of length 0, capacity 0, hijacked_ = false, hijacked_interceptor_ = 0}},
      m_creds = std::shared_ptr (count 1, weak 0) 0x7f0080091440,
      m_channel = std::shared_ptr (count 3, weak 1) 0x7f008009fe30,
      m_stub = std::unique_ptr containing 0x7f008000e860,
      m_streamer = std::unique_ptr<grpc::ClientReaderWriterInterface<google::cloud::speech::v1::StreamingRecognizeRequest, google::cloud::speech::v1::StreamingRecognizeResponse>> containing 0x7f00800bb2b0,
      m_request = {<google::protobuf::Message> = {<google::protobuf::MessageLite> = {
            _vptr.MessageLite = 0x7f016250cb10 <vtable for google::cloud::speech::v1::StreamingRecognizeRequest+16>}, <No data fields>}, static kIndexInFileMessages = 2,
        static kStreamingConfigFieldNumber = 1, static kAudioContentFieldNumber = 2,
        _internal_metadata_ = {<google::protobuf::internal::InternalMetadataWithArenaBase<google::protobuf::UnknownFieldSet, google::protobuf::internal::InternalMetadataWithArena>> = {
            ptr_ = 0x0, static kPtrTagMask = <optimized out>, static kPtrValueMask = <optimized out>}, <No data fields>},
        streaming_request_ = {streaming_config_ = 0x7f008009f360, audio_content_ = {ptr_ = 0x7f008009f360}},
        _cached_size_ = {size_ = {<std::__atomic_base<int>> = {_M_i = 70}, <No data fields>}}, _oneof_case_ = {1}},
      m_writesDone = false}

davehorton commented 4 years ago

and also

p *(cb->session)

vikash-plivo commented 4 years ago

(gdb) p *(cb->session)

Cannot access memory at address 0x6161646633666434

vikash-plivo commented 4 years ago

If you want, I can check in the last frame.

davehorton commented 4 years ago

I think I see the problem. That cb object holds a pointer to the freeswitch session, but I think the session has been destroyed or hung up at this point. I have an idea on how to fix it, but will probably need your help to test.
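The `cb` dump above gives a further hint that the struct's memory had been reused. On a little-endian x86_64 host (which the `x86_64/clone.S` path in the backtrace suggests), the "pointer" values read as printable ASCII when taken byte by byte. The sketch below decodes them; `decode_le_ascii` and `decode_cb_dump` are hypothetical helper names, and the constants are copied from the gdb output.

```cpp
#include <cstdint>
#include <string>

// Interpret a 64-bit value as eight little-endian bytes and return the
// leading printable ASCII characters, stopping at the first other byte.
std::string decode_le_ascii(std::uint64_t value) {
    std::string out;
    for (int i = 0; i < 8; ++i) {
        unsigned char byte = static_cast<unsigned char>((value >> (8 * i)) & 0xff);
        if (byte < 0x20 || byte > 0x7e) {
            break;
        }
        out.push_back(static_cast<char>(byte));
    }
    return out;
}

// Decode consecutive fields of the corrupted struct in memory order.
std::string decode_cb_dump() {
    const std::uint64_t fields[] = {
        0x6161646633666434ULL,  // session
        0x31312d353631302dULL,  // base
        0x2d663162382d6165ULL,  // resampler
        0x3432383830623139ULL,  // streamer
        0x00007f0033663430ULL,  // responseHandler
    };
    std::string text;
    for (std::uint64_t field : fields) {
        text += decode_le_ascii(field);
    }
    return text;
}
```

`decode_cb_dump()` returns `4df3fdaa-0165-11ea-8b1f-91b0882404f3`, which has the shape of a UUID string. That is consistent with the struct's memory having been overwritten with unrelated string data, although it does not by itself show who wrote it. The `responseHandler` value, 0x7f0033663430, is also the address at which frame 0 of this backtrace stopped.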

vikash-plivo commented 4 years ago

Yeah, it seems the freeswitch session was destroyed. I will definitely help you test this case.
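The thread does not show the actual fix. As a general illustration of the problem class, a background thread can avoid holding a raw session pointer by storing an identifier and looking the session up each time it needs it. The sketch below is self-contained and entirely hypothetical (`Session`, `SessionRegistry`, and `deliver_event` are invented names); FreeSWITCH offers a comparable lookup by UUID in `switch_core_session_locate`, but whether the fix uses it is not stated here.

```cpp
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical session object; events counts delivered notifications.
struct Session {
    std::string id;
    int events = 0;
};

// Hypothetical registry that owns sessions and hands out shared references.
class SessionRegistry {
public:
    void add(const std::shared_ptr<Session>& session) {
        std::lock_guard<std::mutex> lock(mutex_);
        sessions_[session->id] = session;
    }
    void remove(const std::string& id) {
        std::lock_guard<std::mutex> lock(mutex_);
        sessions_.erase(id);
    }
    // Returns an owning reference, or null if the session is gone.
    std::shared_ptr<Session> locate(const std::string& id) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = sessions_.find(id);
        if (it == sessions_.end()) {
            return nullptr;
        }
        return it->second;
    }
private:
    std::mutex mutex_;
    std::map<std::string, std::shared_ptr<Session>> sessions_;
};

// What a reader thread could do instead of dereferencing a stored pointer.
bool deliver_event(SessionRegistry& registry, const std::string& id) {
    std::shared_ptr<Session> session = registry.locate(id);
    if (!session) {
        return false;  // session already hung up: drop the event
    }
    ++session->events;
    return true;
}
```

After `remove` has run for a given id, `deliver_event` returns `false` rather than touching freed memory.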

davehorton commented 4 years ago

can you retest with the latest commit, which has a fix for this?

vikash-plivo commented 4 years ago

Sure, I will test with the latest change and confirm.

vikash-plivo commented 4 years ago

I tested with the latest codebase, but it now seems to crash at a different location:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/local/freeswitch/bin/freeswitch -nc -rp -core'.
Program terminated with signal SIGABRT, Aborted.

0 0x00007f05a9ac4067 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56

56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.

(gdb) bt

0 0x00007f05a9ac4067 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56

1 0x00007f05a9ac5448 in __GI_abort () at abort.c:89

2 0x00007f05a9b021b4 in __libc_message (do_abort=do_abort@entry=1, fmt=fmt@entry=0x7f05a9bf7210 " Error in `%s': %s: 0x%s \n") at ../sysdeps/posix/libc_fatal.c:175

3 0x00007f05a9b0798e in malloc_printerr (action=1, str=0x7f05a9bf7360 "double free or corruption (out)", ptr=) at malloc.c:4996

4 0x00007f05a9b08696 in _int_free (av=, p=, have_lock=0) at malloc.c:3840

5 0x00007f05a8ec43ad in speex_resampler_destroy () from /usr/lib/x86_64-linux-gnu/libspeexdsp.so.1

6 0x00007f05a306a610 in google_speech_session_cleanup (session=0x7f0581586f88, channelIsClosing=1) at google_glue.cpp:348

7 0x00007f05a3068bcc in capture_callback (bug=0x7f0500054388, user_data=0x7f04ab67afa0, type=SWITCH_ABC_TYPE_CLOSE) at mod_google_transcribe.c:66

8 0x00007f05aaf9c39d in switch_core_media_bug_close (bug=0x7f048c8e3b48, destroy=SWITCH_FALSE) at src/switch_core_media_bug.c:1263

9 0x00007f05aaf9c691 in switch_core_media_bug_remove_all_function (session=0x7f0581586f88, function=0x0) at src/switch_core_media_bug.c:1231

10 0x00007f05aafb8989 in switch_core_session_hangup_state (session=0x7f0581586f88, force=SWITCH_TRUE) at src/switch_core_state_machine.c:839

11 0x00007f05aafba4ad in switch_core_session_run (session=0x7f0581586f88) at src/switch_core_state_machine.c:616

12 0x00007f05aafb47ae in switch_core_session_thread (thread=, obj=0x7f0581586f88) at src/switch_core_session.c:1709

13 0x00007f05aafb01fd in switch_core_session_thread_pool_worker (thread=0x7f05805b7250, obj=0x2f74) at src/switch_core_session.c:1772

14 0x00007f05ab4b00d0 in dummy_worker (opaque=0x7f05805b7250) at threadproc/unix/thread.c:151

15 0x00007f05aa4a0064 in start_thread (arg=0x7f048c8e4700) at pthread_create.c:309

16 0x00007f05a9b7762d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

(gdb) bt full

0 0x00007f05a9ac4067 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56

    resultvar = 0
    pid = 3330
    selftid = 12148

1 0x00007f05a9ac5448 in __GI_abort () at abort.c:89

    save_stage = 2
    act = {__sigaction_handler = {sa_handler = 0x7f05ab4c1716, sa_sigaction = 0x7f05ab4c1716}, sa_mask = {__val = {26, 3, 139657514727088, 139662325600170, 139661615823840, 139658011709728,
          139662325456785, 139662325454614, 139662319971855, 5777399676922519875, 5200946612426664306, 7092453966807917673, 7589691705817854565, 139657975169056, 139658010523872, 139658010523856}},
      sa_flags = 112, sa_restorer = 0x59}
    sigs = {__val = {32, 0 <repeats 15 times>}}

2 0x00007f05a9b021b4 in __libc_message (do_abort=do_abort@entry=1, fmt=fmt@entry=0x7f05a9bf7210 " Error in `%s': %s: 0x%s \n") at ../sysdeps/posix/libc_fatal.c:175

    ap = {{gp_offset = 40, fp_offset = 32516, overflow_arg_area = 0x7f048c8e3980, reg_save_area = 0x7f048c8e3910}}
    fd = 2
    on_2 = <optimized out>
    list = <optimized out>
    nlist = <optimized out>
    cp = <optimized out>
    written = <optimized out>

3 0x00007f05a9b0798e in malloc_printerr (action=1, str=0x7f05a9bf7360 "double free or corruption (out)", ptr=) at malloc.c:4996

    buf = "00007f050000fe10"
    cp = <optimized out>

4 0x00007f05a9b08696 in _int_free (av=, p=, have_lock=0) at malloc.c:3840

    size = <optimized out>
    fb = <optimized out>
    nextchunk = <optimized out>
    nextsize = <optimized out>
    nextinuse = <optimized out>
    prevsize = <optimized out>
    bck = <optimized out>
    fwd = <optimized out>
    errstr = <optimized out>
    locked = <optimized out>
    __func__ = "_int_free"

5 0x00007f05a8ec43ad in speex_resampler_destroy () from /usr/lib/x86_64-linux-gnu/libspeexdsp.so.1

No symbol table info available.

6 0x00007f05a306a610 in google_speech_session_cleanup (session=0x7f0581586f88, channelIsClosing=1) at google_glue.cpp:348

    cb = 0x7f04ab67afa0
    streamer = 0x0
    channel = 0x7f0580647fd0
    bug = 0x7f0500054388
    __func__ = "google_speech_session_cleanup"

7 0x00007f05a3068bcc in capture_callback (bug=0x7f0500054388, user_data=0x7f04ab67afa0, type=SWITCH_ABC_TYPE_CLOSE) at mod_google_transcribe.c:66

    session = 0x7f0581586f88

8 0x00007f05aaf9c39d in switch_core_media_bug_close (bug=0x7f048c8e3b48, destroy=SWITCH_FALSE) at src/switch_core_media_bug.c:1263

    bp = 0x7f0500054388
    __func__ = "switch_core_media_bug_close"

9 0x00007f05aaf9c691 in switch_core_media_bug_remove_all_function (session=0x7f0581586f88, function=0x0) at src/switch_core_media_bug.c:1231

    bp = 0x7f0500054388
    status = 344968

    __func__ = "switch_core_media_bug_remove_all_function"

10 0x00007f05aafb8989 in switch_core_session_hangup_state (session=0x7f0581586f88, force=SWITCH_TRUE) at src/switch_core_state_machine.c:839

    cause = SWITCH_CAUSE_NORMAL_UNSPECIFIED
    cause_q850 = SWITCH_CAUSE_NORMAL_UNSPECIFIED
    proceed = 31
    global_proceed = 31
    midstate = CS_HANGUP
    endpoint_interface = 0x0
    driver_state_handler = 0x7f05a2b94080 <sofia_event_handlers>
    hook_var = 0x1 <error: Cannot access memory at address 0x1>
    use_session = 6
    __func__ = "switch_core_session_hangup_state"
    __PRETTY_FUNCTION__ = "switch_core_session_hangup_state"

11 0x00007f05aafba4ad in switch_core_session_run (session=0x7f0581586f88) at src/switch_core_state_machine.c:616

    ptr = 0x0
    midstate = CS_HANGUP
    endstate = CS_NEW
    endpoint_interface = 0x0
    driver_state_handler = 0x7f05a2b94080 <sofia_event_handlers>
    __PRETTY_FUNCTION__ = "switch_core_session_run"
    __func__ = "switch_core_session_run"

12 0x00007f05aafb47ae in switch_core_session_thread (thread=, obj=0x7f0581586f88) at src/switch_core_session.c:1709

    session = 0x7f0581586f88
    event = 0x4c4b40
    event_str = 0x0
    val = <optimized out>
    __func__ = "switch_core_session_thread"
    __PRETTY_FUNCTION__ = "switch_core_session_thread"

13 0x00007f05aafb01fd in switch_core_session_thread_pool_worker (thread=0x7f05805b7250, obj=0x2f74) at src/switch_core_session.c:1772

    td = 0x7f0580fffaa0
    pop = 0x7f0580fffaa0
    check_status = 2164259488
    pool = 0x7f05805b6ff8
    __func__ = "switch_core_session_thread_pool_worker"

14 0x00007f05ab4b00d0 in dummy_worker (opaque=0x7f05805b7250) at threadproc/unix/thread.c:151

    thread = 0x7f05805b7250

15 0x00007f05aa4a0064 in start_thread (arg=0x7f048c8e4700) at pthread_create.c:309

    __res = <optimized out>
    pd = 0x7f048c8e4700
    now = <optimized out>
    unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139657514731264, -1606563490908950013, 0, 139660461901120, 20, 139657514731264, 1710609299161657859, 1711090301895488003}, mask_was_saved = 0}},
      priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
    not_first_call = <optimized out>
    pagesize_m1 = <optimized out>
    sp = <optimized out>
    freesize = <optimized out>
    __PRETTY_FUNCTION__ = "start_thread"

16 0x00007f05a9b7762d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

davehorton commented 4 years ago

Interesting, I guess I will have to try to recreate it. Can you give me instructions, as specific as possible, on how to recreate your exact load test?
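The last backtrace aborts inside `speex_resampler_destroy` with glibc reporting "double free or corruption (out)", and frame 6 shows `streamer = 0x0`. That message can come from freeing the same handle twice or from other heap corruption, so the cause is not established in this thread. If two code paths can reach the destroy call, one common guard is to swap the handle out atomically so that only one caller ever receives it. A minimal sketch with hypothetical names (`Handle`, `handle_destroy`, `destroy_once`):

```cpp
#include <atomic>

// Hypothetical stand-in for a C-style handle such as a resampler state.
struct Handle {
    int unused;
};

// Counts destroy calls so the behaviour can be checked.
static int g_destroy_calls = 0;

void handle_destroy(Handle* handle) {
    ++g_destroy_calls;
    delete handle;
}

// Atomically take the handle out of the slot; only the caller that receives
// a non-null pointer destroys it.
bool destroy_once(std::atomic<Handle*>& slot) {
    Handle* handle = slot.exchange(nullptr);
    if (handle == nullptr) {
        return false;  // someone else already destroyed it
    }
    handle_destroy(handle);
    return true;
}
```

A second `destroy_once` call on the same slot returns `false` and never reaches `handle_destroy`.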

vikash-plivo commented 4 years ago

I am running around 50 calls with google transcribe on a server and run the following command on that machine:

    tc qdisc add dev eth0 root netem delay 500ms

This command introduces a 500 ms delay. After running for almost 30 minutes, I delete the rule with the following command:

    tc qdisc del dev eth0 root

It crashes around this time; sometimes it crashes within a few seconds as well.

davehorton commented 4 years ago

I tried running 50 calls with those commands and so far have been unable to recreate. Just wondering, are you using PCMU or some other codec ?

vikash-plivo commented 4 years ago

I am using PCMU.