Open efendizadeh opened 8 years ago
The ZRTP intercal sync makes sure that the state engine handle only one packet at a time for a stream. Stream means: one RTP in/out data flow, for example s end/receive audio.
Looking at the stack trace I see that bot threads perform a "zrtp_startZrtpEngine(..). Why this? Only one thread should start a ZRTP stack for one stream . If ZRTP should work for another stream, for example adding Video to the call, the you need to have a second ZRTP instance.
Werner
Am 31.03.2016 um 14:59 schrieb Jamil:
First of all thank you for great work. We are using pjsip in connection with ZRTP4PJ+ZRTPCPP for delivering secure calls. In development we faced with deadlock problem. After some investigation we found following situation:
As you can see from attached logs (start from thread 20). Pjsip initiates sending of rtp and passes it to zrtp transport over transport_send_rtp function for further processing. Via zrtp4pj API zrtpcpp processEvent(event_t *ev) method (zrtp/ZrtpStateClass.cpp Line 81). In processEvent method synchEnter is called, which locks zrtp mutex (let’s call it mutex_1). After that we returning back to pjsip transport_send_rtp where we have internal pjsip mutex lock (let’s call it mutex_2). At the same in thread 15, Pjsip, before calling srtp_rtp_cb, locks mutex_2, then process goes to start zrtp transport and we come in processEvent method's synchEnter call, where mutex_1 is already locked. Sample diagram attached.
After some investigation we decided that, internal zrtp synchronisation is not needed any more, pjsip is responsible for packet processing synchronisation now. We removed synchEnter and synchLeave and have tested most cases. At this time flight is success :) And the question is that, is this approach is correct?
Thread 15 2 pthread_mutex_lock () 3 pj_mutex_lock at /src/pj/os_core_unix.c:1265 4 zrtp_synchEnter at /zsrtp/transport_zrtp.c:768 5 ZrtpCallbackWrapper::synchEnter() at /zsrtp/zrtp/zrtp/ZrtpCallbackWrapper.cpp:121 6 ZRtp::synchEnter() at /zsrtp/zrtp/zrtp/ZRtp.cpp:2568 7 ZrtpStateClass::processEvent(Event*) at /zsrtp/zrtp/zrtp/ZrtpStateClass.cpp:86 8 ZRtp::startZrtpEngine() at /zsrtp/zrtp/zrtp/ZRtp.cpp:253 9 ::zrtp_startZrtpEngine(ZrtpContext *) at /zsrtp/zrtp/zrtp/ZrtpCWrapper.cpp:119 10 pjmedia_transport_zrtp_startZrtp at /zsrtp/transport_zrtp.c:859 11 transport_rtp_cb at /zsrtp/transport_zrtp.c:987 12 srtp_rtp_cb at /src/pjmedia/transport_srtp.c:971 13 ice_on_rx_data at /src/pjmedia/transport_ice.c:1779 14 ice_rx_data at /src/pjnath/ice_strans.c:1467 15 pj_ice_sess_on_rx_pkt at /src/pjnath/ice_session.c:3020 16 stun_on_rx_data at /src/pjnath/ice_strans.c:1508 17 on_data_recvfrom at /src/pjnath/stun_sock.c:957 18 ioqueue_on_read_complete at /src/pj/activesock.c:498 19 ioqueue_dispatch_read_event at /src/pj/ioqueue_common_abs.c:605 20 pj_ioqueue_poll at /src/pj/ioqueue_select.c:981 21 pjsip_endpt_handle_events2 at /src/pjsip/sip_endpoint.c:741 22 pjsua_handle_events at /src/pjsua-lib/pjsua_core.c:1858 23 worker_thread at /src/pjsua-lib/pjsua_core.c:696 24 thread_main at /src/pj/os_core_unix.c:541 25 _pthread_body () 26 _pthread_start () 27 thread_start () Thread 20 2 _pthread_mutex_lock_slow () 3 pj_mutex_lock at /src/pj/os_core_unix.c:1265 4 pj_lock_acquire at /src/pj/lock.c:180 5 grp_lock_acquire at /src/pj/lock.c:290 6 pj_grp_lock_acquire at /src/pj/lock.c:478 7 pj_ice_strans_sendto at /src/pjnath/ice_strans.c:1247 8 transport_send_rtp at /src/pjmedia/transport_ice.c:1655 9 pjmedia_transport_send_rtp at /include/pjmedia/transport.h:678 10 transport_send_rtp at /src/pjmedia/transport_srtp.c:853 11 pjmedia_transport_send_rtp at /include/pjmedia/transport.h:678 12 zrtp_sendDataZRTP at /zsrtp/transport_zrtp.c:461 13 ZrtpCallbackWrapper::sendDataZRTP(unsigned char const/, int) at /zsrtp/zrtp/zrtp/ZrtpCallbackWrapper.cpp:37 14 ZRtp::sendPacketZRTP(ZrtpPacketBase/) at /zsrtp/zrtp/zrtp/ZRtp.cpp:2577 15 ZrtpStateClass::evInitial() at /zsrtp/zrtp/zrtp/ZrtpStateClass.cpp:166 16 ZrtpStates::processEvent(ZrtpStateClass&) at /zsrtp/zrtp/zrtp/libzrtpcpp/ZrtpStates.h:68 17 ZrtpStateClass::processEvent(Event/) at /zsrtp/zrtp/zrtp/ZrtpStateClass.cpp:151 18 ZRtp::startZrtpEngine() at /zsrtp/zrtp/zrtp/ZRtp.cpp:253 19 ::zrtp_startZrtpEngine(ZrtpContext *) at /zsrtp/zrtp/zrtp/ZrtpCWrapper.cpp:119 20 pjmedia_transport_zrtp_startZrtp at /zsrtp/transport_zrtp.c:859 21 transport_send_rtp at /zsrtp/transport_zrtp.c:1154 22 pjmedia_transport_send_rtp at /include/pjmedia/transport.h:678 23 put_frame_imp at /src/pjmedia/stream.c:1382 24 put_frame at /src/pjmedia/stream.c:1511 25 pjmedia_port_put_frame at /src/pjmedia/port.c:115 26 write_port at /src/pjmedia/conference.c:1760 27 get_frame at /src/pjmedia/conference.c:1998 28 pjmedia_port_get_frame at /src/pjmedia/port.c:98 29 resample_get_frame at /src/pjmedia/resample_port.c:182 30 pjmedia_port_get_frame at /src/pjmedia/port.c:98 31 play_cb at /src/pjmedia/sound_port.c:88 32 output_renderer at /src/pjmedia-audiodev/coreaudio_dev.m:1073 33 AUInputElement::PullInput(unsigned long&, AudioTimeStamp const&, unsigned long, unsigned long) () 34 AUInputFormatConverter2::InputProc(OpaqueAudioConverter/, unsigned long/, AudioBufferList/, AudioStreamPacketDescription*/, void/) () 35 AudioConverterChain::CallInputProc(unsigned long) () 36 AudioConverterChain::FillBufferFromInputProc(unsigned long/, CABufferList/) () 37 BufferedAudioConverter::GetInputBytes(unsigned long, unsigned long&, CABufferList const/&) () 38 CBRConverter::RenderOutput(CABufferList/, unsigned long, unsigned long&, AudioStreamPacketDescription/) () 39 BufferedAudioConverter::FillBuffer(unsigned long&, AudioBufferList&, AudioStreamPacketDescription/) () 40 AudioConverterChain::RenderOutput(CABufferList/, unsigned long, unsigned long&, AudioStreamPacketDescription/) () 41 BufferedAudioConverter::FillBuffer(unsigned long&, AudioBufferList&, AudioStreamPacketDescription/) () 42 AudioConverterFillComplexBuffer () 43 AUInputFormatConverter2::PullAndConvertInput(AudioTimeStamp const&, unsigned long&, AudioBufferList&, AudioStreamPacketDescription/, bool&) () 44 AUConverterBase::RenderBus(unsigned long&, AudioTimeStamp const&, unsigned long, unsigned long) () 45 AUBase::DoRenderBus(unsigned long&, AudioTimeStamp const&, unsigned long, AUOutputElement/, unsigned long, AudioBufferList&) () 46 AUBase::DoRender(unsigned long&, AudioTimeStamp const&, unsigned long, unsigned long, AudioBufferList&) () 47 AURemoteIO::PerformIO(unsigned long, unsigned int, unsigned int, AudioTimeStamp const&, AudioTimeStamp const&, AudioBufferList const/, AudioBufferList/, int&) () 48 AURIOCallbackReceiver_PerformIO () 49 _XPerformIO () 50 mshMIGPerform () 51 MSHMIGDispatchMessage () 52 AURemoteIO::IOThread::Run() () 53 AURemoteIO::IOThread::Entry(void/) () 54 CAPThread::Entry(CAPThread*) () 55 _pthread_body () 56 _pthread_start () 57 thread_start ()
thread diagram https://cloud.githubusercontent.com/assets/7306284/14176319/0bf35ffa-f759-11e5-9bf6-98ba92ac93ae.png
— You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub https://github.com/wernerd/ZRTPCPP/issues/31
Werner Dittmann email: Werner.Dittmann@t-online.de cell: +49 173 44 37 659 PGP key: 82EF5E8B
Situation seems like following: We are not manipulating directly pjsip threads. In this situation on outgoing call transport_send_rtp and transport_rtp_cb functions calls by pjsip in most case on the same time by different threads. In connection of this pjmedia_transport_zrtp_startZrtp in ZRTP4PJ (transport_zrtp.c) calls more than one time at the same time. It happens regarding to zrtp->started not covered by sync primitives (mutex).
For solving our problem we are thinking about possible solution to add synchronisation to zrtp->started checking and setting calls. As an architect and developer of this great software we need your advise.
Thanks in advance.
@efendizadeh did you get a solution for this? I'm seeing the same deadlock.
@aylesm see attached patch deadlock_patch.txt
@efendizadeh thanks. Looks like synchEnter/Leave are only called in that function. So commenting out the callback impl has the same effect as your patch.
First of all thank you for great work. We are using pjsip in connection with ZRTP4PJ+ZRTPCPP for delivering secure calls. In development we faced with deadlock problem. After some investigation we found following situation:
As you can see from attached logs (start from thread 20). Pjsip initiates sending of rtp and passes it to zrtp transport over transport_send_rtp function for further processing. Via zrtp4pj API zrtpcpp processEvent(event_t *ev) method (zrtp/ZrtpStateClass.cpp Line 81). In processEvent method synchEnter is called, which locks zrtp mutex (let’s call it mutex_1). After that we returning back to pjsip transport_send_rtp where we have internal pjsip mutex lock (let’s call it mutex_2). At the same in thread 15, Pjsip, before calling srtp_rtp_cb, locks mutex_2, then process goes to start zrtp transport and we come in processEvent method's synchEnter call, where mutex_1 is already locked. Sample diagram attached.
After some investigation we decided that, internal zrtp synchronisation is not needed any more, pjsip is responsible for packet processing synchronisation now. We removed synchEnter and synchLeave and have tested most cases. At this time flight is success :) And the question is that, is this approach is correct?