Open ivanoff946 opened 5 years ago
@ivanoff946, please print the followings in gdb, frame 0 :
p *hdr
p ((contact_body_t *)hdr->parsed)->contacts
@bogdan-iancu Sorry, i loose my core file! Soon as possible i attach new trace file. Once again, I'm sorry.
Hello @bogdan-iancu ! I attach new trace here. It is the same problem. trace3.txt
Thank you!
p hdr p ((contact_body_t )hdr->parsed)->contacts
for trace3.txt:
(gdb) f 0
#0 0x00007f4034612931 in fake_req (faked_req=0x7f403483b580 <faked_req.12035>, uac=0x0, with_dst=0, uas=0x7f403dfacfd0, shm_msg=0x7f403e0bc3b0) at t_msgbuilder.h:238
238
(gdb) p *hdr
$1 = {type = HDR_CONTACT_T, name = {
s = 0x7f403e0bccb4 "Contact: <sip:79135951595@1.1.1.11;did=173.6ef77b04>\r\nTo: <sip:user2@400122741.domain.ru>\r\nFrom: \"79135951595\"<sip:79135951595@domain.ru>;tag=a520d934\r\nCall-ID: rQjCm3WUdeSudVPqjoJkAQ..\r\nCS"..., len = 7}, body = {
s = 0x7f403e0bccbd "<sip:79135951595@1.1.1.11;did=173.6ef77b04>\r\nTo: <sip:user2@400122741.domain.ru>\r\nFrom: \"79135951595\"<sip:79135951595@domain.ru>;tag=a520d934\r\nCall-ID: rQjCm3WUdeSudVPqjoJkAQ..\r\nCSeq: 1 INV"..., len = 46}, len = 57, parsed = 0x0, next = 0x7f403e0bd230, sibling = 0x0}
(gdb) p ((contact_body_t *)hdr->parsed)->contacts
Cannot access memory at address 0x8
(gdb)
@ivanoff946 , according to the traces, the crash happens in a notification route (via event based routing), when injecting some new branches - are you doing a notify_on_event("E_UL_AOR_INSERT"..)
? or ?
Yes, i have this code:
if ((is_from_gw("9"))||($si == "OPENSIPS_HOST")) {
xlog("L_INFO", "Call from main opensips");
$var(search_user) = $tU;
$var(search_domain) = $td;
$var(search_type) = "dst";
route(get_user);
route(lookup);
route(rabbitmq);
t_newtran();
t_wait_for_new_branches();
$avp(filter) = "aor=" + $tU + "@" + $td;
notify_on_event("E_UL_CONTACT_INSERT","$avp(filter)","INSERT_CT","120");
route(RELAY);
Hello @bogdan-iancu! I think i understood the meaning of the problem. I use this route with notify_on_event (from above code):
route[INSERT_CT] {
xlog("L_INFO", "[INSERT_CT] New contact [$avp(uri)] registered for user [$avp(aor)], inserting new branch to ongoing call\r\n");
usleep("1000");
t_inject_branches("event");
}
Opensips crashes, when one worker calls t_inject_branches("event"); repeatedly for same "Contact". For example, logs:
Dec 18 09:40:41 opensips /usr/sbin/opensips[4196]: [INSERT_CT] New contact [sip:v0011@1.1.188.12:38909;ob] registered for user [v0011@example.domain.com], inserting new branch to ongoing call
Dec 18 09:40:41 opensips /usr/sbin/opensips[4201]: [INSERT_CT] New contact [sip:v0011@1.1.188.12:38909;ob] registered for user [v0011@example.domain.com], inserting new branch to ongoing call
Dec 18 09:40:41 opensips /usr/sbin/opensips[4196]: [INSERT_CT] New contact [sip:v0011@1.1.188.12:38909;ob] registered for user [v0011@example.domain.com], inserting new branch to ongoing call
Dec 18 09:40:41 opensips /usr/sbin/opensips[4201]: [INSERT_CT] New contact [sip:v0011@1.1.188.12:38909;ob] registered for user [v0011@example.domain.com], inserting new branch to ongoing call
Dec 18 09:40:41 opensips /usr/sbin/opensips[4196]: [INSERT_CT] New contact [sip:v0011@1.1.188.12:38909;ob] registered for user [v0011@example.domain.com], inserting new branch to ongoing call
Dec 18 09:40:41 opensips /usr/sbin/opensips[4201]: [INSERT_CT] New contact [sip:v0011@1.1.188.12:38909;ob] registered for user [v0011@example.domain.com], inserting new branch to ongoing call
Dec 18 09:40:41 opensips /usr/sbin/opensips[4245]: CRITICAL:core:handle_worker: dead child 23 (EOF received), pid 4196
Dec 18 09:40:41 opensips /usr/sbin/opensips[4140]: INFO:core:handle_sigs: child process 4196 exited by a signal 7
Dec 18 09:40:41 opensips /usr/sbin/opensips[4140]: INFO:core:handle_sigs: core was not generated
Dec 18 09:40:41 opensips /usr/sbin/opensips[4140]: INFO:core:handle_sigs: terminating due to SIGCHLD
Here worker with pid 4196 calls t_inject_branches("event"); for same contact twice and after that worker crashes. I have statistics from many crashing, and everywhere i see that one worker twice calls t_inject_branches("event"); for same contact and after that worker is crashes.
Question is - why E_UL_CONTACT_INSERT raises multiple times on same contact. As said in documentation - "This event is raised when a new contact is inserted in any of the existing AOR's contact list.", so imo it should raises once per new contact insertion.
Maybe the event is raised only once, but there are multiple subscribers for it (due notify_on_event
on different calls). Keep in mind that the subscription does not "die" when the corresponding transaction is ended - the subscription to events are transaction independent, they have their lifetime dictated only by the expire
value. So, the even will still trigger the notification (even if the transaction was terminated), but the t_inject_branch() should fail as (1) it does not find the transaction or (2) the found transaction does not accept new branches.
Collecting and compiling information from other reports, we narrow down the possible causes of such crashes to a race condition (on operating over the msg save in shmem in transaction) between"
(a) the process doing the cleanup after an async
or notify_on_event
call
and
(b) another process building a fake request
due running a failure route or inject_branch
Hello guys! I use separate opensips for work this softphones on mobile platforms Android and Ios. Sometimes opensips is crashes, in logs i see:
I have trace and core. I can send core to you in private channel.
opensips -V:
trace-mobile-proxy.txt
Thank you!