Open ohnoov opened 1 month ago
It looks like the missing locking (or rather unbalanced and broken locking) is in this function, in the case where reg_mode is not MID_REG_THROTTLE_AOR.
https://github.com/OpenSIPS/opensips/blob/master/modules/mid_registrar/lookup.c#L50
There's a ul.unlock_udomain call at the end, but no corresponding call to ul.lock_udomain before the function starts accessing usrloc records for the domain.
https://github.com/OpenSIPS/opensips/blob/master/modules/mid_registrar/lookup.c#L148C2-L148C29
I'll try adding ul.lock_udomain somewhere in there to see if it helps.
Hm, get_ucontact_from_id keeps the AoR locked if it finds something. So it's not this.
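The convention described above (a finder that returns with the lock still held on success, leaving the caller to do only the unlock) can be sketched in isolation. This is a minimal illustration, not OpenSIPS code: struct domain, domain_lock, domain_unlock, find_contact_locked and lookup are all hypothetical names, and a C11 atomic_flag spinlock stands in for the real usrloc lock.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a usrloc domain; not the OpenSIPS struct. */
struct domain {
    atomic_flag lock;
    int held;              /* bookkeeping for the demo only */
    const char *ids[4];    /* contact ids stored in the domain */
    size_t n;
};

static void domain_init(struct domain *d)
{
    atomic_flag_clear(&d->lock);
    d->held = 0;
}

static void domain_lock(struct domain *d)
{
    while (atomic_flag_test_and_set(&d->lock))
        ;                  /* spin until the lock is free */
    d->held = 1;
}

static void domain_unlock(struct domain *d)
{
    d->held = 0;
    atomic_flag_clear(&d->lock);
}

/* Returns the contact index and leaves the domain LOCKED on success.
 * Returns -1 with the lock already released when nothing is found. */
static int find_contact_locked(struct domain *d, const char *id)
{
    domain_lock(d);
    for (size_t i = 0; i < d->n; i++)
        if (strcmp(d->ids[i], id) == 0)
            return (int)i;         /* lock intentionally kept */
    domain_unlock(d);
    return -1;
}

/* The caller only unlocks: the matching lock was taken by the finder. */
static int lookup(struct domain *d, const char *id)
{
    int idx = find_contact_locked(d, id);
    if (idx < 0)
        return -1;                 /* nothing is locked here */
    /* ... work with the record while the lock is held ... */
    domain_unlock(d);
    return idx;
}
```

Read this way, an unlock call without a visible lock call in the same function is balanced after all, as long as every early-exit path between the finder and the unlock also releases the lock.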
A few more observations:
The function in question is ul_contact_event_to_msg, whose purpose seems to be to fill a newly created SIP INVITE branch request message with values from the AVPs passed to t_inject_branch, from which ul_contact_event_to_msg is called.
E.g. in mirror mode, I perform simultaneous calls to 6 contacts with usernames test0 to test5, and a particular worker logs this output from the functions mentioned above:
tm:t_inject_branch: t_inject_branch:
callid=Call-ID: e0bb98dcf3fdea880a59aad80
to=To: <sip:test4@127.0.0.1:5067;ctid=3518437208893134>
tm:ul_contact_event_to_msg: injecting new branch:
uri=<sip:test5@127.0.0.1:7805;+sip.pnsreg>, received=<sip:127.0.0.1:7805>,path=<>, qval=-1, socket=<udp:127.0.0.1:7676>, bflags...
In t_inject_branch I added a log statement for t->to. I'd expect the username in t->to to match the username in the R-URI selected for the new branch.
So somehow, sometimes when using multiple workers, the wrong list of AVPs is selected when t_inject_branch is called.
All right, this patch fixes the issue. The cause is described in the patch.
https://megous.com/dl/tmp/0001-Prevent-overlapping-modifications-of-pn_ebr_filters-.patch
There are other options, like allocating the template per worker, etc.
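To illustrate the general hazard behind overlapping modifications of a shared template, and the "one copy per user" alternative, here is a minimal sketch. It does not reproduce the patch or the real pn_ebr_filters structure: struct filter, filter_template, shared_filter, fill_shared and fill_private are hypothetical names, and the interleaving of two workers is simulated with sequential calls rather than real processes.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical filter: a key plus a per-request value. */
struct filter {
    char key[16];
    char val[32];
};

/* Read-only template that is never modified. */
static const struct filter filter_template = { "contact_id", "" };

/* Single instance that every worker would fill in place (the hazard). */
static struct filter shared_filter = { "contact_id", "" };

/* Unsafe: writes the request value into the shared instance, so a
 * second caller overwrites what the first caller is still using. */
static struct filter *fill_shared(const char *val)
{
    snprintf(shared_filter.val, sizeof shared_filter.val, "%s", val);
    return &shared_filter;
}

/* Safer: each caller works on its own copy of the template. */
static struct filter fill_private(const char *val)
{
    struct filter f = filter_template;
    snprintf(f.val, sizeof f.val, "%s", val);
    return f;
}
```

If one caller fills the shared instance for test4 and a second caller fills it for test5 before the first has finished using it, the first one ends up with test5's value, which resembles the mismatch between t->to and the injected R-URI in the log above. With fill_private each caller keeps its own value. Serializing access with a lock is the other obvious option.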
I won't open a pull request because GitHub would force me into 2FA, which I dislike, so please either use the patch above or the suggestion above.
Any updates here? No progress has been made in the last 15 days, marking as stale. Will close this issue if no further updates are made in the next 30 days.
Yes, the analysis and a proposed patch are available above. This issue is not stale.
OpenSIPS version you are running
Describe the bug
To Reproduce
I don't think so. It's a complex setup that requires a mobile application, some push notification setup, etc. Even I can't reproduce it simply and reliably, because it requires re-REGISTER to come at the same time from multiple devices.
I'll try to eyeball the code, but I guess it will be missing or incorrect locking around access to shared data (maybe usrloc).
Expected behavior
All INVITEs being sent to proper contacts.
OS/environment information
Probably irrelevant, but Debian 12 + manual opensips build.