OpenSIPS is a GPL implementation of a multi-functionality SIP Server that targets to deliver a high-level technical solution (performance, security and quality) to be used in professional SIP server platforms.
Hello collegues! Sometimes opensips not correctly execute the procedure "www_authorize()". Sometimes different workers locks and have collision with other workers.
What i mean, i have this code:
I use "log_level=4" in global parameters. In logs i see these:
1) Collision between workers. Collisions manifested in queries to the database and responses:
Feb 6 02:02:07 reg-2 /usr/sbin/opensips[10348]: DBG:db_postgres:db_postgres_submit_query: 0x7f58fe5a9cb0 PQsendQuery(select ha1,id1,id2 from subscriber where username='admin' AND domain='domain.com')
Feb 6 02:02:07 reg-2 /usr/sbin/opensips[10347]: DBG:db_postgres:db_postgres_store_result: 0x7f58fe5a9cb0 PQresultStatus(PGRES_TUPLES_OK) PQgetResult(0x1b94ed0)
Feb 6 02:02:07 reg-2 /usr/sbin/opensips[10347]: DBG:db_postgres:db_postgres_convert_rows: PQgetvalue(0x7f58fe5a9cb0,0,0)=[hash]
Feb 6 02:02:07 reg-2 /usr/sbin/opensips[10347]: DBG:db_postgres:db_postgres_convert_rows: PQgetvalue(0x7f58fe5a9cb0,0,1)=[xxy]
Feb 6 02:02:07 reg-2 /usr/sbin/opensips[10347]: DBG:db_postgres:db_postgres_convert_rows: PQgetvalue(0x7f58fe5a9cb0,0,2)=[yyy]
Here you may see that with 0x7f58fe5a9cb0 works different workers, 10348 send request to DB and 10347 get response from DB.
2) About worker 10347. Here logs:
Feb 6 02:01:14 reg-2 /usr/sbin/opensips[10347]: [MAIN] Incoming request (M=REGISTER from 10.10.10.207:5060:udp)#010REGISTER sip:1.1.1.40 SIP/2.0#013#010Via: SIP/2.0/UDP 10.10.10.207:5060;branch=z9hG4bKf121.6609c175.0#013#010Via: SIP/2.0/UDP 100.120.136.231:55835;rport=6033;received=5.62.41.36;branch=z9hG4bK21619082#013#010Max-Forwards: 69#013#010From: <sip:35043@1.1.1.40>;tag=110056165#013#010To: <sip:35043@1.1.1.40>#013#010Call-ID: 625861171-1983586169-1178258341#013#010CSeq: 2 REGISTER#013#010Contact: <sip:35043@100.120.136.231:55835>#013#010Content-Length: 0#013#010Authorization: Digest username="35043",uri="sip:1.1.1.40",algorithm=MD5,realm="1.1.1.40",nonce="nonce",response="response",qop=auth,nc=00000001, cnonce="cnonce"#013#010User-Agent: pplsip#013#010Path: <sip:10.10.10.207;lr;received=sip:5.62.41.36:6033;x-sock=udp:1.1.1.40:5060>
Feb 6 02:01:14 reg-2 /usr/sbin/opensips[10347]: DEBUG-1(M=REGISTER F=sip:35043@1.1.1.40 T=sip:35043@1.1.1.40 ID=625861171-1983586169-1178258341)
Feb 6 02:02:07 reg-2 /usr/sbin/opensips[10347]: DBG:core:db_new_result: allocate 48 bytes for result set at 0x7f58fe5ae7d8
Feb 6 02:02:07 reg-2 /usr/sbin/opensips[10347]: DBG:db_postgres:db_postgres_store_result: 0x7f58fe5a9cb0 PQresultStatus(PGRES_TUPLES_OK) PQgetResult(0x1b94ed0)
Feb 6 02:02:07 reg-2 /usr/sbin/opensips[10347]: DBG:db_postgres:db_postgres_get_columns: 3 columns returned from the query
Feb 6 02:02:07 reg-2 /usr/sbin/opensips[10347]: DBG:core:db_allocate_columns: allocate 84 bytes for result columns at 0x7f58fe5af5c8
Feb 6 02:02:07 reg-2 /usr/sbin/opensips[10347]: DBG:db_postgres:db_postgres_convert_rows: PQgetvalue(0x7f58fe5a9cb0,0,0)=[hash]
Feb 6 02:02:07 reg-2 /usr/sbin/opensips[10347]: DBG:db_postgres:db_postgres_convert_rows: PQgetvalue(0x7f58fe5a9cb0,0,1)=[xxy]
Feb 6 02:02:07 reg-2 /usr/sbin/opensips[10347]: DBG:db_postgres:db_postgres_convert_rows: PQgetvalue(0x7f58fe5a9cb0,0,2)=[yyy]
Here you may see that worker 10347 received REGISTER in 02:01:14, print message "DEBUG-1" (because request got into route AUTHORIZE) and was blocked for a ~50 seconds. After unloking 10347 got answer from DB for another worker (10348) and continue working.
3) About worker 10348. Here logs:
Feb 6 02:02:07 sip-reg-test-2 /usr/sbin/opensips[10348]: [MAIN] Incoming request (M=REGISTER from 10.10.10.207:5060:udp)#010REGISTER sip:domain.com SIP/2.0#013#010Via: SIP/2.0/UDP 10.10.10.207:5060;branch=z9hG4bKa961.cd044656.0#013#010Via: SIP/2.0/UDP 2.2.2.255:8935;received=2.2.2.255;rport=8935;branch=z9hG4bKPj5923790eabc646378ccdec82acb21f65#013#010Route: <sip:domain.com;lr>#013#010Max-Forwards: 69#013#010From: <sip:admin@domain.com>;tag=8bcc90f5a02242a2954d89d7bcdb701f#013#010To: <sip:admin@domain.com>#013#010Call-ID: bdcc7b2cfb734f9e89d775345273329c#013#010CSeq: 18512 REGISTER#013#010User-Agent: TEST-Office#013#010Contact: <sip:admin@2.2.2.255:8935;ob>#013#010Expires: 30#013#010Allow: PRACK, INVITE, ACK, BYE, CANCEL, UPDATE, INFO, SUBSCRIBE, NOTIFY, REFER, MESSAGE, OPTIONS#013#010Authorization: Digest username="admin", realm="domain.com", nonce="5c5a160d00006c5d4ebf875c8ddb5468bb0ae0f6dfccd581", uri="sip:domain.com", response="resonse", cnonce="cnonce", qop=auth, nc=00000001#013#010Content-Length: 0#013#010Path: <sip:10.10.10.207;lr;received=sip:2.2.2.255:8935;x-sock=udp:1.1.1.40>
Feb 6 02:02:07 sip-reg-test-2 /usr/sbin/opensips[10348]: DBG:core:parse_headers: flags=78
...
Feb 6 02:02:07 sip-reg-test-2 /usr/sbin/opensips[10348]: DBG:auth:check_nonce: [NONCE] and [NONCE]
Feb 6 02:02:07 sip-reg-test-2 /usr/sbin/opensips[10348]: DBG:db_postgres:db_postgres_val2str: PQescapeStringConn: in: 15 chars, out: 15 chars
Feb 6 02:02:07 sip-reg-test-2 /usr/sbin/opensips[10348]: DBG:db_postgres:db_postgres_val2str: PQescapeStringConn: in: 29 chars, out: 29 chars
Feb 6 02:02:07 sip-reg-test-2 /usr/sbin/opensips[10348]: DBG:db_postgres:db_postgres_submit_query: 0x7f58fe5a9cb0 PQsendQuery(select ha1,abonent_id,product_id from subscriber where username='admin' AND domain='domain.com')
Feb 6 02:02:14 sip-reg-test-2 /usr/sbin/opensips[10348]: DEBUG-AUTH-ERROR-03 avp(product_id)=<null> avp(abonent_id)=<null> (M=REGISTER F=sip:admin@domain.com T=sip:admin@domain.com ID=bdcc7b2cfb734f9e89d775345273329c)
Feb 6 02:02:14 sip-reg-test-2 /usr/sbin/opensips[10348]: [AUTHORIZE] Result: invalid user (M=REGISTER F=sip:admin@domain.com T=sip:admin@domain.com ID=bdcc7b2cfb734f9e89d775345273329c)
Here you may see that worker 10348 send query to DB, but response intercepted another worker (10347).
Questions:
1) Why 10347 is locks in www_authorize()?
2) Why 10347 intercepted resonse from DB for another worker?
After restarting opensips the problem disappears for a some time, but when starts reproduced again.
Please, help with investigating this problem. Thank you!
@ivanoff946 , to me it looks like something funky at DB level - are you connecting directly into a postgres server ? have you tried to use different DB engine (sqlite or mysql) ?
Hello collegues! Sometimes opensips not correctly execute the procedure "www_authorize()". Sometimes different workers locks and have collision with other workers. What i mean, i have this code:
I use "log_level=4" in global parameters. In logs i see these: 1) Collision between workers. Collisions manifested in queries to the database and responses:
Here you may see that with 0x7f58fe5a9cb0 works different workers, 10348 send request to DB and 10347 get response from DB. 2) About worker 10347. Here logs:
Here you may see that worker 10347 received REGISTER in 02:01:14, print message "DEBUG-1" (because request got into route AUTHORIZE) and was blocked for a ~50 seconds. After unloking 10347 got answer from DB for another worker (10348) and continue working.
3) About worker 10348. Here logs:
Here you may see that worker 10348 send query to DB, but response intercepted another worker (10347).
Modules:
opensips -V:
Questions: 1) Why 10347 is locks in www_authorize()? 2) Why 10347 intercepted resonse from DB for another worker? After restarting opensips the problem disappears for a some time, but when starts reproduced again. Please, help with investigating this problem. Thank you!