Closed jchavanton closed 1 year ago
AFAIK, that's the nature of TCP, i.e., the message receiver should be able to detect SIP message boundaries. Please confirm and provide more info (e.g., PJSIP version, steps to reproduce using the pjsua app, etc.) if the receiver happens to be PJSIP-based.
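For readers unfamiliar with how a receiver finds SIP message boundaries on a TCP stream: RFC 3261 requires Content-Length on stream transports, and the receiver uses it to know where each body ends. The following is a minimal sketch of that idea, not PJSIP's parser. The function names `contentLength` and `extractSipMessages` are hypothetical, and the sketch ignores the compact header form `l:` and any error handling.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Returns the value of the Content-Length header found in `headers`,
// or 0 if the header is absent. Simplified: the compact form "l:" and
// malformed values are not handled.
static std::size_t contentLength(const std::string& headers) {
    std::string lower(headers);
    for (char& c : lower) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    const std::string key = "\r\ncontent-length:";
    std::size_t pos = lower.find(key);
    if (pos == std::string::npos) return 0;
    pos += key.size();
    while (pos < lower.size() && (lower[pos] == ' ' || lower[pos] == '\t')) ++pos;
    std::size_t n = 0;
    while (pos < lower.size() && std::isdigit(static_cast<unsigned char>(lower[pos]))) {
        n = n * 10 + static_cast<std::size_t>(lower[pos] - '0');
        ++pos;
    }
    return n;
}

// Removes every complete SIP message from the front of `buf` and returns
// them in order. Incomplete trailing bytes stay in `buf` until more data
// arrives from the socket.
std::vector<std::string> extractSipMessages(std::string& buf) {
    std::vector<std::string> out;
    for (;;) {
        const std::size_t hdrEnd = buf.find("\r\n\r\n");
        if (hdrEnd == std::string::npos) break;        // headers not complete yet
        const std::size_t bodyStart = hdrEnd + 4;
        const std::size_t total =
            bodyStart + contentLength(buf.substr(0, hdrEnd + 2));
        if (buf.size() < total) break;                 // body not complete yet
        out.push_back(buf.substr(0, total));
        buf.erase(0, total);
    }
    return out;
}
```

With this kind of framing, an INVITE and an ACK that arrive back to back in one TCP segment are still split correctly. What the receiver cannot recover from is a sender that writes the second message before finishing the body of the first.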
Hi,
Notice the corruption with the ACK in the middle of the body.
INVITE sip:+12012271232@sip.sipdomain.com;transport=tcp SIP/2.0
Via: SIP/2.0/TCP 112.15.96.70:3456;rport;branch=z9hG4bKPj4tHlNqYReRoNkf6L59pEs.7ccJWxwimu;alias
Max-Forwards: 70
From: sip:+15144009500@customer-mock.xyz;tag=LvIv94oFW.UDfCtNtTKkO93W.P9LSfOX
To: sip:+12012271232@sip.sipdomain.com
Contact: <sip:+15144009500@112.15.96.70:41533;transport=TCP;ob>
Call-ID: aXi-10AOXRgvOigH98HPvYWCO2gO1ktC
CSeq: 10022 INVITE
Allow: PRACK, INVITE, ACK, BYE, CANCEL, UPDATE, INFO, SUBSCRIBE, NOTIFY, REFER, MESSAGE, OPTIONS
Supported: replaces, 100rel, timer, norefersub
Session-Expires: 1800
Min-SE: 90
Proxy-Authorization: Digest username="367613170", realm="sip.sipdomain.com", nonce="XmkUh15pE1vBV+MCRz4jgE5MCfERV7Kc", uri="sip:+12012271232@sip.sipdomain.com;transport=tcp", response="9ef2d66729d924bafa21
2057e9cb7a", cnonce="dqQzKJHQPBRdHu4NEHDYDC4xIdVvkP1M", qop=auth, nc=00000001
Content-Type: application/sdp
Content-Length: 340
v=0
o=- 3792933339 3792933339 IN IP4 112.15.96.70
s=pjmedia
b=AS:84
t=0 0
ACK sip:+12012271232@sip.sipdomain.com;transport=tcp SIP/2.0
Via: SIP/2.0/TCP 112.15.96.70:3456;rport;branch=z9hG4bKPjAdtpUlfpai5xv4n2inr0KkroIDLpS9TG;alias
Max-Forwards: 70
From: sip:+15144009500@customer-mock.xyz;tag=LvIv94oFW.UDfCtNtTKkO93W.P9LSfOX
To: sip:+12012271232@sip.sipdomain.com;tag=bf8638324618dc61059d4c604476fea1.bb6d
Call-ID: aXi-10AOXRgvOigH98HPvYWCO2gO1ktC
CSeq: 10021 ACK
Content-Length: 0
I tested the latest pjproject master as well as some older versions; I will test further back to isolate potential sources of the problem. I cannot reproduce it on every computer, so I am not sure whether it could be related to a timing issue.
I am testing registration with the application calling itself; I will test only calling out to keep the reproduction as simple as possible.
I understand it may be challenging for you to reproduce, as it may also be caused by the integration in Voip_Patrol; however, the challenge re-INVITE is handled automatically by pjsua.
I will find some time to dig into it further, maybe by debugging it.
I wanted to have a second opinion in case there could be an obvious explanation that I could be missing.
Thank you
I wonder if it could be related to the way I am using the transports: I create UDP/TCP/TLS transports and explicitly select one in the account before making a call. Maybe I should let pjsua do it automatically ...
However pjsua logs are not reporting anything wrong as far as I can tell ...
Example: When making a call, I leave the transport selection to pjsua:
call->makeCall("sip:"+callee+";transport=tcp", prm, to_uri);
When registering over a different transport, I force the transport by modifying the account:
if (transport == "tcp") {
acc_cfg.sipConfig.transportId = config->transport_id_tcp;
...
acc->modify(acc_cfg);
Also, I am assuming that transports can be shared between accounts. I think I double-checked this in the default app.
I tested very old versions and I can still reproduce on some new hosts, so it could be some library update. I will dig into it further.
Register on one account overlapping a makeCall on another account using the same transport:
[06:02:58.382][INFO] createAccount: [1][sip:user1@sip.test-domain.com]
[06:02:58.381][INFO] do_register >> sip:user1@sip.test-domain.com
[06:02:58.382][INFO] createAccount: [2][sip:user2@customer-mock.xyz;transport=tcp]
[06:02:58.382][INFO] makeCall Fast-Start: flag:4 PJSUA_CALL_NO_SDP_OFFER:8
[06:02:58.402][INFO] onCallState: [sip:user2@customer-mock.xyz;transport=tcp]
[06:02:58.402][INFO] onCallState: [0]role[CALLER]id[TU6Eo8NayvRwsaLQMoSZO8eUKshzEr4V][sip:user2@customer-mock.xyz;transport=tcp][sip:+14203011035@sip.test-domain.com;transport=tcp][CALLING|1]
[06:02:58.402][INFO] onCallTsxState: [0][sip:+14503001085@sip.test-domain.com;transport=tcp][CALLING]id[TU6Eo8NayvRwsaLQMoSZO8eUKshzEr4V] call[0] reason[]
[06:02:58.402][INFO] do_wait duration_ms:0 complete all tests:1
[06:02:58.566][INFO] onCallTsxState: [0][sip:+14503001085@sip.test-domain.com;transport=tcp][CALLING]id[TU6Eo8NayvRwsaLQMoSZO8eUKshzEr4V]
[06:02:58.679][INFO] [Register] code:200
Should I revisit sharing the transport between accounts? I wonder why this would not be safe.
It could be worth noting that the same ephemeral port is used for both the register and the makeCall, even though they are on different accounts.
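To illustrate why overlapping sends on one shared connection could matter in principle: if two senders write to the same byte stream, each complete message has to reach the stream as one uninterrupted unit. The sketch below is a generic illustration of that technique and makes no claim about how pjsip behaves or about the cause of this issue. `SharedStream` is a hypothetical stand-in for a TCP connection, and the 16-byte chunking only simulates partial writes.

```cpp
#include <algorithm>
#include <cstddef>
#include <mutex>
#include <string>

// Hypothetical stand-in for one shared TCP connection. `wire_` records the
// bytes in the order they would reach the socket.
class SharedStream {
public:
    // Writes one complete SIP message. The lock is held until the last
    // chunk is appended, so another caller cannot slip its own message
    // into the middle of this one.
    void sendMessage(const std::string& msg) {
        std::lock_guard<std::mutex> guard(mutex_);
        const std::size_t chunk = 16;  // simulate partial writes
        for (std::size_t off = 0; off < msg.size(); off += chunk) {
            wire_.append(msg, off, std::min(chunk, msg.size() - off));
        }
    }

    // Returns a copy of everything written so far.
    std::string wire() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return wire_;
    }

private:
    mutable std::mutex mutex_;
    std::string wire_;
};
```

If the lock were taken per chunk instead of per message, a REGISTER and an INVITE sent concurrently could end up interleaved on the wire, which would look similar to the trace in this report.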
Have you managed to test with the latest version?
Sharing the transport should be fine and in fact it is the library's default behavior to reuse a transport whenever possible.
Another useful investigative tool here is to use packet capture to check if the packet itself already arrives in a mangled state.
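Once the TCP payload has been exported from a capture, a rough way to check for this kind of mangling is to look for a SIP start line inside the bytes that the Content-Length header claims as body. The helper below is a sketch under that assumption; the function names are hypothetical, and a body type that legitimately carries SIP text (for example `message/sipfrag`) would trigger a false positive.

```cpp
#include <sstream>
#include <string>

// True if `line` looks like a SIP response start line ("SIP/2.0 200 OK")
// or a SIP request start line ("ACK sip:... SIP/2.0").
bool looksLikeSipStartLine(const std::string& line) {
    const std::string prefix = "SIP/2.0 ";
    const std::string suffix = " SIP/2.0";
    if (line.compare(0, prefix.size(), prefix) == 0) return true;
    return line.size() > suffix.size() &&
           line.compare(line.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Scans a message body line by line and reports whether any line looks
// like the start of another SIP message.
bool bodyContainsSipStartLine(const std::string& body) {
    std::istringstream in(body);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // strip CR
        if (looksLikeSipStartLine(line)) return true;
    }
    return false;
}
```

Applied to the SDP body of the INVITE in this report, such a check would flag the `ACK ... SIP/2.0` line that appears after `t=0 0`.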
Is it possible to use one socket per account? Or do I have to create a new transport?
@jchavanton I am facing automatic call drops due to a NAT issue. A private IP is sent in the Contact header; because of that, TCP sessions are closed and the port changes, and my ongoing VoIP call is dropped automatically via a BYE request. Do you have any solution for this? Or is my issue the same as the one you posted here?
I searched for my issue and I think it is happening due to loss of the ACK while using a private IP. Reference link: https://blog.opensips.org/2017/02/22/troubleshooting-missing-ack-in-sip/
Please tell me a solution for this if you can help me.
Hi, to be related to this issue the packet/message content must be corrupted, as if a char buffer had overflowed or been overwritten. Did you find any such evidence in a trace?
@jchavanton I have implemented the SDK for this and am using it in two different apps, but I am facing this issue in only one of them. The process for both is the same. Below I have attached the trace logs for both the working app and the non-working app. Can you please look into this and share your thoughts if you find any issue regarding a buffer overflow/overwrite?
If it is the same issue, you will see it in a network capture, not in the pjsip logs. "facing automatic call drops issue due to NAT Issue" does not seem likely to be the same issue.
Since the issue has been open for quite a while, let us know if it still persists in the latest version.
Describe the bug: When receiving a 407 over TCP, the ACK sent by pjsip is corrupted and mixed with the next INVITE. Maybe some synchronization issue is taking place.
To Reproduce: Reproduced only in voip_patrol (a test application using pjsua2).
I will troubleshoot it further, but I wanted to report it here to see if I could get any insight.
Logs/Screenshots: It is always the ACK that is affected; here we can see that this block is sent in one chunk and overlaps at least 2 messages.
pjsua logs are not reporting any problem