drachtio / drachtio-freeswitch-modules

A collection of open-sourced freeswitch modules that I use in various drachtio applications
MIT License
176 stars 119 forks source link

Some calls failing with media timeout #122

Open sachinlevel opened 1 year ago

sachinlevel commented 1 year ago

Hi Dave,

We are using this module (https://github.com/drachtio/drachtio-freeswitch-modules/blob/main/examples/google_transcribe.js) to trancribe calls, but some calls from drachtio to freeswitch are failing with media timeout, also i am not setting up this variable, so it should be using its default value, what could be possibly wrong and how can we correct it ?

it fails while connecting to FS, => ms.connectCaller(req, res)

/usr/local/freeswitch/log# grep -i "Timeout" freeswitch.log

2023-06-07 06:03:17.689310 [WARNING] sofia.c:5227 rtp-timeout-sec deprecated use media_timeout variable. 2023-06-07 06:03:17.689334 [WARNING] sofia.c:5234 rtp-hold-timeout-sec deprecated use media_hold_timeout variable. 2023-06-30 04:28:16.337857 [WARNING] sofia.c:5227 rtp-timeout-sec deprecated use media_timeout variable. 2023-06-30 04:28:16.337878 [WARNING] sofia.c:5234 rtp-hold-timeout-sec deprecated use media_hold_timeout variable.

db492d19-0caf-49df-8e99-330769961f85 2023-06-30 17:39:16.083379 [NOTICE] switch_core_media.c:2935 Hangup sofia/drachtio_mrf/nobody@X.X.X.X:5060 [CS_EXECUTE] [MEDIA_TIMEOUT] Reason: SIP;cause=604;text="MEDIA_TIMEOUT" c93129ab-bf73-49ea-832a-ee06b371de7e 2023-07-01 03:05:46.763366 [NOTICE] switch_core_media.c:2935 Hangup sofia/drachtio_mrf/nobody@X.X.X.X:5060 [CS_EXECUTE] [MEDIA_TIMEOUT] Reason: SIP;cause=604;text="MEDIA_TIMEOUT" 2fe67026-d5a4-4c4f-bb8d-42b66fb11db7 2023-07-01 03:05:46.783374 [NOTICE] switch_core_media.c:2935 Hangup sofia/drachtio_mrf/nobody@X.X.X.X:5060 [CS_EXECUTE] [MEDIA_TIMEOUT] Reason: SIP;cause=604;text="MEDIA_TIMEOUT" 69c38a74-8e67-47ce-ab33-53986d46d42f 2023-07-06 17:50:01.263340 [NOTICE] switch_core_media.c:2935 Hangup sofia/drachtio_mrf/nobody@X.X.X.X:5060 [CS_EXECUTE] [MEDIA_TIMEOUT] Reason: SIP;cause=604;text="MEDIA_TIMEOUT" 2c421ff1-a6ea-4589-8c2b-9eacbd532689 2023-07-06 17:50:01.283335 [NOTICE] switch_core_media.c:2935 Hangup sofia/drachtio_mrf/nobody@X.X.X.X:5060 [CS_EXECUTE] [MEDIA_TIMEOUT] Reason: SIP;cause=604;text="MEDIA_TIMEOUT" 05aef738-d580-42d6-ac27-ff37ee356654 2023-07-07 21:01:24.303403 [NOTICE] switch_core_media.c:2935 Hangup sofia/drachtio_mrf/nobody@X.X.X.X:5060 [CS_EXECUTE] [MEDIA_TIMEOUT] Reason: SIP;cause=604;text="MEDIA_TIMEOUT" 4606db18-b9cc-426a-9cf1-0c08324adcd8 2023-07-07 21:01:24.383350 [NOTICE] switch_core_media.c:2935 Hangup sofia/drachtio_mrf/nobody@X.X.X.X:5060 [CS_EXECUTE] [MEDIA_TIMEOUT] Reason: SIP;cause=604;text="MEDIA_TIMEOUT" 64425d08-b0ff-4a33-bfbb-ef6abec66514 2023-07-18 19:26:30.303360


Reason: SIP;cause=604;text="MEDIA_TIMEOUT"
davehorton commented 1 year ago

what version of freeswitch are you using and what linux distribution?

sachinlevel commented 1 year ago

we are using this docker file https://github.com/drachtio/docker-drachtio-freeswitch-mrf/blob/main/Dockerfile

FreeSWITCH Version 1.10.5-release+git~20200818T185121Z~25569c1631~64bit (git 25569c1 2020-08-18 18:51:21Z 64bit)

sachinlevel commented 1 year ago

on a ubuntu machine vm

davehorton commented 1 year ago

OK. media timeout from the freeswitch side means that it did not receive any RTP packets from the far end. It would be necessary to be able to see all the SIP signaling for a failed call, and possibly a network pcap as well. But generally this means that the far end was not able to send packets to the advertised freeswitch address. If this is intermittent I would check to be sure you have opened the full rtp port range on your server to allow incoming traffic to all of the udp ports that freeswitch is listening on

sachinlevel commented 1 year ago

Hi Dave, i am not an expert with freeswitch, so we are using all the default configurations, how can we check what all ports are open for rtp ? for drachtio we have 50XX tls port open. and yes this is intermittent issue with a few calls.

davehorton commented 1 year ago

you will find the rtp min and max port range in /usr/local/freeswitch/config/autoload_configs/switch.conf.xml

In the image you are using it should be 25000-39000/udp, so make sure you are allowing all traffic in to those ports

sachinlevel commented 1 year ago

So, this is what you mean by allowing traffic to this fs docker instance right ?

version: '3'

services:
  your_service_name:
    image: your_image_name:tag
    ports:
      - "25000-39000: 25000-39000/udp"
davehorton commented 1 year ago

no, I mean the server itself also needs to be configured to allow traffic in to those ports.

As far as what you have above which seems to be from a docker-compose file, it is correct as far as telling docker how to direct traffic that does make it in to those ports, but I would also comment that I have seen docker have problems and be very slow when configured to map thousands of ports like that.

sachinlevel commented 1 year ago

yeah my FS server allows the traffic to these port and docker is using network mode host, so you mean i need to check for these port allowed in VM or the source from where i am receiving calls ?

davehorton commented 1 year ago

correct

sachinlevel commented 1 year ago

m=audio 30788 RTP/SAVP 0 101\r m=audio 21608 RTP/SAVP 0 101\r m=audio 21936 RTP/SAVP 0 101\r

i can see that some of the failed calls belong to the ports which are below or above the said range, how can i update this range in my configs should i add this to vars file ? with new ports but its also failing for one port that is within the range.

<param name="rtp-start-port" value="XXXX"/>
<param name="rtp-end-port" value="YYYY"/>
davehorton commented 1 year ago

I can't be sure but I think what you reprinted above is from the SDP offered by the far end? It is not relevant what ports they are sending from.

sachinlevel commented 1 year ago

HI Dave, yes above is from SDP, i thought these are target ports, where can i get those ? got some logs from FS, but its just saying dropping media packets and then TIMEOUT.

INVITE sip:drachtio@X.X.X.X:5070 SIP/2.0
Via: SIP/2.0/UDP X.X.X.X;rport;branch=z9hG4bKS9p98pajpD28S
Max-Forwards: 70
From: <sip:X.X.X.X:5060>;tag=HeStvtD9p31XQ
To: <sip:drachtio@X.X.X.X:5070>
Call-ID: 2cc7b100-920f-123c-ed99-0242ac130002
CSeq: 69519435 INVITE
Contact: <sip:X.X.X.X:5060>
User-Agent: drachtio-fsmrf:eac88f44-56db-42e2-be5a-f0decb926e03
Content-Type: application/sdp
Content-Length: 705
X-esl-outbound: 10.138.0.128:41991

v=0
o=FreeSWITCH 1688124377 1688124378 IN IP4 Y.Y.Y.Y
s=FreeSWITCH
c=IN IP4 Y.Y.Y.Y
t=0 0
m=audio 22078 RTP/SAVP 0 101
a=rtpmap:0 PCMU/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
a=sendonly
a=crypto:1 AEAD_AES_256_GCM_8 inline:9vTmp+s5lA/geyXRFaiUpSBTxMLJmPRYVvXicsCSVqnionzq9cbktKmHXuc=
a=crypto:3 AEAD_AES_128_GCM_8 inline:Al3cx8cZHRsyBF1uwYlFfuxJ8hVwnOz0vcTP/A==
a=crypto:5 AES_256_CM_HMAC_SHA1_80 inline:iPPr7EAISxAFDFJ/zgiLQExrQDt3ank3ilh+wKtQqBhtT/lep2JDxOVm5bmDHA==
a=crypto:6 AES_192_CM_HMAC_SHA1_80 inline:pshcum3EWtkgo60uXRoa8q/VAdz5gN6VVrkiRYNt2bzMe3CzXEs=
a=crypto:7 AES_CM_128_HMAC_SHA1_80 inline:aKwvN1je+uj5cmdDr8w4vN67tRu2oFyP6U61VMpS
a=ptime:20

db492d19-0caf-49df-8e99-330769961f85 2023-06-30 17:34:15.183358 [NOTICE] switch_channel.c:1118 New Channel sofia/drachtio_mrf/nobody@X.X.X.X:5060 [db492d19-0caf-49df-8e99-330769961f85]
send 358 bytes to udp/[X.X.X.X]:5060 at 17:34:15.196634:
------------------------------------------------------------------------
SIP/2.0 100 Trying
Via: SIP/2.0/UDP X.X.X.X;rport=5060;branch=z9hG4bKS9p98pajpD28S
From: <sip:X.X.X.X:5060>;tag=HeStvtD9p31XQ
To: <sip:drachtio@X.X.X.X:5070>
Call-ID: 2cc7b100-920f-123c-ed99-0242ac130002
CSeq: 69519435 INVITE
User-Agent: FreeSWITCH-mod_sofia/1.10.5-release+git~20200818T185121Z~25569c1631~64bit
Content-Length: 0

db492d19-0caf-49df-8e99-330769961f85 2023-06-30 17:34:15.183358 [NOTICE] sofia_media.c:92 Pre-Answer sofia/drachtio_mrf/nobody@X.X.X.X:5060!
send 1115 bytes to udp/[X.X.X.X]:5060 at 17:34:15.200368:
------------------------------------------------------------------------
SIP/2.0 200 OK
Via: SIP/2.0/UDP X.X.X.X;rport=5060;branch=z9hG4bKS9p98pajpD28S
From: <sip:X.X.X.X:5060>;tag=HeStvtD9p31XQ
To: <sip:drachtio@X.X.X.X:5070>;tag=K36jDUpS7rppj
Call-ID: 2cc7b100-920f-123c-ed99-0242ac130002
CSeq: 69519435 INVITE
Contact: <sip:drachtio@X.X.X.X:5070;transport=udp>
User-Agent: FreeSWITCH-mod_sofia/1.10.5-release+git~20200818T185121Z~25569c1631~64bit
Accept: application/sdp
Allow: INVITE, ACK, BYE, CANCEL, OPTIONS, MESSAGE, INFO, UPDATE, REGISTER, REFER, NOTIFY
Supported: timer, path, replaces
Allow-Events: talk, hold, conference, refer
Content-Type: application/sdp
Content-Disposition: session
Content-Length: 334
Remote-Party-ID: "drachtio" <sip:drachtio@X.X.X.X>;party=calling;privacy=off;screen=no

v=0
o=FreeSWITCH 1688111535 1688111536 IN IP4 X.X.X.X
s=FreeSWITCH
c=IN IP4 X.X.X.X
t=0 0
m=audio 34920 RTP/SAVP 0 101
a=rtpmap:0 PCMU/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
a=recvonly
a=ptime:20
a=crypto:1 AEAD_AES_256_GCM_8 inline:0svnrImKjtYj5wQ5ljVlitGep5DJTo9eZjq3fHM1Z1oEDurALCJFQSzyxeM=

db492d19-0caf-49df-8e99-330769961f85 2023-06-30 17:34:15.183358 [NOTICE] mod_dptools.c:1406 Channel [sofia/drachtio_mrf/nobody@X.X.X.X:5060] has been answered

db492d19-0caf-49df-8e99-330769961f85 2023-06-30 17:34:15.183358 [NOTICE] mod_event_socket.c:452 Trying host: 10.138.0.128:41991
recv 338 bytes from udp/[X.X.X.X]:5060 at 17:34:15.203587:
------------------------------------------------------------------------
ACK sip:drachtio@X.X.X.X:5070;transport=udp SIP/2.0
Via: SIP/2.0/UDP X.X.X.X;rport;branch=z9hG4bKtjg2ajUNKprUN
Max-Forwards: 70
From: <sip:X.X.X.X:5060>;tag=HeStvtD9p31XQ
To: <sip:drachtio@X.X.X.X:5070>;tag=K36jDUpS7rppj
Call-ID: 2cc7b100-920f-123c-ed99-0242ac130002
CSeq: 69519435 ACK
Content-Length: 0

db492d19-0caf-49df-8e99-330769961f85 2023-06-30 17:35:11.563357 [ERR] audio_stream_adapter.cpp:355 (66) dropping packets!
db492d19-0caf-49df-8e99-330769961f85 2023-06-30 17:36:05.063358 [ERR] audio_stream_adapter.cpp:355 (66) dropping packets!
db492d19-0caf-49df-8e99-330769961f85 2023-06-30 17:36:59.923394 [ERR] audio_stream_adapter.cpp:355 (66) dropping packets!
db492d19-0caf-49df-8e99-330769961f85 2023-06-30 17:37:53.943431 [ERR] audio_stream_adapter.cpp:355 (66) dropping packets!
db492d19-0caf-49df-8e99-330769961f85 2023-06-30 17:38:46.883364 [ERR] audio_stream_adapter.cpp:355 (66) dropping packets!
db492d19-0caf-49df-8e99-330769961f85 2023-06-30 17:39:16.083379 [NOTICE] switch_core_media.c:2935 Hangup sofia/drachtio_mrf/nobody@X.X.X.X:5060 [CS_EXECUTE] [MEDIA_TIMEOUT]
send 575 bytes to udp/[X.X.X.X]:5060 at 17:39:16.099107:
------------------------------------------------------------------------
BYE sip:X.X.X.X:5060 SIP/2.0
Via: SIP/2.0/UDP X.X.X.X:5070;rport;branch=z9hG4bKv2Zm84y7BDvjH
Max-Forwards: 70
From: <sip:drachtio@X.X.X.X:5070>;tag=K36jDUpS7rppj
To: <sip:X.X.X.X:5060>;tag=HeStvtD9p31XQ
Call-ID: 2cc7b100-920f-123c-ed99-0242ac130002
CSeq: 69519586 BYE
User-Agent: FreeSWITCH-mod_sofia/1.10.5-release+git~20200818T185121Z~25569c1631~64bit
Allow: INVITE, ACK, BYE, CANCEL, OPTIONS, MESSAGE, INFO, UPDATE, REGISTER, REFER, NOTIFY
Supported: timer, path, replaces
Reason: SIP;cause=604;text="MEDIA_TIMEOUT"
Content-Length: 0

db492d19-0caf-49df-8e99-330769961f85 2023-06-30 17:39:16.103387 [NOTICE] switch_core_session.c:1744 Session 68 (sofia/drachtio_mrf/nobody@X.X.X.X:5060) Ended
db492d19-0caf-49df-8e99-330769961f85 2023-06-30 17:39:16.103387 [NOTICE] switch_core_session.c:1748 Close Channel sofia/drachtio_mrf/nobody@X.X.X.X:5060 [CS_DESTROY]
sachinlevel commented 1 year ago

also the error message that i get here ms.connectCaller(req, res) is Connection timeout on the node application looks like timeout waiting for esl connection after invite to FS answered from drachtio-fsmrf code base. looks like its failing in connecting to freeswitch.

davehorton commented 1 year ago

what is audio_stream_adapter.cpp ? I don't recognize it as any code in this repo

davehorton commented 1 year ago

as far as ms.connectCaller timing out, that means that freeswitch was unable to make an outbound tcp socket connection to the drachtio-fsmrf application. Can you show me a freeswitch log for that error, including the sip messages

sachinlevel commented 1 year ago

@davehorton this issue is intermittent, we are unable to reproduce it, looking for logs but is there anything we can try that could help solving this issue ? or any possible reason since 2-3 calls out of 100s are failing

sachinlevel commented 1 year ago

Logs for Freeswitch and drachtio for a faulty call

https://gist.github.com/sachinlevel/c31a9b6c8337caf8f3f36d02bedb15c7

these are for two call invites, basically for each call we get two invites one for customer and other for agent, in this call the Agent leg failed, From: "Agent" sip:Agent@FAR_END_IP;tag=BmjQeHH83pr0D

sachinlevel commented 1 year ago

@davehorton did you check the logs ? we are still facing this issue intermittently.

sachinlevel commented 1 year ago

@davehorton these are the required details.

Drachtio logs : https://gist.github.com/sachinlevel/c31a9b6c8337caf8f3f36d02bedb15c7 Freeswitch logs : https://gist.github.com/sachinlevel/aaca04ea2bb2d2bc04a0251474c8b009 Call-ID: 74efd8b3-a1a0-123c-f9a9-42010af06a08 that failed

Request object that failed : https://gist.github.com/sachinlevel/0a80a4e8b0b4e305cf90deadf7598de1

sachinlevel commented 1 year ago

ms.connectCaller(req, res) fails with Connection timeout, and on retrying using same ms.connectCaller(req, res) - using the same request it fails with "Response#send: final response already sent"

byoungdale commented 1 year ago

The "Connection timeout" in the application log you posted is usually because Freeswitch can't reach your Node.js/drachtio-fsmrf app. Can your freeswitch reach X-esl-outbound: VM_PRIVATE_IP:39183? If it cannot, you will get that "Connection timeout" error.

See a similar issue someone had here in a drachtio-fsmrf issue

byoungdale commented 1 year ago

But, the original issue you posted seemed to be something else that isn’t in the last logs you posted. So, there might be multiple issues going on here.

sachinlevel commented 1 year ago

@byoungdale for the issue that you mentioned for them no call is connecting, for us most of the calls are connecting roughly around 3 calls per 100 calls are failing to connect to FS. was wondering why would that happen, and this issue is intermittent , if you can guide us how can we correctly retry to connect to FS, that would be helpful since retrying same request is failing with "Response#send: final response already sent"