drachtio / drachtio-freeswitch-modules

A collection of open-sourced freeswitch modules that I use in various drachtio applications
MIT License

Some mod_dialogflow sessions are not freed #58

Closed: drwatson32 closed this issue 3 years ago

drwatson32 commented 3 years ago

Dear Dave, first of all, thanks for your awesome software. Could you please help me shed some light on the following issue?

My test setup: drachtio-dialogflow-phone-gateway@master, drachtio-server@master and drachtio-freeswitch-base@freeswitch-v1.10.1-full, all running in Docker containers on Debian 10.

Under some load (1-2 CPS, 100+ calls per hour) I see leaked calls in fsmrf (using fs_cli -x "show calls"). They cannot be killed, by the way (the attempt reports no channel for this uuid).

The difference between good and bad calls is in the lines after "google_dialogflow_session_cleanup: sending writesDone..".

In the good case I see:

42e138f2-9583-11eb-aab6-d94b44fe44fd 2021-04-04 20:21:55.620216 [INFO] google_glue.cpp:428 google_dialogflow_session_cleanup: waiting for read thread to complete
42e138f2-9583-11eb-aab6-d94b44fe44fd 2021-04-04 20:21:55.620216 [INFO] google_glue.cpp:431 google_dialogflow_session_cleanup: read thread completed
42e138f2-9583-11eb-aab6-d94b44fe44fd 2021-04-04 20:21:55.620216 [INFO] google_glue.cpp:439 google_dialogflow_session_cleanup: Closed google session
...
42e138f2-9583-11eb-aab6-d94b44fe44fd 2021-04-04 20:21:55.620216 [DEBUG] switch_core_media_bug.c:1289 Removing BUG from sofia/drachtio_mrf/nobody@drachtio:5060
42e138f2-9583-11eb-aab6-d94b44fe44fd 2021-04-04 20:21:55.620216 [DEBUG] switch_core_state_machine.c:848 (sofia/drachtio_mrf/nobody@drachtio:5060) Callstate Change ACTIVE -> HANGUP
42e138f2-9583-11eb-aab6-d94b44fe44fd 2021-04-04 20:21:55.620216 [DEBUG] switch_core_state_machine.c:850 (sofia/drachtio_mrf/nobody@drachtio:5060) State HANGUP
42e138f2-9583-11eb-aab6-d94b44fe44fd 2021-04-04 20:21:55.620216 [DEBUG] mod_sofia.c:460 Channel sofia/drachtio_mrf/nobody@drachtio:5060 hanging up, cause: NORMAL_CLEARING
42e138f2-9583-11eb-aab6-d94b44fe44fd 2021-04-04 20:21:55.620216 [DEBUG] mod_sofia.c:514 Sending BYE to sofia/drachtio_mrf/nobody@drachtio:5060

In the bad case, however, I don't see any lines after "google_dialogflow_session_cleanup: sending writesDone..".
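The good/bad distinction described above can be checked mechanically. The following is a minimal sketch, assuming that every relevant log line starts with the session UUID (as in the excerpt) and that a "bad" call is one that logged "sending writesDone" but never "read thread completed". The function name find_stuck_sessions is made up for this illustration and is not part of FreeSWITCH or the drachtio modules.

```python
import re
from typing import Iterable, List

# Log lines quoted in this thread begin with the session UUID followed by a space.
UUID_RE = re.compile(
    r"^([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})\s"
)

# Substrings taken from the messages quoted in the issue.
CLEANUP_STARTED = "google_dialogflow_session_cleanup: sending writesDone"
CLEANUP_FINISHED = "google_dialogflow_session_cleanup: read thread completed"


def find_stuck_sessions(lines: Iterable[str]) -> List[str]:
    """Return UUIDs that started cleanup but never reported the read thread done.

    Hypothetical helper: the classification rule is an assumption based on
    the good/bad description in the issue, not a documented contract.
    """
    started = {}      # dict keeps first-seen order of UUIDs
    finished = set()
    for line in lines:
        match = UUID_RE.match(line)
        if match is None:
            continue  # not a per-session line
        uuid = match.group(1)
        if CLEANUP_STARTED in line:
            started.setdefault(uuid, True)
        elif CLEANUP_FINISHED in line:
            finished.add(uuid)
    return [uuid for uuid in started if uuid not in finished]
```

Passing an open log file (any iterable of lines) to find_stuck_sessions yields the session ids worth comparing against the output of fs_cli -x "show calls". A session that was still mid-cleanup when the log was captured would also be reported, so the result is a list of candidates rather than proof of a leak.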

And could you please take a look at https://github.com/drachtio/drachtio-freeswitch-modules/commit/d0f10ffff8e32a998c04b164235da3e4fb8c6a35#diff-a28b7b42e8e1c7eaccc685c261700be5fe64e57ef9a67e49dea236ed9eb1653d.

Is returning SWITCH_STATUS_SUCCESS from the hangup hook in that commit accidental (the commit itself is related to AWS)? Or should I use the v0.2.9 release instead of master?

davehorton commented 3 years ago

Would it be possible to send me a full freeswitch log -- with SIP tracing on, and at DEBUG level like you have above -- which contains at least one example each of a "good" and a "bad" call? I would like to be able to follow the full call trace. It's OK if there are a lot of other calls in there as well, as long as you can give me the session id for the good and bad calls. Feel free to send it to me at daveh@drachtio.org rather than posting publicly if you like.

drwatson32 commented 3 years ago

Thanks, Dave. Sure, I will send the logs by email. Currently the system has few calls, but I expect the next peak in 8 hours or so.

Btw, have you ever seen something like this under load?

[ 686.666867] TCP: request_sock_TCP: Possible SYN flooding on port 8021. Sending cookies. Check SNMP counters.
[ 8706.747294] traps: freeswitch[1637] general protection ip:7fc76ad4c529 sp:7fc70c488980 error:0 in libc-2.24.so[7fc76ad18000+195000]
[53975.197840] hrtimer: interrupt took 1212768 ns
[68482.385664] traps: freeswitch[8181] general protection ip:7f5edfdb8529 sp:7f5e6acae980 error:0 in libc-2.24.so[7f5edfd84000+195000]
[88798.891334] traps: freeswitch[13537] general protection ip:7f63684cc529 sp:7f62c42fe980 error:0 in libc-2.24.so[7f6368498000+195000]
[92671.573695] freeswitch[18456]: segfault at 48 ip 00007fd3545d5666 sp 00007fd2ddc40770 error 4 in mod_dialogflow.so[7fd3545cc000+1b000]
[92671.576732] Code: f9 ff ff 48 89 85 10 f9 ff ff 48 83 c0 10 48 89 85 28 f9 ff ff 0f 1f 80 00 00 00 00 48 8b 85 20 f9 ff ff 48 8b b5 58 f9 ff ff <48> 8b 40 48 48 8d 78 10 48 8b 40 10 ff 50 18 84 c0 88 85 30 f9 ff
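The kernel messages above give both the faulting instruction pointer and the base address of the mapped module, so the offset inside the module is just ip minus base. The sketch below does that arithmetic on two of the quoted lines. The helper name fault_offset and the regular expression are assumptions written for this illustration, and the result is only meaningful for symbol lookup if the mapping shown in brackets starts at the beginning of the shared object.

```python
import re
from typing import Optional, Tuple

# Matches both "ip:<hex>" (traps lines) and "ip <hex>" (segfault lines),
# followed later by "in <module>[<base>+<size>]".
FAULT_RE = re.compile(
    r"\bip[: ]([0-9a-f]+)\b.*?\bin (\S+)\[([0-9a-f]+)\+([0-9a-f]+)\]"
)


def fault_offset(line: str) -> Optional[Tuple[str, int]]:
    """Return (module, offset of the faulting ip inside the module), or None.

    Hypothetical helper for illustration only.
    """
    match = FAULT_RE.search(line)
    if match is None:
        return None
    ip = int(match.group(1), 16)
    module = match.group(2)
    base = int(match.group(3), 16)
    size = int(match.group(4), 16)
    offset = ip - base
    if not 0 <= offset < size:
        return None  # ip does not fall inside the reported mapping
    return module, offset


# Two lines copied from the kernel log quoted in this thread.
SAMPLES = [
    "traps: freeswitch[1637] general protection ip:7fc76ad4c529 "
    "sp:7fc70c488980 error:0 in libc-2.24.so[7fc76ad18000+195000]",
    "freeswitch[18456]: segfault at 48 ip 00007fd3545d5666 "
    "sp 00007fd2ddc40770 error 4 in mod_dialogflow.so[7fd3545cc000+1b000]",
]

for sample in SAMPLES:
    result = fault_offset(sample)
    if result is not None:
        module, offset = result
        print(f"{module} {offset:#x}")
# libc-2.24.so 0x34529
# mod_dialogflow.so 0x9666
```

Applying the same subtraction to the other two libc lines also gives 0x34529, so all three general protection traps hit the same libc instruction. With debug symbols available, an offset like 0x9666 could be passed to a tool such as addr2line against mod_dialogflow.so to locate the source line. The faulting address 48 combined with the marked instruction bytes <48> 8b 40 48 is consistent with a read at offset 0x48 through a null pointer, though the log alone does not confirm that.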

Is there a recommended setup as of 2021? For example: Debian 9, drachtio-freeswitch-base@v1.10, drachtio-freeswitch-modules@v0.2.9, drachtio-server@0.8.8 and drachtio-dialogflow-phone-gateway@master, all without Docker?

davehorton commented 3 years ago

I've not seen the error about SYN flood before. I would make sure outside traffic can't reach freeswitch port 8021.

I much prefer to run on VMs rather than inside docker, myself. I build freeswitch 1.10.5 on debian 10 for my own purposes:

drwatson32 commented 3 years ago

Thanks, Dave, I will try to convert the suggested playbook. Btw, I've sent the logs to your email, since they contain sensitive data.

drwatson32 commented 3 years ago

It was related to https://github.com/drachtio/drachtio-dialogflow-phone-gateway. I slightly rewrote it to close sessions in the different hangup cases, and now the problem is gone.
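The actual changes are not shown in this thread. As a generic illustration of the idea only (closing the session on every hangup path, exactly once), here is a small sketch. It is written in Python for brevity even though the gateway app is a Node.js project, and every name in it (CallSession, close_dialogflow, the on_* handlers) is hypothetical rather than taken from the app.

```python
from typing import Callable


class CallSession:
    """Hypothetical stand-in for one call leg plus its Dialogflow session."""

    def __init__(self, close_dialogflow: Callable[[], None]) -> None:
        # close_dialogflow is an assumed callback that releases the
        # media-server side of the session.
        self._close_dialogflow = close_dialogflow
        self._closed = False

    def close(self) -> bool:
        """Release the session once; repeated calls are no-ops."""
        if self._closed:
            return False
        self._closed = True
        self._close_dialogflow()
        return True

    # Every termination path funnels into the same idempotent close().
    def on_caller_hangup(self) -> None:
        self.close()

    def on_dialog_end(self) -> None:
        self.close()

    def on_error(self, error: Exception) -> None:
        self.close()
```

Because close() is guarded by a flag, it does not matter which hangup case fires first or whether several fire for the same call; the underlying session is released exactly once and no path can skip it.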

davehorton commented 3 years ago

Would you care to make a PR for your changes to that app?

drwatson32 commented 3 years ago

Sure, I will do it in the repo for the app.