signalwire / freeswitch

FreeSWITCH is a Software Defined Telecom Stack enabling the digital transformation from proprietary telecom switches to a versatile software implementation that runs on any commodity hardware. From a Raspberry PI to a multi-core server, FreeSWITCH can unlock the telecommunications potential of any device.
https://freeswitch.com/#getting-started

mod_callcenter freezes #1062

Open altair86 opened 3 years ago

altair86 commented 3 years ago

Describe the bug mod_callcenter freezes

To Reproduce Using:

5 operators online and 5-20 calls per minute.

Expected behavior: I never encountered this behavior while the call center was running on RTMP + Flash.

Package version or git hash: I'm currently using FreeSWITCH Version 1.10.5-release+git~20200818T185121Z~25569c1631~64bit (git 25569c1 2020-08-18 18:51:21Z 64bit). But this error was also present on 1.8.7 and 1.9.0.

Trace logs: I have had 2 incidents on 1.10.5. First: the last status of the member is:

There are no [ERR] or [CRIT] entries in the log file.

Normal behavior:

When mod_callcenter is frozen:

And there is no activity in the members table until the entire FreeSWITCH is restarted. reload mod_callcenter says that mod_callcenter is busy.

reload -f mod_callcenter does not reload the module either: nothing happens, except that the cursor moves to the next line.

I think that something breaks in mod_callcenter (originate/bridge) at the moment it tries to connect a member to an agent, and mod_callcenter freezes. Only restarting the entire FreeSWITCH helps. Why can this happen?

There is no "normal" way to catch this event. It just happens and that's it. The system administrator responds only to complaints from operators that calls have disappeared somewhere.

mjerris commented 3 years ago

compare debug logs when it freezes to normal debug logs for call

altair86 commented 3 years ago

compare debug logs when it freezes to normal debug logs for call

I have DEBUG logging enabled, but as I said above, it doesn't show anything.

Last status from mod_callcenter: 2021-02-02 23:02:12.065179 [DEBUG] mod_callcenter.c:1180 Updated Agent operator1@company.com set state = Receiving
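Since the freeze leaves no [ERR] or [CRIT] entry, one possible way to notice it is to watch for mod_callcenter going silent in the debug log. Below is a minimal sketch of that idea, assuming log lines are shaped like the one quoted above. The function names and the 5-minute threshold are hypothetical, and a quiet period with no calls would also trigger it, so treat it as a hint rather than a reliable detector.

```python
import re
from datetime import datetime, timedelta

# Matches lines shaped like the one quoted above, for example:
# "2021-02-02 23:02:12.065179 [DEBUG] mod_callcenter.c:1180 Updated Agent ..."
# Assumption: the timestamp is the first field on the line.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}) \[\w+\] mod_callcenter\.c:\d+ "
)


def last_callcenter_activity(lines):
    """Return the timestamp of the newest mod_callcenter log line, or None."""
    latest = None
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            if latest is None or ts > latest:
                latest = ts
    return latest


def looks_frozen(lines, now, max_silence=timedelta(minutes=5)):
    """Hypothetical heuristic: True if mod_callcenter logged nothing for max_silence."""
    latest = last_callcenter_activity(lines)
    return latest is None or now - latest > max_silence
```

A cron job could feed the tail of the log file to `looks_frozen` during working hours and alert the administrator before operators start complaining.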

trdenton commented 3 years ago

@altair86

say that mod_callcenter is busy

could you provide the exact message?

a core file would be helpful, if you could generate one the next time it hangs: https://freeswitch.org/confluence/display/FREESWITCH/Debugging#Debugging-CollectingDebugInformationWhileFreeSWITCHIsRunning(Linux/Unix)

altair86 commented 3 years ago

@trdenton

sudo /usr/local/freeswitch/bin/fs_cli -x "reload mod_callcenter"
+OK Reloading XML
-ERR unloading module [Module in use.]
-ERR loading module [Module already loaded]

BUT:

sudo /usr/local/freeswitch/bin/fs_cli -x "reload -f mod_callcenter" |

Got nothing.

OK, thank you. Next time I will collect a core file.

Right now the FreeSWITCH uptime is 13 days.

altair86 commented 3 years ago

Hello. It happened again. (This time FreeSWITCH's uptime was 33 days.)

  1. This time I collected the core.trace1 dump.
  2. Then I restarted FreeSWITCH, but it didn't help.
  3. Then I cleared the members table: delete from public.members
  4. Then I collected the core.trace2 dump.
  5. I restarted FreeSWITCH again and everything started working fine.

tongxiao2013 commented 3 years ago

Hi, altair86 & crienzo, Your description is pretty clear, but I still want a little more information to dig into this issue.

First, when you say "mod_callcenter freezes", do you mean:
-- mod_callcenter does not work at all, OR
-- apart from operator1 & operator2 (which you mentioned above), the other 3 agents work fine?

Second, I need the information below (about 4 minutes of FS log, from 2 minutes before to 2 minutes after the freeze):

  1. Debug level log.
  2. SIP trace. You can turn it on with: >sofia global siptrace on
  3. The status of the members/agents/tiers DB tables at the moment of the freeze:
     -- select name, domain, type, contact, status, state, no_answer_count, calls_answered from agents;
     -- select * from members;
     -- select * from tiers;
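To make the table snapshot easy to grab in the middle of an incident, the three queries could be wrapped in a small script. This is only a sketch: it assumes the members and tiers queries are meant to select all columns, and it works with any Python DB-API connection. The `snapshot` helper is hypothetical, and the in-memory SQLite database with made-up members/tiers columns is just a stand-in for the real mod_callcenter database.

```python
import sqlite3  # stand-in for the demo only; use the driver for your real database

# The agents column list is taken from the request above.
QUERIES = {
    "agents": "select name, domain, type, contact, status, state, "
              "no_answer_count, calls_answered from agents",
    "members": "select * from members",
    "tiers": "select * from tiers",
}


def snapshot(conn):
    """Run each query on a DB-API connection and return {table: [row dicts]}."""
    result = {}
    for table, sql in QUERIES.items():
        cur = conn.cursor()
        cur.execute(sql)
        columns = [d[0] for d in cur.description]
        result[table] = [dict(zip(columns, row)) for row in cur.fetchall()]
        cur.close()
    return result


if __name__ == "__main__":
    # Demo schema: the members and tiers columns here are hypothetical.
    demo = sqlite3.connect(":memory:")
    demo.execute("create table agents (name, domain, type, contact, status, "
                 "state, no_answer_count, calls_answered)")
    demo.execute("create table members (uuid, state)")
    demo.execute("create table tiers (queue, agent)")
    for table, rows in snapshot(demo).items():
        print(table, len(rows), "rows")
```

Dumping the result to a timestamped file right before restarting would preserve the state that is otherwise lost once the members table is cleared.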

altair86 commented 3 years ago

@tongxiao2013 Hi, I can provide the rest only when it happens again.

Yes, it totally doesn't work, because:

That is because the agents can't continue to work.

  1. debug is on
  2. ok
  3. ok

tongxiao2013 commented 3 years ago

Sure, I will wait until it happens again.