FYI this is with UniMRCP r1401.
Original comment by cmrie...@gmail.com
on 4 Mar 2010 at 4:11
I noticed the app is much more stable when I disabled RTCP. It has processed 1600 calls and 986 TTS requests with no crashes.
This reminds me of the first crash. I noticed some mpf_timer activity going on after a session was terminated. Perhaps something didn't get cleaned up completely?
Original comment by cmrie...@gmail.com
on 4 Mar 2010 at 7:44
Chris, thanks for the detailed analysis.
I haven't had a chance to look into this more closely yet, but I wanted to suggest the same thing: disable RTCP. If that helps, then the problem is either in the processing of RTCP messages or in the timers. May I suggest enabling RTCP again, but disabling the RTCP receiver by setting rtcp-rx-resolution to 0:
<param name="rtcp-rx-resolution" value="0"/>
If the problem comes back, then the issue is in the timers.
I'm about to finalize my changes in trunk. After that, I plan to run several stress tests, and I will try with FS as well.
Original comment by achalo...@gmail.com
on 4 Mar 2010 at 8:25
So, it just crashed again without RTCP. I guess I'll try again with valgrind running.
Original comment by cmrie...@gmail.com
on 4 Mar 2010 at 8:46
Then the problem could be in the same timeouts discussed in Issue-72.
I noticed your comment on Jira:
> 1. The callback is not expected as the MRCP session was destroyed
If the callback is being called, it means the session hasn't been destroyed yet internally in the client stack.
Original comment by achalo...@gmail.com
on 4 Mar 2010 at 8:54
I found something fishy on the FS side... looks like a call terminates without
cleaning up the session. I believe this is the cause of the problem since my
stream->obj was allocated off of the call's pool.
Original comment by cmrie...@gmail.com
on 5 Mar 2010 at 1:04
Same here. I've managed to reproduce it twice today. I made about 4000 calls with SIPp -> FreeSWITCH -> UniMRCPServer and got the same backtrace. Analyzing the logs, I found a few suspicious things. See the filtered, TTS-9973-related part below.
4098010 2010-03-05 16:47:48.144169 [DEBUG] mod_unimrcp.c:622 (TTS-9973) audio queue created
4098011 2010-03-05 16:47:48.144169 [NOTICE] mrcp_client.c:549 Create MRCP Handle 0x84a82b0 [unimrcpserver-mrcp1]
4098012 2010-03-05 16:47:48.144169 [INFO] mrcp_client_session.c:142 Create Channel 0x84a82b0 <new>
4098015 2010-03-05 16:47:48.144169 [INFO] mrcp_client_session.c:398 Receive App Request 0x84a82b0 <new> [2]
4098017 2010-03-05 16:47:48.144169 [INFO] mrcp_client.c:901 Add MRCP Handle 0x84a82b0
4098018 2010-03-05 16:47:48.144169 [DEBUG] mrcp_client_session.c:1203 Dispatch Application Request 0x84a82b0 <new> [2]
4098019 2010-03-05 16:47:48.144169 [NOTICE] mrcp_client_session.c:718 Add Control Channel 0x84a82b0 <new@speechsynth>
4098020 2010-03-05 16:47:48.144169 [DEBUG] mrcp_client_session.c:762 Add RTP Termination 0x84a82b0 <new>
4098221 2010-03-05 16:47:48.184411 [DEBUG] mrcp_client_session.c:1039 On Termination Add 0x84a82b0 <new>
4098222 2010-03-05 16:47:48.184411 [DEBUG] mrcp_client_session.c:1039 On Termination Add 0x84a82b0 <new>
4098223 2010-03-05 16:47:48.184411 [INFO] mrcp_client_session.c:420 Send Offer 0x84a82b0 <new> [c:0 a:1 v:0]
4098364 2010-03-05 16:47:48.208263 [INFO] mrcp_client_session.c:158 Receive Answer 0x84a82b0 <0532c32ad6abe740> [c:0 a:1 v:0]
4098365 2010-03-05 16:47:48.208263 [DEBUG] mrcp_client_session.c:1160 Modify Termination 0x84a82b0 <0532c32ad6abe740>
Pay attention to the delay here, roughly 10 seconds. You can see the same timed-out request, but the big question is what caused it.
4115033 2010-03-05 16:47:58.152993 [ERR] mod_unimrcp.c:963 (TTS-9973) Timed out waiting for channel to be ready
4116285 2010-03-05 16:47:58.328997 [DEBUG] mrcp_client_session.c:1043 On Termination Modify 0x84a82b0 <0532c32ad6abe740>
The expected "On Termination Modify" was eventually received. I suspect the MPF callback was blocked from the mod_unimrcp context; at least, I can see that all the other threads of the client stack were busy processing other messages.
4116286 2010-03-05 16:47:58.328997 [INFO] mrcp_client_session.c:459 Raise App Response 0x84a82b0 <0532c32ad6abe740> [2] SUCCESS [0]
4116287 2010-03-05 16:47:58.328997 [DEBUG] mod_unimrcp.c:1752 (TTS-9973) SYNTHESIZER channel is ready, codec = LPCM, sample rate = 8000
4116288 2010-03-05 16:47:58.328997 [DEBUG] mod_unimrcp.c:1469 (TTS-9973) CLOSED ==> READY
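A delay like this is easier to spot mechanically than by eye. The following is only a minimal sketch, not part of FreeSWITCH or UniMRCP: it assumes each log entry sits on one line in the format shown above (sequence number, timestamp, level, message), and the names `LOG_LINE`, `find_gaps` and `SAMPLE` are made up for illustration.

```python
import re
from datetime import datetime

# Assumed entry layout: "<seq> <YYYY-MM-DD HH:MM:SS.ffffff> [LEVEL] <message>"
LOG_LINE = re.compile(
    r"^(?P<seq>\d+) "
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}) "
    r"\[(?P<level>\w+)\] (?P<msg>.*)$"
)


def find_gaps(lines, threshold_s=1.0):
    """Return (gap_seconds, previous_msg, next_msg) for every pair of
    consecutive entries that are more than threshold_s seconds apart."""
    gaps = []
    prev = None
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m is None:
            continue  # skip anything that is not a complete log entry
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
        if prev is not None:
            delta = (ts - prev[0]).total_seconds()
            if delta > threshold_s:
                gaps.append((delta, prev[1], m.group("msg")))
        prev = (ts, m.group("msg"))
    return gaps


# Four entries copied from the TTS-9973 excerpt, one per line.
SAMPLE = [
    "4098364 2010-03-05 16:47:48.208263 [INFO] mrcp_client_session.c:158 Receive Answer 0x84a82b0 <0532c32ad6abe740> [c:0 a:1 v:0]",
    "4098365 2010-03-05 16:47:48.208263 [DEBUG] mrcp_client_session.c:1160 Modify Termination 0x84a82b0 <0532c32ad6abe740>",
    "4115033 2010-03-05 16:47:58.152993 [ERR] mod_unimrcp.c:963 (TTS-9973) Timed out waiting for channel to be ready",
    "4116285 2010-03-05 16:47:58.328997 [DEBUG] mrcp_client_session.c:1043 On Termination Modify 0x84a82b0 <0532c32ad6abe740>",
]

for gap, before, after in find_gaps(SAMPLE):
    print(f"{gap:.3f}s between '{before}' and '{after}'")
```

Run against a log already filtered down to one handle, this should flag the jump between "Modify Termination" and the timeout error while ignoring the normal sub-second steps.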
BTW, I have been running a few other tests in parallel and haven't seen this issue with other products yet (I haven't tried Asterisk, though). So I suppose it's FS related.
Hope this helps a bit.
Original comment by achalo...@gmail.com
on 5 Mar 2010 at 1:39
I have a fix to the FS core that I'm going to try today. Found some cases where a FS call can terminate without cleaning up its mod_unimrcp session.
Original comment by cmrie...@gmail.com
on 5 Mar 2010 at 2:05
I'm going to make a change to mod_unimrcp so that it does not time out when destroying the session. It will be better to let it stay stuck. At least then I can attach to the process and see what is going on.
Plus, we are now seeing how bad things can get if I let freeswitch destroy its
session without cleaning up the MRCP session.
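To make the trade-off concrete, here is a toy model of the two behaviours. It is a sketch only: mod_unimrcp is written in C and none of this is its code. The `Session` class and its fields are hypothetical, and the "late callback" counter stands in for what would be an access to freed memory in C.

```python
import threading


class Session:
    """Toy stand-in for an MRCP session owned by a call (hypothetical)."""

    def __init__(self):
        self.terminated = threading.Event()  # set when the stack reports teardown complete
        self.released = False                # True once the owner has freed its resources
        self.late_callbacks = 0              # callbacks that arrived after release

    def on_terminate(self):
        # Called from the stack's thread when teardown finishes.
        if self.released:
            # In C this would touch freed memory; here we only count it.
            self.late_callbacks += 1
        self.terminated.set()

    def destroy(self, timeout=None):
        # timeout=None blocks until on_terminate() runs; a number gives up early.
        finished = self.terminated.wait(timeout)
        self.released = True
        return finished


# With a timeout: the owner gives up and releases, then the callback arrives late.
timed = Session()
timed_finished = timed.destroy(timeout=0.01)
timed.on_terminate()

# Without a timeout: the owner waits until the callback has run.
patient = Session()
threading.Timer(0.05, patient.on_terminate).start()
patient_finished = patient.destroy()

print(timed_finished, timed.late_callbacks)
print(patient_finished, patient.late_callbacks)
```

The version without a timeout can hang forever if the callback never comes, but a hung thread can be inspected with a debugger, whereas a late callback into released memory just crashes somewhere unrelated.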
Original comment by cmrie...@gmail.com
on 5 Mar 2010 at 3:52
Agreed, as it's indeed a significant error condition.
Original comment by achalo...@gmail.com
on 5 Mar 2010 at 4:27
Hello Arsen,
Tony checked in a fix to the FS core to prevent my crash and I checked in a fix to mod_unimrcp to no longer allow a timeout when terminating the session.
Could you send me a copy of your test setup? SIPp script + command line args, FS dialplan with TTS request, and UniMRCP server config. I'll try to reproduce in the lab.
Original comment by cmrie...@gmail.com
on 9 Mar 2010 at 7:00
Hello Chris,
I'm glad the issue has been fixed, and I will try to re-test it on my end in the coming days.
At the moment, I have no access to the testbed, so I can't provide you with the exact setup. There was probably nothing extraordinary about it. As far as I remember, I tried 5 calls per second with a 3-second call duration and a typical FS TTS dialplan.
Original comment by achalo...@gmail.com
on 9 Mar 2010 at 8:36
Just to let you know that I have also implemented Chris's fix in app_unimrcp.
Original comment by thirion...@gmail.com
on 10 Mar 2010 at 8:11
FYI,
I have made 50000 calls with FreeSWITCH trunk (r16966) today ... no more crashes! It crashed on about 4000 calls before.
./sipp -r 5 -rp 1000 -d 2500
<extension name="unispeak">
  <condition field="destination_number" expression="^7004$">
    <action application="answer"/>
    <action application="set" data="tts_engine=unimrcp"/>
    <action application="speak" data="Hello world from FreeSWITCH"/>
  </condition>
</extension>
The same is true for Asterisk. I have tried the latest app_unimrcp (r1579) and
reached the same 50000 calls without any noticeable issues using MRCPSynth.
I'll try to run more tests and document the setup when possible.
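For anyone sizing a similar run, the SIPp options used for the FreeSWITCH test (-r 5 -rp 1000 -d 2500) translate into a rough load profile as sketched below. This is back-of-the-envelope arithmetic only: it treats the -d pause as the whole call duration and ignores signalling and synthesis time, and `load_profile` is a made-up helper, not a SIPp feature.

```python
def load_profile(calls_per_period, period_ms, call_duration_ms, total_calls):
    """Estimate call rate, steady-state concurrency and total run time.

    calls_per_period / period_ms correspond to SIPp's -r / -rp options,
    call_duration_ms to -d (assumed here to be the full call length).
    """
    rate_per_s = calls_per_period / (period_ms / 1000.0)
    # Little's law: concurrent calls = arrival rate * time each call stays up.
    concurrent = rate_per_s * (call_duration_ms / 1000.0)
    run_time_s = total_calls / rate_per_s
    return rate_per_s, concurrent, run_time_s


rate, concurrent, run_time = load_profile(5, 1000, 2500, 50000)
print(f"{rate} calls/s, about {concurrent} concurrent calls, "
      f"{run_time / 3600:.1f} hours for 50000 calls")
```

Under these assumptions the test keeps only about a dozen calls up at once, so the earlier crashes were more likely a matter of call count and timing than of high concurrency.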
Original comment by achalo...@gmail.com
on 11 Mar 2010 at 2:17
Chris, I have added you as a committer to the project.
http://code.google.com/p/unimrcp/people/list
So you have privileged access to svn, wiki, tracker, etc. This should allow us to collaborate more efficiently, I guess.
Glad to work with you!
Arsen
Original comment by achalo...@gmail.com
on 11 Mar 2010 at 2:26
Thanks for the vote of confidence, Arsen. :)
I am currently ramping up the load on my production server and I haven't seen any issues since the last round of fixes, either. Once FreeSWITCH finishes their 1.0.5 release, I'll resume work on getting mod_unimrcp up to date with the latest rev of UniMRCP.
Original comment by cmrie...@gmail.com
on 12 Mar 2010 at 2:01
Well, thanks for the update, Chris.
So I consider the issue fixed.
Original comment by achalo...@gmail.com
on 12 Mar 2010 at 12:39
Original comment by thirion...@gmail.com
on 13 Dec 2010 at 12:29
Original issue reported on code.google.com by
cmrie...@gmail.com
on 4 Mar 2010 at 4:04