Closed GoogleCodeExporter closed 9 years ago
Thanks for the issue report. If you experienced an issue, which is more or less
reproducibly, it makes sense to report it.
Running demo synth with your modifications, I've not got Sofia's status phrase
garbled, but I observed several issue, which should be separated.
1. Concurrency and delays caused by Sleep()
You should not use Sleep() in a thread serving multiple sessions, channels.
This will
definitely not work the way you expected. However, if this helps identify
problems in
client stack, I never mind.
2. SPEAK-COMPLETE event and double mrcp_application_session_terminate() call.
Please note, mrcp_application_session_terminate() MUST be called only once,
afterward, as soon as session termination has been acknowledged by client stack,
session must be destroyed.
In fact, double mrcp_application_session_terminate() will be sent, if
SPEAK-COMPLETE
event is received, and this definitely will cause memory related issues and
crashes.
I believe you met this issue, as this is the only case, when
sofia_session->mutex can
be NULL. Probably you should do nothing on SPEAK-COMPLETE in the modified
scenario.
3. Active sessions and quit
Also you should not "quit" (destroy demo framework), while there are active
sessions
in play. I guess this could be other problem you experienced with NULL pointer
in
demo framework task.
4. Garbage in Sofia's status phrase string
I assume, this is the actual issue to reproduce. I wonder, if this will be still
reproducible to you, if you ensure point 2 and 3 never happens.
Also please indicate version of Sofia you use. I guess, it's 1.12.10
Original comment by achalo...@gmail.com
on 15 Jun 2009 at 8:03
Thanks for your answer. I thought I managed to cut and paste the whole log from
the
dos box, but apparently this was not the case. Yes, it is sophia sip 1.12.10
(with
the "improved" tport.c)
ad (1) I had some doubts about the sleeps. I was going to write about this, but
then
since you suggested yourself to do this (in
"http://unimrcp.googlegroups.com/web/
demo_synth_application.c") I continued. So I decided to make short sleeps, and
since
I knew that streaming the generated TTS speech would take considerably longer
than
those two sleeps, I supposed it would not be too bad. (I must say I initially
thought that each session would have its own tasks/threads, and that multiple
sessions would not have trouble with each other sessions' sleeps, although I
did not
dive too deep into the code and logging to figure this out. (I should add an
(optional) thread-id to the log function.)
The alternative solution would be to implement a "delayed send MRCP request,
but
conditionally to the current state", but for the purpose of my test (test my
plugin)
that was a little too much.
2. SPEAK-COMPLETE event. I'm not sure I fully understand the scope of what you
are
saying. In my test I would not expect to see a SPEAK-COMPLETE event at all.
And additionally, I was hoping they were mutually exclusive: a SPEAK-COMPLETE
should
not arrive (or at least be processed) if a stop response was received (and vice
versa).
In my case, the stop complete response should arrive well before the
SPEAK-COMPLETE
event. Indeed, if I would start a lot of sessions, the sleeps might get the
client
into trouble. In my log file I don't see a SPEAK-COMPLETE event ?
Note that there is a kind of race condition here. You cannot be sure what is
first:
the stop from the client or the speak-complete event from the server.
I don't think I see this race condition in my log.
I don't understand the double mrcp_application_session_terminate() you see in
the
log. In my logging the session pointers for different sessions (one behind the
finished/destroyed other) is the same, but the sessions are different.
I guess when I used your original pause/resume scenario I met this double
mrcp_application_session_terminate() issue, and experienced the crashes with
the
null pointer (sofia->mutex). That's exactly why I commented out the
mrcp_application_channel_remove(session,channel);
I'm not sure what my preference is - as a developer i would want my program to
crash, so that I can improve it - but if i would run this in production i would
probably prefer that a problem in one session would not have an adverse effect
on
the other sessions (or channels). It may turn out to be very difficult to
implement
a proper state machine (client) in this multi-tasked, multi-session,
multi-channel,
networked, time-constrained and real time environment. So i would prefer the
"api"
to be kind of robust against some (hopefully most) cases - even if they are
developers faults.
(3) as mentioned before, it may be hard to make a proper state machine: one may
have
to do some timed garbage collection on sessions/channels as you can never be
sure
the server terminates a session. I would prefer to be able to stop and start
the
client without a crash taking place... I know this error processing is where it
really becomes hard, and I realise you just supplied a client demo, not a
finished
application.
I commented out the block of code after the resume state (so one delay less, no
stop
message to be sent and only one
mrcp_application_channel_remove(session,channel) -
after the speak complete event) and I still see the corruption. No crashed
though.
It is occurs less frequently - though it's not so hard to be a succesful monkey.
Thanks for all your help. Perhaps more later this week...
Original comment by garmt.no...@gmail.com
on 17 Jun 2009 at 9:24
Garmt,
I assume you are on vocation, but when you'll be back, please retest the same
with
the latest release. I suspect this issue relates to Issue-29 and APR pools
thread safety.
Original comment by achalo...@gmail.com
on 23 Jul 2009 at 5:02
Any updates? Is it reproducible with 0.7 release?
Original comment by achalo...@gmail.com
on 9 Sep 2009 at 6:32
Last week I tried to reproduce the problem (without upgrading either trunk or
0.7
release or APR). I failed to reproduce the problem (maybe I used release
instead of
debug executables !?). I will give it another try soon.
Original comment by garmt.no...@gmail.com
on 10 Sep 2009 at 11:45
Thanks for the update, Garmt.
Original comment by achalo...@gmail.com
on 10 Sep 2009 at 12:57
Summarizing the discussion below
http://groups.google.com/group/unimrcp/browse_thread/thread/8d756e7f15d37f98?hl=
en
I admit, this issue indeed existed, but should be already fixed in the trunk.
Related changes are: r1155, r1170, r1135.
Original comment by achalo...@gmail.com
on 12 Oct 2009 at 5:01
Original comment by achalo...@gmail.com
on 27 Apr 2010 at 12:02
Original issue reported on code.google.com by
garmt.no...@gmail.com
on 13 Jun 2009 at 9:09Attachments: