Starting regression tests; it will take some days until I find the change that
causes this. But looking at all the diffs between r500 and r510, I guess it
has to be something in the MultiThreadedConnector changes in r510.
Original comment by daniel.e...@gmail.com
on 4 May 2013 at 7:03
While testing, I found out that r510 does not reconnect, while r509 does.
I started streaming, then shut down icecast, waited some seconds, and fired
icecast back up. Result:
r509:
04-May-2013 09:09:20 Exception caught in BufferedSink :: write3
04-May-2013 09:09:20 MultiThreadedConnector :: sinkThread reconnecting 0
04-May-2013 09:09:20 couldn't write all from encoder to underlying sink 1210
04-May-2013 09:09:21 MultiThreadedConnector :: sinkThread reconnecting 0
[...]
04-May-2013 09:09:27 MultiThreadedConnector :: sinkThread reconnecting 0
04-May-2013 09:09:28 HTTP/1.0 200
and I'm back on air.
r510:
04-May-2013 08:58:00 Exception caught in BufferedSink :: write3
04-May-2013 08:58:00 MultiThreadedConnector :: sinkThread reconnecting 0
04-May-2013 08:58:00 couldn't write all from encoder to underlying sink 970
and there it sits and looks at the server with oogly eyes and does not
reconnect or exit.
Again pointing to the MultiThreadedConnector changes.
Original comment by daniel.e...@gmail.com
on 4 May 2013 at 8:05
I have built r510 without the MultiThreadedConnector changes. It has now been
running for 4 hours without any issue. It does not lose the connection without
a reason, and it reconnects when there really is a reason for a connection
loss (like an icecast restart).
Attached is a patch for r510 that reverts only the MultiThreadedConnector
related changes.
Original comment by daniel.e...@gmail.com
on 4 May 2013 at 8:57
The patch reverting the regression has been applied in r514. Thanks!
We are investigating the deadlock issue.
Original comment by rafael2k...@gmail.com
on 14 May 2013 at 3:16
Yep, the MultiThreadedConnector had a problem.
I tested the algorithm in a separate program, and a race could indeed happen.
Moving one pthread_mutex_lock() fixes the issue: we should hold the lock on
'mutex_done' before letting the consumers run.
Line 262:
pthread_cond_broadcast(&cond_start); // kick the waiting consumers to look again
pthread_mutex_lock(&mutex_done);     // LOCK early to prevent missing a change of the 'done' condition variable
pthread_mutex_unlock(&mutex_start);  // UNLOCK, release the consumers' cond variable, now they can run
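For illustration, a complete round of this hand-off could look like the
following minimal sketch. This is not the actual DarkIce source; NUM_CONSUMERS,
round_no, done_count and the function names are made up for the example. The
point is the same as above: mutex_done is locked before mutex_start is
released, so no consumer can report 'done' in the window between the broadcast
and the producer's wait on cond_done.

#include <pthread.h>

#define NUM_CONSUMERS 2

static pthread_mutex_t mutex_start = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond_start  = PTHREAD_COND_INITIALIZER;
static pthread_mutex_t mutex_done  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond_done   = PTHREAD_COND_INITIALIZER;

static unsigned long round_no   = 0;  /* bumped for every new buffer        */
static int           done_count = 0;  /* consumers finished with this round */

/* Producer: hand one buffer to all consumers and wait until they are done. */
static void distribute_one_buffer(void)
{
    pthread_mutex_lock(&mutex_start);
    ++round_no;                           /* publish new work                */
    pthread_mutex_lock(&mutex_done);      /* LOCK early: nobody can flag     */
    done_count = 0;                       /* 'done' until we wait for it     */
    pthread_cond_broadcast(&cond_start);  /* kick the waiting consumers      */
    pthread_mutex_unlock(&mutex_start);   /* now they can run                */

    while (done_count < NUM_CONSUMERS)    /* wait for all consumers          */
        pthread_cond_wait(&cond_done, &mutex_done);
    pthread_mutex_unlock(&mutex_done);
}

/* Consumer: wait for a new round, process it, report back. */
static void *consumer_thread(void *arg)
{
    unsigned long seen = 0;
    (void) arg;
    for (;;) {
        pthread_mutex_lock(&mutex_start);
        while (round_no == seen)          /* predicate guards the wait       */
            pthread_cond_wait(&cond_start, &mutex_start);
        seen = round_no;
        pthread_mutex_unlock(&mutex_start);

        /* ... write the current buffer to this thread's sink here ... */

        pthread_mutex_lock(&mutex_done);
        ++done_count;
        pthread_cond_signal(&cond_done);
        pthread_mutex_unlock(&mutex_done);
    }
    return NULL;
}

int main(void)
{
    pthread_t threads[NUM_CONSUMERS];
    for (int i = 0; i < NUM_CONSUMERS; ++i)
        pthread_create(&threads[i], NULL, consumer_thread, NULL);
    for (int i = 0; i < 3; ++i)           /* distribute a few dummy buffers  */
        distribute_one_buffer();
    return 0;                             /* process exit ends the consumers */
}

Note that the consumer wait here is guarded by a round counter, which also
covers the case of a consumer that is not yet waiting when the producer
broadcasts; that case is discussed in the next comment.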
Original comment by oetelaar.automatisering
on 14 May 2013 at 9:39
There is actually another problem with this.
The consumer threads might not yet be waiting on cond_start when the producer
changes the state under mutex_start and broadcasts, so they can miss the
wakeup.
I will try to fix this too, very soon.
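The underlying hazard is the classic lost wakeup: pthread_cond_broadcast()
only wakes threads that are already blocked in pthread_cond_wait(), so a
consumer that arrives late misses the signal unless the wait is guarded by a
shared predicate changed under the same mutex. A compile-able sketch with
made-up names (start_pending and the three helper functions are assumptions,
not the DarkIce code):

#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t mutex_start   = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond_start    = PTHREAD_COND_INITIALIZER;
static bool            start_pending = false;   /* shared predicate */

/* Racy consumer: if the producer broadcasts before this thread reaches
   pthread_cond_wait(), the wakeup is lost and the thread blocks forever. */
static void wait_for_start_racy(void)
{
    pthread_mutex_lock(&mutex_start);
    pthread_cond_wait(&cond_start, &mutex_start);  /* no predicate checked */
    pthread_mutex_unlock(&mutex_start);
}

/* Safe consumer: a late arrival still sees start_pending == true and does
   not block at all. */
static void wait_for_start_safe(void)
{
    pthread_mutex_lock(&mutex_start);
    while (!start_pending)
        pthread_cond_wait(&cond_start, &mutex_start);
    pthread_mutex_unlock(&mutex_start);
}

/* Producer: change the predicate and signal while holding the mutex. */
static void announce_start(void)
{
    pthread_mutex_lock(&mutex_start);
    start_pending = true;
    pthread_cond_broadcast(&cond_start);
    pthread_mutex_unlock(&mutex_start);
}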
Original comment by oetelaar.automatisering
on 15 May 2013 at 7:50
Original issue reported on code.google.com by
daniel.e...@gmail.com
on 3 May 2013 at 6:48