skyscreamer / nevado

A JMS driver for Amazon SQS.
http://nevado.skyscreamer.org/
Apache License 2.0

Bug fix: Consumer thread called join() on itself causing deadlock #78

Closed jesper2610 closed 10 years ago

jesper2610 commented 10 years ago

When you use Nevado with Spring, there is a message listener container called SimpleMessageListenerContainer whose onException() method is invoked if an exception is thrown while trying to receive new messages. In an attempt to recover from whatever caused the error, it tries to close down the consumer and then recreate it.

It ends up calling the method stop() in AsyncConsumerRunner and this is where the bug is. It calls runner.join() to join the consumer thread with the current thread but in this case they are the same thread. The result is that the thread gets hung and never recovers from this condition. That also means Spring never gets a chance to recreate the consumer since that is done afterwards and the thread that's supposed to do it is waiting forever. It's like a deadlock except it involves only one thread instead of two.
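The self-join described above can be reproduced and guarded against in a few lines. The following is a minimal sketch, not Nevado's actual code: the `Runner` class, its fields, and `SelfJoinGuardDemo` are hypothetical stand-ins for AsyncConsumerRunner, and the guard shown (skipping `join()` when `stop()` is called from the consumer thread itself) is one possible fix, not necessarily the one that was merged.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class SelfJoinGuardDemo {

    /** Hypothetical stand-in for AsyncConsumerRunner; names are assumptions. */
    static class Runner implements Runnable {
        private final Thread thread = new Thread(this, "consumer");
        private final CountDownLatch finished = new CountDownLatch(1);
        private volatile boolean running = true;

        void start() {
            thread.start();
        }

        void stop() throws InterruptedException {
            running = false;
            // Guard: a thread that joins itself waits forever for its own exit.
            if (Thread.currentThread() != thread) {
                thread.join();
            }
        }

        @Override
        public void run() {
            try {
                while (running) {
                    // Simulate a receive failure whose handler calls stop()
                    // from inside the consumer thread, as in the report.
                    stop();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                finished.countDown();
            }
        }

        boolean awaitFinished(long millis) throws InterruptedException {
            return finished.await(millis, TimeUnit.MILLISECONDS);
        }
    }

    /** Returns true if the consumer thread exits within two seconds. */
    static boolean demo() {
        Runner runner = new Runner();
        runner.start();
        try {
            return runner.awaitFinished(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("consumer thread finished: " + demo());
    }
}
```

With the guard in place, the consumer thread returns from `stop()` and exits its loop. If the `if` check is removed, the thread blocks in `join()` on itself and `demo()` times out.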

In short, if there is any kind of error receiving messages, it will stop listening for messages and the only way to recover is to restart the application server.

Here is the thread dump taken at the moment when Spring Framework calls Nevado to stop the queue consumer. As you can see, it starts in the runner thread and ends up calling stop() in the same thread.

Thread-4@10553 daemon, prio=5, in group 'main', status: 'RUNNING'
    at org.skyscreamer.nevado.jms.AsyncConsumerRunner.stop(AsyncConsumerRunner.java:111)
    at org.skyscreamer.nevado.jms.NevadoSession.stop(NevadoSession.java:582)
    at org.skyscreamer.nevado.jms.NevadoConnection.stop(NevadoConnection.java:109)
    at org.skyscreamer.nevado.jms.NevadoConnection.close(NevadoConnection.java:118)
    at org.springframework.jms.connection.ConnectionFactoryUtils.releaseConnection(ConnectionFactoryUtils.java:80)
    at org.springframework.jms.listener.AbstractJmsListeningContainer.refreshSharedConnection(AbstractJmsListeningContainer.java:387)
    at org.springframework.jms.listener.SimpleMessageListenerContainer.onException(SimpleMessageListenerContainer.java:256)
    at org.skyscreamer.nevado.jms.AsyncConsumerRunner.processMessage(AsyncConsumerRunner.java:93)
    at org.skyscreamer.nevado.jms.AsyncConsumerRunner.run(AsyncConsumerRunner.java:38)
    at java.lang.Thread.run(Thread.java:744)