mqlight / java-mqlight

This Java package provides the high-level API by which you can interact with the MQ Light runtime.
Apache License 2.0

NullPointerException in client during broker restart #3

Closed: leachbj closed this issue 9 years ago

leachbj commented 9 years ago

On restarting the broker, an NPE was thrown by the client. Once the service was restarted, the client appeared to be blocked draining, and restarting the broker a second time did not help. I was not able to reproduce the failure, so hopefully the exception gives some indication of what went wrong.

06/16 01:19:00.482 INFO [oopGroup-1024-8] c.g.o.s.StateMachine - Firing NETWORK_ERROR
06/16 01:19:00.486 INFO [oopGroup-1024-7] c.g.o.s.StateMachine - Firing NETWORK_ERROR
06/16 01:19:00.495 INFO [oopGroup-1024-7] c.g.o.s.StateMachine - Firing EP_RESP_OK
06/16 01:19:00.496 DEBUG[lt-dispatcher-4] c.m.m.m.p.a.i.IbmAsyncConsumer - Consumer lost connection to broker, retrying
06/16 01:19:00.497 INFO [oopGroup-1024-8] c.g.o.s.StateMachine - Firing EP_RESP_OK
06/16 01:19:00.506 DEBUG[lt-dispatcher-4] c.m.m.m.p.a.i.IbmAsyncProducer - Producer lost connection to broker, retrying
06/16 01:19:00.536 WARN [oopGroup-1024-6] i.n.u.c.DefaultPromise - An exception was thrown by com.ibm.mqlight.api.impl.network.NettyNetworkService$ConnectListener.operationComplete()
java.lang.NullPointerException: null
    at com.ibm.mqlight.api.impl.network.NettyNetworkService$ConnectListener.operationComplete(NettyNetworkService.java:340) ~[mqlight-api-1.0.2015060300.jar:1.0.2015060300]
    at com.ibm.mqlight.api.impl.network.NettyNetworkService$ConnectListener.operationComplete(NettyNetworkService.java:316) ~[mqlight-api-1.0.2015060300.jar:1.0.2015060300]
    at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:679) ~[netty-all-4.0.21.Final.jar:4.0.21.Final]
    at io.netty.util.concurrent.DefaultPromise$LateListeners.run(DefaultPromise.java:844) [netty-all-4.0.21.Final.jar:4.0.21.Final]
    at io.netty.util.concurrent.DefaultPromise$LateListenerNotifier.run(DefaultPromise.java:872) [netty-all-4.0.21.Final.jar:4.0.21.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:370) [netty-all-4.0.21.Final.jar:4.0.21.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357) [netty-all-4.0.21.Final.jar:4.0.21.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) [netty-all-4.0.21.Final.jar:4.0.21.Final]
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) [netty-all-4.0.21.Final.jar:4.0.21.Final]
    at java.lang.Thread.null(Unknown Source) [na:1.8.0_40]

chrispwhite commented 9 years ago

Hi Bernard

Thanks for the information. I've managed to recreate the problem in a debugger.

The issue seems to be a small timing window: if a network error occurs during channel initialization, the ConnectListener.operationComplete() method can be called with a success status even though the channel has already been torn down. This could be deemed a bug in Netty, but I think the problem is really related to the way we are using Netty.
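
For illustration only, here is a minimal sketch of the kind of guard that closes such a window. It is not the actual fix and does not use the real Netty or MQ Light classes: Channel, ConnectResult, Outcome and GuardedConnectListener are hypothetical stand-ins, and the assumption is simply that a "success" notification may arrive alongside a channel that is null or no longer active. Which reference was actually null at NettyNetworkService.java:340 is not known from the trace alone.

```java
public class GuardedConnectListener {

    /** Hypothetical stand-in for a network channel; only liveness is modelled. */
    interface Channel {
        boolean isActive();
    }

    /** Hypothetical stand-in for the result handed to a connect listener. */
    static final class ConnectResult {
        final boolean success;
        final Channel channel; // may be null or inactive if torn down mid-initialization

        ConnectResult(boolean success, Channel channel) {
            this.success = success;
            this.channel = channel;
        }
    }

    enum Outcome { CONNECTED, FAILED }

    /**
     * Treats the connect as successful only if the status says so AND the
     * channel is still present and active. A "success" with a dead channel is
     * reported as a failure rather than dereferenced.
     */
    static Outcome operationComplete(ConnectResult result) {
        Channel ch = result.channel; // read once, so the checks and use agree
        if (result.success && ch != null && ch.isActive()) {
            return Outcome.CONNECTED;
        }
        return Outcome.FAILED;
    }

    public static void main(String[] args) {
        Channel live = () -> true;
        Channel tornDown = () -> false;

        System.out.println(operationComplete(new ConnectResult(true, live)));     // CONNECTED
        System.out.println(operationComplete(new ConnectResult(true, tornDown))); // FAILED
        System.out.println(operationComplete(new ConnectResult(true, null)));     // FAILED
        System.out.println(operationComplete(new ConnectResult(false, live)));    // FAILED
    }
}
```

The point of the sketch is that the success flag and the channel state are checked together, so a late success callback for a channel that has already gone away takes the failure path and the client can retry instead of throwing.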

I have managed to fix the issue, and we should hopefully upload the code changes to GitHub tomorrow, once we've passed them through our FV test automation.