Closed leachbj closed 9 years ago
Hi Bernard
Thanks for the information. I've managed to recreate the problem in a debugger.
The issue seems to be a small window during channel initialization: if a network error occurs at that point, the ConnectListener.operationComplete() method can be called with a success status even though the channel has already been torn down. This could be considered a bug in Netty, but I think the problem is really related to the way we are using Netty.
I have managed to fix the issue, and we should hopefully upload the code changes to GitHub tomorrow, once they have passed our FV test automation.
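To illustrate the kind of guard that protects against the window described above, here is a minimal sketch. It is not the actual change made to NettyNetworkService. To keep it self-contained it uses plain Java stand-ins instead of Netty types, and every name in it (FakeChannel, GuardedConnectListener, Outcome) is hypothetical. The idea is simply that a "success" status is not trusted on its own: the listener also checks that the channel is still usable, and it reports an outcome only once.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ConnectGuardSketch {

    /** Hypothetical stand-in for the parts of a channel this sketch cares about. */
    static final class FakeChannel {
        final boolean active;   // false once the channel has been torn down
        final Object context;   // per-connection state; null after teardown

        FakeChannel(boolean active, Object context) {
            this.active = active;
            this.context = context;
        }
    }

    enum Outcome { CONNECTED, FAILED, IGNORED }

    /** Hypothetical listener that does not trust a success status on its own. */
    static final class GuardedConnectListener {
        private final AtomicBoolean notified = new AtomicBoolean(false);

        Outcome operationComplete(boolean futureSuccess, FakeChannel channel) {
            // Report at most one outcome per connect attempt.
            if (!notified.compareAndSet(false, true)) {
                return Outcome.IGNORED;
            }
            // Treat "success" on a torn-down channel as a failure
            // instead of dereferencing state that may already be null.
            boolean usable = channel != null && channel.active && channel.context != null;
            return (futureSuccess && usable) ? Outcome.CONNECTED : Outcome.FAILED;
        }
    }

    public static void main(String[] args) {
        // Normal case: success status and a live channel.
        GuardedConnectListener normal = new GuardedConnectListener();
        System.out.println(normal.operationComplete(true, new FakeChannel(true, new Object())));

        // Race case: success status, but the channel was already torn down.
        GuardedConnectListener raced = new GuardedConnectListener();
        System.out.println(raced.operationComplete(true, new FakeChannel(false, null)));

        // A late, duplicate notification for the same attempt is ignored.
        System.out.println(raced.operationComplete(true, new FakeChannel(true, new Object())));
    }
}
```

In a real Netty listener the equivalent checks would be made against the future and its channel; whether that matches the fix that was eventually committed is an assumption on my part.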
On restarting the broker, an NPE was thrown by the client. Once the service was restarted, the client seemed to be blocked draining. Restarting the broker a second time didn't seem to help. I was not able to reproduce this failure, so hopefully the exception gives some indication of what may have gone wrong.
06/16 01:19:00.482 INFO [oopGroup-1024-8] c.g.o.s.StateMachine - Firing NETWORK_ERROR
06/16 01:19:00.486 INFO [oopGroup-1024-7] c.g.o.s.StateMachine - Firing NETWORK_ERROR
06/16 01:19:00.495 INFO [oopGroup-1024-7] c.g.o.s.StateMachine - Firing EP_RESP_OK
06/16 01:19:00.496 DEBUG[lt-dispatcher-4] c.m.m.m.p.a.i.IbmAsyncConsumer - Consumer lost connection to broker, retrying
06/16 01:19:00.497 INFO [oopGroup-1024-8] c.g.o.s.StateMachine - Firing EP_RESP_OK
06/16 01:19:00.506 DEBUG[lt-dispatcher-4] c.m.m.m.p.a.i.IbmAsyncProducer - Producer lost connection to broker, retrying
06/16 01:19:00.536 WARN [oopGroup-1024-6] i.n.u.c.DefaultPromise - An exception was thrown by com.ibm.mqlight.api.impl.network.NettyNetworkService$ConnectListener.operationComplete()
java.lang.NullPointerException: null
    at com.ibm.mqlight.api.impl.network.NettyNetworkService$ConnectListener.operationComplete(NettyNetworkService.java:340) ~[mqlight-api-1.0.2015060300.jar:1.0.2015060300]
    at com.ibm.mqlight.api.impl.network.NettyNetworkService$ConnectListener.operationComplete(NettyNetworkService.java:316) ~[mqlight-api-1.0.2015060300.jar:1.0.2015060300]
    at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:679) ~[netty-all-4.0.21.Final.jar:4.0.21.Final]
    at io.netty.util.concurrent.DefaultPromise$LateListeners.run(DefaultPromise.java:844) [netty-all-4.0.21.Final.jar:4.0.21.Final]
    at io.netty.util.concurrent.DefaultPromise$LateListenerNotifier.run(DefaultPromise.java:872) [netty-all-4.0.21.Final.jar:4.0.21.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:370) [netty-all-4.0.21.Final.jar:4.0.21.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357) [netty-all-4.0.21.Final.jar:4.0.21.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) [netty-all-4.0.21.Final.jar:4.0.21.Final]
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) [netty-all-4.0.21.Final.jar:4.0.21.Final]
    at java.lang.Thread.null(Unknown Source) [na:1.8.0_40]