twitter-archive / cloudhopper-smpp

Efficient, scalable, and flexible Java implementation of the Short Messaging Peer to Peer Protocol (SMPP)
Other
381 stars 357 forks source link

Netty4 deadlock - DefaultSmppClient#destroy vs #bind #92

Open krasa opened 9 years ago

krasa commented 9 years ago

I was building a new performance test https://github.com/krasa/cloudhopper-smpp/commit/b458153ad8061968b777166999b2d87cb02d38b7 and when I call DefaultSmppClient#destroy while other threads are binding, then it will never shutdown. (reproducible in majority of runs, when the server is not running)

Netty 3 did not have this problem ( https://github.com/krasa/cloudhopper-smpp/commit/fd78835d74992e476b5a584f5b8bd65cc4947dd8 ).

Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.65-b04 mixed mode):

"DestroyJavaVM" prio=6 tid=0x0000000011254800 nid=0x2538 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"pool-1-thread-10" prio=6 tid=0x0000000011bc9000 nid=0x2d20 in Object.wait() [0x000000001348e000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000007ace27518> (a io.netty.channel.DefaultChannelPromise)
    at java.lang.Object.wait(Object.java:503)
    at io.netty.util.concurrent.DefaultPromise.awaitUninterruptibly(DefaultPromise.java:286)
    - locked <0x00000007ace27518> (a io.netty.channel.DefaultChannelPromise)
    at io.netty.channel.DefaultChannelPromise.awaitUninterruptibly(DefaultChannelPromise.java:135)
    at io.netty.channel.DefaultChannelPromise.awaitUninterruptibly(DefaultChannelPromise.java:28)
    at com.cloudhopper.smpp.impl.DefaultSmppClient.createConnectedChannel(DefaultSmppClient.java:311)
    at com.cloudhopper.smpp.impl.DefaultSmppClient.doOpen(DefaultSmppClient.java:234)
    at com.cloudhopper.smpp.impl.DefaultSmppClient.bind(DefaultSmppClient.java:203)
    at com.cloudhopper.smpp.demo.perftest.PerformanceClientMain2$ClientSessionTask.run(PerformanceClientMain2.java:220)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

"pool-1-thread-2" prio=6 tid=0x00000000119f8000 nid=0x3518 in Object.wait() [0x000000001274e000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000007ac706af8> (a io.netty.channel.DefaultChannelPromise)
    at java.lang.Object.wait(Object.java:503)
    at io.netty.util.concurrent.DefaultPromise.awaitUninterruptibly(DefaultPromise.java:286)
    - locked <0x00000007ac706af8> (a io.netty.channel.DefaultChannelPromise)
    at io.netty.channel.DefaultChannelPromise.awaitUninterruptibly(DefaultChannelPromise.java:135)
    at io.netty.channel.DefaultChannelPromise.awaitUninterruptibly(DefaultChannelPromise.java:28)
    at com.cloudhopper.smpp.impl.DefaultSmppClient.createConnectedChannel(DefaultSmppClient.java:311)
    at com.cloudhopper.smpp.impl.DefaultSmppClient.doOpen(DefaultSmppClient.java:234)
    at com.cloudhopper.smpp.impl.DefaultSmppClient.bind(DefaultSmppClient.java:203)
    at com.cloudhopper.smpp.demo.perftest.PerformanceClientMain2$ClientSessionTask.run(PerformanceClientMain2.java:220)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

"pool-1-thread-1" prio=6 tid=0x0000000011250800 nid=0x2610 in Object.wait() [0x00000000125ce000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000007ac043fd0> (a io.netty.channel.DefaultChannelPromise)
    at java.lang.Object.wait(Object.java:503)
    at io.netty.util.concurrent.DefaultPromise.awaitUninterruptibly(DefaultPromise.java:286)
    - locked <0x00000007ac043fd0> (a io.netty.channel.DefaultChannelPromise)
    at io.netty.channel.DefaultChannelPromise.awaitUninterruptibly(DefaultChannelPromise.java:135)
    at io.netty.channel.DefaultChannelPromise.awaitUninterruptibly(DefaultChannelPromise.java:28)
    at com.cloudhopper.smpp.impl.DefaultSmppClient.createConnectedChannel(DefaultSmppClient.java:311)
    at com.cloudhopper.smpp.impl.DefaultSmppClient.doOpen(DefaultSmppClient.java:234)
    at com.cloudhopper.smpp.impl.DefaultSmppClient.bind(DefaultSmppClient.java:203)
    at com.cloudhopper.smpp.demo.perftest.PerformanceClientMain2$ClientSessionTask.run(PerformanceClientMain2.java:220)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

"Monitor Ctrl-Break" daemon prio=6 tid=0x0000000011082000 nid=0x2254 runnable [0x000000001147f000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:152)
    at java.net.SocketInputStream.read(SocketInputStream.java:122)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
    - locked <0x00000007ab11a2b8> (a java.io.InputStreamReader)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:154)
    at java.io.BufferedReader.readLine(BufferedReader.java:317)
    - locked <0x00000007ab11a2b8> (a java.io.InputStreamReader)
    at java.io.BufferedReader.readLine(BufferedReader.java:382)
    at com.intellij.rt.execution.application.AppMain$1.run(AppMain.java:93)
    at java.lang.Thread.run(Thread.java:745)

"Service Thread" daemon prio=6 tid=0x000000000f144000 nid=0x3a0c runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" daemon prio=10 tid=0x000000000f142800 nid=0x11e8 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" daemon prio=10 tid=0x000000000f140800 nid=0x3170 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Attach Listener" daemon prio=10 tid=0x000000000f13f000 nid=0x3620 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x000000000f137800 nid=0x3440 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=8 tid=0x000000000f0df800 nid=0x2e18 in Object.wait() [0x000000001058f000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000007aae85608> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
    - locked <0x00000007aae85608> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)

"Reference Handler" daemon prio=10 tid=0x000000000f0de000 nid=0x20c8 in Object.wait() [0x00000000101af000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000007aae85190> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:503)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
    - locked <0x00000007aae85190> (a java.lang.ref.Reference$Lock)

"VM Thread" prio=10 tid=0x000000000f0d8800 nid=0x24c0 runnable 

"GC task thread#0 (ParallelGC)" prio=6 tid=0x0000000001e3e000 nid=0x314 runnable 

"GC task thread#1 (ParallelGC)" prio=6 tid=0x0000000001e3f800 nid=0xf30 runnable 

"GC task thread#2 (ParallelGC)" prio=6 tid=0x0000000001e41800 nid=0x3488 runnable 

"GC task thread#3 (ParallelGC)" prio=6 tid=0x0000000001e43000 nid=0x2b88 runnable 

"VM Periodic Task Thread" prio=10 tid=0x000000000f145800 nid=0x21a8 waiting on condition 
jjlauer commented 9 years ago

Sometimes I think Netty4 should have taken a new project name due to its massive backwards incompatibility. Have you dug into what you think the underlying problem is?

krasa commented 9 years ago

It looks like a netty bug to me. I will dig deeper if you do not have a time.

krasa commented 9 years ago

testcase:

public class Netty4Deadlock {

    private static final Logger logger = LoggerFactory.getLogger(Netty4Deadlock.class);

    public static void main(String[] args) throws UnrecoverablePduException, SmppChannelException, InterruptedException, SmppTimeoutException {
        DefaultChannelGroup channels = new DefaultChannelGroup(GlobalEventExecutor.INSTANCE);
        Bootstrap clientBootstrap = new Bootstrap();
        NioEventLoopGroup workerGroup = new NioEventLoopGroup();
        clientBootstrap.group(workerGroup);
        clientBootstrap.channel(NioSocketChannel.class);
        clientBootstrap.option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 10000);
        clientBootstrap.handler(new LoggingHandler());

        ChannelFuture connectFuture = clientBootstrap.connect(new InetSocketAddress("localhost", 5000));

        destroy(channels, workerGroup);

        logger.info("deadlock here:");
        connectFuture.awaitUninterruptibly();
        logger.info("done");
    }

    private static void destroy(DefaultChannelGroup channels, NioEventLoopGroup workerGroup) {
        logger.info("destroy");
        channels.close().awaitUninterruptibly();

        workerGroup.shutdownGracefully();

        try {
            // Wait until all threads are terminated.
            workerGroup.terminationFuture().sync();
        } catch (InterruptedException e) {
            logger.warn("Thread interrupted closing executors.", e);
        }
    }
}