crazyjohn / crazyjohn.github.io

crazyjohn's blog
9 stars 3 forks source link

mina的一次死锁 #38

Open crazyjohn opened 8 years ago

crazyjohn commented 8 years ago

今天在Mina的messageReceived回调里处理System.exit(0)命令的时候,发生了死锁的情况,下面研究一下。

1. 堆栈信息

2016-01-11 11:51:23
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.65-b01 mixed mode):

"ShutdownHooksThread" #18 prio=5 os_prio=0 tid=0x000000005ada2800 nid=0x750 in Object.wait() [0x000000005d9be000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000d627df78> (a java.lang.Object)
    at org.apache.mina.core.service.IoServiceListenerSupport.disconnectSessions(IoServiceListenerSupport.java:296)
    - locked <0x00000000d627df78> (a java.lang.Object)
    at org.apache.mina.core.service.IoServiceListenerSupport.fireServiceDeactivated(IoServiceListenerSupport.java:181)
    at org.apache.mina.core.service.AbstractIoAcceptor.unbind(AbstractIoAcceptor.java:406)
    at org.apache.mina.core.service.AbstractIoAcceptor.unbind(AbstractIoAcceptor.java:324)
    at com.stupidbird.core.net.NodeIoProcessor.shutdown(NodeIoProcessor.java:50)
    at com.stupidbird.core.server.Commander.shutdown(Commander.java:43)
    at com.stupidbird.game.GameServer.shutdown(GameServer.java:103)
    at com.stupidbird.game.GameServer.lambda$0(GameServer.java:64)
    at com.stupidbird.game.GameServer$$Lambda$5/225553858.run(Unknown Source)
    at com.stupidbird.core.server.ServerNode$1.lambda$0(ServerNode.java:65)
    at com.stupidbird.core.server.ServerNode$1$$Lambda$8/1800070998.accept(Unknown Source)
    at java.util.concurrent.CopyOnWriteArrayList.forEach(CopyOnWriteArrayList.java:890)
    at com.stupidbird.core.server.ServerNode$1.run(ServerNode.java:64)

   Locked ownable synchronizers:
    - None

"NioProcessor-2" #24 prio=5 os_prio=0 tid=0x000000005ada1800 nid=0x1420 in Object.wait() [0x000000005d80e000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000d894f860> (a com.stupidbird.core.server.ServerNode$1)
    at java.lang.Thread.join(Thread.java:1245)
    - locked <0x00000000d894f860> (a com.stupidbird.core.server.ServerNode$1)
    at java.lang.Thread.join(Thread.java:1319)
    at java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:106)
    at java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46)
    at java.lang.Shutdown.runHooks(Shutdown.java:123)
    at java.lang.Shutdown.sequence(Shutdown.java:167)
    at java.lang.Shutdown.exit(Shutdown.java:212)
    - locked <0x00000000d884b920> (a java.lang.Class for java.lang.Shutdown)
    at java.lang.Runtime.exit(Runtime.java:109)
    at java.lang.System.exit(System.java:971)
    at com.stupidbird.core.telnet.command.Shutdown.doExec(Shutdown.java:16)
    at com.stupidbird.core.telnet.command.TelnetCommand.exec(TelnetCommand.java:50)
    at com.stupidbird.core.telnet.TelnetIoHandler.messageReceived(TelnetIoHandler.java:64)
    at org.apache.mina.core.filterchain.DefaultIoFilterChain$TailFilter.messageReceived(DefaultIoFilterChain.java:690)
    at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:417)
    at org.apache.mina.core.filterchain.DefaultIoFilterChain.access$1200(DefaultIoFilterChain.java:47)
    at org.apache.mina.core.filterchain.DefaultIoFilterChain$EntryImpl$1.messageReceived(DefaultIoFilterChain.java:765)
    at org.apache.mina.filter.codec.ProtocolCodecFilter$ProtocolDecoderOutputImpl.flush(ProtocolCodecFilter.java:407)
    at org.apache.mina.filter.codec.ProtocolCodecFilter.messageReceived(ProtocolCodecFilter.java:236)
    at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:417)
    at org.apache.mina.core.filterchain.DefaultIoFilterChain.access$1200(DefaultIoFilterChain.java:47)
    at org.apache.mina.core.filterchain.DefaultIoFilterChain$EntryImpl$1.messageReceived(DefaultIoFilterChain.java:765)
    at org.apache.mina.core.filterchain.IoFilterAdapter.messageReceived(IoFilterAdapter.java:109)
    at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:417)
    at org.apache.mina.core.filterchain.DefaultIoFilterChain.fireMessageReceived(DefaultIoFilterChain.java:410)
    at org.apache.mina.core.polling.AbstractPollingIoProcessor.read(AbstractPollingIoProcessor.java:710)
    at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:664)
    at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:653)
    at org.apache.mina.core.polling.AbstractPollingIoProcessor.access$600(AbstractPollingIoProcessor.java:67)
    at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1124)
    at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
    - <0x00000000d61fbdb8> (a java.util.concurrent.ThreadPoolExecutor$Worker)

"GameMessageExecutor-0" #23 prio=5 os_prio=0 tid=0x000000005a744800 nid=0x131c waiting on condition [0x000000005d65e000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000d8761bc8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
    - None

"GameTickExecutor" #22 prio=5 os_prio=0 tid=0x000000005af8d800 nid=0x1734 waiting on condition [0x000000005d53e000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at com.stupidbird.core.server.ServerNode.tick(ServerNode.java:123)
    at com.stupidbird.core.server.ServerNode.lambda$0(ServerNode.java:81)
    at com.stupidbird.core.server.ServerNode$$Lambda$6/223000346.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
    - <0x00000000d60462f8> (a java.util.concurrent.ThreadPoolExecutor$Worker)

"NioSocketAcceptor-2" #21 prio=5 os_prio=0 tid=0x000000005a495800 nid=0x8f0 runnable [0x000000005d35e000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0(Native Method)
    at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll(WindowsSelectorImpl.java:296)
    at sun.nio.ch.WindowsSelectorImpl$SubSelector.access$400(WindowsSelectorImpl.java:278)
    at sun.nio.ch.WindowsSelectorImpl.doSelect(WindowsSelectorImpl.java:159)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
    - locked <0x00000000d8786348> (a sun.nio.ch.Util$2)
    - locked <0x00000000d8786338> (a java.util.Collections$UnmodifiableSet)
    - locked <0x00000000d87861e8> (a sun.nio.ch.WindowsSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101)
    at org.apache.mina.transport.socket.nio.NioSocketAcceptor.select(NioSocketAcceptor.java:234)
    at org.apache.mina.core.polling.AbstractPollingIoAcceptor$Acceptor.run(AbstractPollingIoAcceptor.java:417)
    at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
    - <0x00000000d603fcb0> (a java.util.concurrent.ThreadPoolExecutor$Worker)

"pool-4-thread-1" #20 prio=5 os_prio=0 tid=0x000000005a6da800 nid=0x1620 waiting on condition [0x000000005d1de000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000d8789e10> (a java.util.concurrent.SynchronousQueue$TransferStack)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
    at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
    at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
    - None

"pool-2-thread-1" #19 prio=5 os_prio=0 tid=0x000000005b1f0800 nid=0x113c waiting on condition [0x0000000059c7f000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000d87cd248> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at com.stupidbird.db.mongo.MongoSession.run(MongoSession.java:85)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
    - <0x00000000d5ff9180> (a java.util.concurrent.ThreadPoolExecutor$Worker)

"MongoCleaner524223214" #17 daemon prio=5 os_prio=0 tid=0x000000005a6e9000 nid=0x14f0 waiting on condition [0x000000005cbcf000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at com.mongodb.Mongo$CursorCleanerThread.run(Mongo.java:820)

   Locked ownable synchronizers:
    - None

2. 查找锁等待的代码

  /**
     * Close all the sessions
     * TODO disconnectSessions.
     *
     */
    private void disconnectSessions() {
        if (!(service instanceof IoAcceptor)) {
            // We don't disconnect sessions for anything but an Acceptor
            return;
        }

        if (!((IoAcceptor) service).isCloseOnDeactivation()) {
            return;
        }

        Object lock = new Object();
        IoFutureListener<IoFuture> listener = new LockNotifyingListener(lock);

        for (IoSession s : managedSessions.values()) {
            s.close(true).addListener(listener);
        }

        try {
            synchronized (lock) {
                while (!managedSessions.isEmpty()) {
                    lock.wait(500);
                }
            }
        } catch (InterruptedException ie) {
            // Ignored
        }
    }

3. session.close(rightnow)

如果调用session.close(false)

    /**
     * {@inheritDoc}
     */
    public final CloseFuture close(boolean rightNow) {
        if (!isClosing()) {
            if (rightNow) {
                return close();
            }

            return closeOnFlush();
        } else {
            return closeFuture;
        }
    }

再看closeOnFlush

private final CloseFuture closeOnFlush() {
    getWriteRequestQueue().offer(this, CLOSE_REQUEST);
    getProcessor().flush(this);
    return closeFuture;
}

再看flush

/**
 * {@inheritDoc}
 */
public final void flush(S session) {
    // add the session to the queue if it's not already
    // in the queue, then wake up the select()
    if (session.setScheduledForFlush(true)) {
        flushingSessions.add(session);
        wakeup();
    }
}

然后flushingSessions会在io线程的run里进行对flush(currentTime)的方法来调用刷新。

4. 原因分析

  1. NioProcessor-2是mina消息的驱动线程,也就是它去执行jvm的退出,发现它在等待thread.join的方式,等待停机钩子线程执行完毕。
  2. ShutdownHooksThread是停机钩子线程。执行的堆栈代码就是上面的代码,它一直在等待管理的会话对象都移除掉,否则就在lock对象那里等待信号。
  3. 综合上头两步,发生了死锁。

    5. 修改代码以后ok

CloseFuture closed = session.close(false);
closed.addListener(new IoFutureListener<CloseFuture>() {

    @Override
    public void operationComplete(CloseFuture future) {
        System.exit(0);
    }
});