atomashpolskiy / bt

BitTorrent library and client with DHT, magnet links, encryption and more
https://atomashpolskiy.github.io/bt/
Apache License 2.0

[BUG] Deadlock #183

a8156268 closed this issue 3 years ago

a8156268 commented 3 years ago

In a modified version, a deadlock occurred. I think the issue may still exist in the standard version. The deadlock involved two peers. Each was trying to flush data to the other, and while flushing it kept the data-receiver blocked, so the flush could never succeed and the thread fell into an endless loop.

Stack of one peer:

"6881.bt.net.message-dispatcher-10.61.97.171:-1" #63080 daemon prio=5 os_prio=0 tid=0x000000005ed13000 nid=0x14ef1 runnable [0x00007f04c6bcd000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
    at sun.nio.ch.IOUtil.write(IOUtil.java:51)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
    - locked <0x00000001c9000078> (a java.lang.Object)
    at bt.net.pipeline.SocketChannelHandler.flush(SocketChannelHandler.java:148)
    - locked <0x0000000298e8e740> (a java.lang.Object)
    at bt.net.pipeline.SocketChannelHandler.send(SocketChannelHandler.java:71)
    at bt.net.SocketPeerConnection.postMessage(SocketPeerConnection.java:134)
    - locked <0x0000000298e8e790> (a bt.net.SocketPeerConnection)
    at com.ctrip.flight.intl.common.bt.MultiThreadMessageDispatcher$MessageDispatchingLoop.processSupplier(MultiThreadMessageDispatcher.java:179)
    at com.ctrip.flight.intl.common.bt.MultiThreadMessageDispatcher$MessageDispatchingLoop.run(MultiThreadMessageDispatcher.java:108)
    at com.ctrip.flight.intl.common.bt.MultiThreadMessageDispatcher.lambda$createAndSubmitTask$3(MultiThreadMessageDispatcher.java:289)
    at com.ctrip.flight.intl.common.bt.MultiThreadMessageDispatcher$$Lambda$746/1406993695.run(Unknown Source)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

"6881.bt.runtime.shutdown.worker-4" #63231 daemon prio=5 os_prio=0 tid=0x0000000004fdc800 nid=0x14f8f waiting for monitor entry [0x00007f04d0c49000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at bt.net.pipeline.SocketChannelHandler.close(SocketChannelHandler.java:169)
    - waiting to lock <0x0000000298e8e740> (a java.lang.Object)
    - locked <0x0000000298e8e750> (a java.lang.Object)
    at bt.net.SocketPeerConnection.close(SocketPeerConnection.java:166)
    at bt.net.SocketPeerConnection.closeQuietly(SocketPeerConnection.java:154)
    at bt.net.PeerConnectionPool$$Lambda$845/1275417506.accept(Unknown Source)
    at bt.net.Connections$$Lambda$799/596951729.accept(Unknown Source)
    at java.util.concurrent.ConcurrentHashMap$ValuesView.forEach(ConcurrentHashMap.java:4707)
    at bt.net.Connections.visitConnections(PeerConnectionPool.java:334)
    at bt.net.PeerConnectionPool.shutdown(PeerConnectionPool.java:270)
    at bt.net.PeerConnectionPool$$Lambda$630/585840387.run(Unknown Source)
    at bt.runtime.BtRuntime.lambda$toRunnable$7(BtRuntime.java:324)
    at bt.runtime.BtRuntime$$Lambda$678/1139650115.run(Unknown Source)
    at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Stack of another peer:


"6881.bt.net.data-receiver" #64560 prio=5 os_prio=0 tid=0x000000004a832000 nid=0x158c7 waiting for monitor entry [0x00007f18d6109000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at bt.net.pipeline.SocketChannelHandler.processInboundData(SocketChannelHandler.java:115)
        - waiting to lock <0x00000001979717f8> (a java.lang.Object)
        at bt.net.pipeline.SocketChannelHandler.read(SocketChannelHandler.java:82)
        at bt.net.pipeline.DefaultChannelPipeline$DefaultChannelHandlerContext.readFromChannel(DefaultChannelPipeline.java:181)
        at bt.net.DataReceivingLoop.processKey(DataReceivingLoop.java:187)
        at bt.net.DataReceivingLoop.run(DataReceivingLoop.java:128)
        at bt.net.DataReceivingLoop.lambda$null$1(DataReceivingLoop.java:65)
        at bt.net.DataReceivingLoop$$Lambda$679/99718958.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

"6881.bt.net.pool.cleaner" #64562 prio=5 os_prio=0 tid=0x000000003897d800 nid=0x158c9 waiting for monitor entry [0x00007f18d840b000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at bt.net.pipeline.SocketChannelHandler.close(SocketChannelHandler.java:169)
        - waiting to lock <0x00000001aff3dc48> (a java.lang.Object)
        - locked <0x00000001979717f8> (a java.lang.Object)
        at bt.net.SocketPeerConnection.close(SocketPeerConnection.java:166)
        at bt.net.SocketPeerConnection.closeQuietly(SocketPeerConnection.java:154)
        at bt.net.PeerConnectionPool.purgeConnection(PeerConnectionPool.java:264)
        at bt.net.PeerConnectionPool.access$200(PeerConnectionPool.java:48)
        at bt.net.PeerConnectionPool$Cleaner.lambda$run$0(PeerConnectionPool.java:249)
        at bt.net.PeerConnectionPool$Cleaner$$Lambda$795/1375733778.accept(Unknown Source)
        at bt.net.Connections$$Lambda$796/1447003786.accept(Unknown Source)
        at java.util.concurrent.ConcurrentHashMap$ValuesView.forEach(ConcurrentHashMap.java:4707)
        at bt.net.Connections.visitConnections(PeerConnectionPool.java:334)
        at bt.net.PeerConnectionPool$Cleaner.run(PeerConnectionPool.java:239)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

"6881.bt.net.message-dispatcher-10.61.96.233:6881" #64794 daemon prio=5 os_prio=0 tid=0x0000000056929000 nid=0x159af runnable [0x00007f18c7e1e000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
        at sun.nio.ch.IOUtil.write(IOUtil.java:51)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
        - locked <0x00000001aff3dbf8> (a java.lang.Object)  writeLock
        at bt.net.pipeline.SocketChannelHandler.flush(SocketChannelHandler.java:148)
        - locked <0x00000001aff3dc48> (a java.lang.Object)  outboundBufferLock
        at bt.net.pipeline.SocketChannelHandler.send(SocketChannelHandler.java:71)
        at bt.net.SocketPeerConnection.postMessage(SocketPeerConnection.java:134)
        - locked <0x00000001aff3dc88> (a bt.net.SocketPeerConnection)   synchronized
        at com.ctrip.flight.intl.common.bt.MultiThreadMessageDispatcher$MessageDispatchingLoop.processSupplier(MultiThreadMessageDispatcher.java:179)
        at com.ctrip.flight.intl.common.bt.MultiThreadMessageDispatcher$MessageDispatchingLoop.run(MultiThreadMessageDispatcher.java:108)
        at com.ctrip.flight.intl.common.bt.MultiThreadMessageDispatcher.lambda$createAndSubmitTask$3(MultiThreadMessageDispatcher.java:289)
        at com.ctrip.flight.intl.common.bt.MultiThreadMessageDispatcher$$Lambda$743/1741032101.run(Unknown Source)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
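
Read together, the three traces of this peer look less like a classic lock cycle and more like a chain: the message-dispatcher spins while holding the outbound lock, the pool cleaner holds another monitor and waits for the outbound lock, and the data-receiver waits for the monitor held by the cleaner. A minimal sketch of that chain follows. The class, the flag, and the name `inboundBufferLock` are hypothetical (only `outboundBufferLock` is annotated in the trace), and the spinning loop merely stands in for the socket write.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

class LockChainDemo {
    // Hypothetical names: only outboundBufferLock is annotated in the trace;
    // inboundBufferLock is an assumed name for the monitor <0x00000001979717f8>.
    static final Object inboundBufferLock = new Object();
    static final Object outboundBufferLock = new Object();

    // Stands in for "the remote peer drained its TCP read buffer".
    static final AtomicBoolean peerAcceptsData = new AtomicBoolean(false);

    static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Polls for up to about two seconds until the thread reaches the given state.
    static boolean reaches(Thread t, Thread.State state) throws InterruptedException {
        for (int i = 0; i < 400 && t.getState() != state; i++) {
            Thread.sleep(5);
        }
        return t.getState() == state;
    }

    /** Returns true if cleaner and receiver both end up BLOCKED behind the spinning dispatcher. */
    public static boolean reproduceChain() throws InterruptedException {
        peerAcceptsData.set(false);
        CountDownLatch outboundHeld = new CountDownLatch(1);
        CountDownLatch inboundHeld = new CountDownLatch(1);

        Thread dispatcher = new Thread(() -> {
            synchronized (outboundBufferLock) {
                outboundHeld.countDown();
                while (!peerAcceptsData.get()) {
                    Thread.yield(); // like write() making no progress, over and over
                }
            }
        }, "message-dispatcher");

        Thread cleaner = new Thread(() -> {
            await(outboundHeld);
            synchronized (inboundBufferLock) {
                inboundHeld.countDown();
                synchronized (outboundBufferLock) { /* close() would run here */ }
            }
        }, "pool-cleaner");

        Thread receiver = new Thread(() -> {
            await(inboundHeld);
            synchronized (inboundBufferLock) { /* processInboundData() would run here */ }
        }, "data-receiver");

        dispatcher.start();
        cleaner.start();
        receiver.start();

        boolean stuck = reaches(cleaner, Thread.State.BLOCKED)
                && reaches(receiver, Thread.State.BLOCKED);

        // The demo can release the chain; in the report nothing plays this role,
        // because the remote peer appears to be stuck in the same way.
        peerAcceptsData.set(true);
        dispatcher.join();
        cleaner.join();
        receiver.join();
        return stuck;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("chain reproduced: " + reproduceChain());
    }
}
```

The sketch only mirrors the thread states visible in the dump (one RUNNABLE, two BLOCKED); it does not use any bt classes.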

In both peers, the message-dispatcher thread falls into an endless loop:

                while (buffer.hasRemaining()) {
                    channel.write(buffer);
                }

This is possibly because the TCP write buffer is full, which in turn is because the remote peer's TCP read buffer is full, which is because the remote peer's data-receiver thread is blocked.
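
To illustrate the suspected cause, here is a minimal sketch, assuming the channel is in non-blocking mode (which the busy loop suggests). It uses a `java.nio.channels.Pipe` that nobody reads from as a stand-in for a socket whose remote data-receiver is blocked. `BoundedFlush` and `tryFlush` are hypothetical names, not part of bt, and this is not necessarily how the library should be fixed.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.WritableByteChannel;

class BoundedFlush {

    /**
     * Writes as much of the buffer as the channel accepts right now.
     * Returns true if the buffer was fully written, false if the channel
     * stopped accepting bytes (write returned 0) before that.
     */
    public static boolean tryFlush(WritableByteChannel channel, ByteBuffer buffer) throws IOException {
        while (buffer.hasRemaining()) {
            if (channel.write(buffer) == 0) {
                // An unbounded while (buffer.hasRemaining()) loop would spin here
                // for as long as the other side does not read.
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        Pipe pipe = Pipe.open();
        pipe.sink().configureBlocking(false);

        // Nobody reads from pipe.source(): this stands in for a remote peer
        // whose data-receiver thread is blocked.
        ByteBuffer data = ByteBuffer.allocate(16 * 1024 * 1024);
        boolean drained = tryFlush(pipe.sink(), data);
        System.out.println("drained=" + drained + ", remaining=" + data.remaining());

        pipe.sink().close();
        pipe.source().close();
    }
}
```

With a return value like this, the caller could release its locks and retry later, for example once a selector reports `SelectionKey.OP_WRITE` readiness, instead of spinning while holding them.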

atomashpolskiy commented 3 years ago

Thanks for reporting and making a PR!