atomashpolskiy / bt

BitTorrent library and client with DHT, magnet links, encryption and more
https://atomashpolskiy.github.io/bt/
Apache License 2.0
2.4k stars 382 forks source link

[BUG] Another dead lock #182

Closed a8156268 closed 3 years ago

a8156268 commented 3 years ago
"6890.bt.peer.peer-collector" #257 daemon prio=5 os_prio=0 tid=0x0000000041608000 nid=0x1cd waiting on condition [0x00007f84cdb76000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000030b8be2b0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
        at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
        at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
        at bt.net.PeerConnectionPool.getConnection(PeerConnectionPool.java:87)
        at bt.net.ConnectionSource.getExistingOrPendingConnection(ConnectionSource.java:169)
        at bt.net.ConnectionSource.getConnectionAsync(ConnectionSource.java:89)
        at bt.torrent.messaging.TorrentWorker.onPeerDiscovered(TorrentWorker.java:412)
        - locked <0x00000003966c40d0> (a bt.torrent.messaging.TorrentWorker)
        at bt.torrent.messaging.TorrentWorker.lambda$new$0(TorrentWorker.java:108)
        at bt.torrent.messaging.TorrentWorker$$Lambda$551/1146610865.accept(Unknown Source)
        at bt.event.EventBus.lambda$addListener$3(EventBus.java:247)
        at bt.event.EventBus$$Lambda$340/1587264368.accept(Unknown Source)
        at bt.event.EventBus.doFireEvent(EventBus.java:181)
        at bt.event.EventBus.fireEvent(EventBus.java:164)
        at bt.event.EventBus.firePeerDiscovered(EventBus.java:62)
        at bt.peer.PeerRegistry.addPeer(PeerRegistry.java:214)
        at bt.peer.PeerRegistry.queryPeerSource(PeerRegistry.java:194)
        at bt.peer.PeerRegistry.lambda$null$2(PeerRegistry.java:137)
        at bt.peer.PeerRegistry$$Lambda$599/1428107455.accept(Unknown Source)
        at java.lang.Iterable.forEach(Iterable.java:75)
        at bt.peer.PeerRegistry.lambda$collectAndVisitPeers$3(PeerRegistry.java:136)
        at bt.peer.PeerRegistry$$Lambda$536/629938218.accept(Unknown Source)
        at java.util.concurrent.ConcurrentHashMap$KeySetView.forEach(ConcurrentHashMap.java:4649)
        at java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1080)
        at bt.peer.PeerRegistry.collectAndVisitPeers(PeerRegistry.java:104)
        at bt.peer.PeerRegistry$$Lambda$534/1000320184.run(Unknown Source)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

"6890.bt.net.pool.cleaner" #255 daemon prio=5 os_prio=0 tid=0x000000000667a800 nid=0x1cb waiting on condition [0x00007f84c6e84000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000030b8c7360> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
        at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
        at bt.event.EventBus.fireEvent(EventBus.java:156)
        at bt.event.EventBus.firePeerDisconnected(EventBus.java:80)
        at bt.net.PeerConnectionPool.purgeConnection(PeerConnectionPool.java:265)
        at bt.net.PeerConnectionPool.access$200(PeerConnectionPool.java:48)
        at bt.net.PeerConnectionPool$Cleaner.lambda$run$0(PeerConnectionPool.java:249)
        at bt.net.PeerConnectionPool$Cleaner$$Lambda$647/389929724.accept(Unknown Source)
        at bt.net.Connections$$Lambda$648/318921923.accept(Unknown Source)
        at java.util.concurrent.ConcurrentHashMap$ValuesView.forEach(ConcurrentHashMap.java:4707)
        at bt.net.Connections.visitConnections(PeerConnectionPool.java:334)
        at bt.net.PeerConnectionPool$Cleaner.run(PeerConnectionPool.java:239)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
pyckle commented 3 years ago

Currently, PeerConnectionPool.getConnection() does not lock anything. See: https://github.com/atomashpolskiy/bt/blob/89c8fd2ff255f7354cd59e178889439da5b28716/bt-core/src/main/java/bt/net/PeerConnectionPool.java#L87

I don't see any calls to connectionLock.lock() even in older versions. Are you using a modified version of bt that added this lock? If so, why was this locking necessary?

a8156268 commented 3 years ago

Yes, I did some modification. This is to reducing the connection establish count.

    @Override
    public PeerConnection getConnection(Peer peer, TorrentId torrentId) {
        connectionLock.lock();
        try {
            List<PeerConnection> connectionsWithSameAddress = getConnectionsForAddress(torrentId, peer);
            if (LOGGER.isDebugEnabled()) {
                LOGGER.debug("Checking duplicate connections for newly discovered peer: {}." +
                        " All connections:\n{}", peer, connectionsWithSameAddress.stream()
                        .map(Object::toString)
                        .collect(Collectors.joining("\n")));
            }
            for (PeerConnection connection : connectionsWithSameAddress) {
                if (connection.getRemotePort() == peer.getPort()) {
                    return connection;
                } else if (connection.getRemotePeer().getPort() == peer.getPort()) {
                    return connection;
                }
            }
            return null;
        } finally {
            connectionLock.unlock();
        }
    }
pyckle commented 3 years ago

It would seem that this is not a bug then - it's a request for assisting in developing a new feature.

I'm not very familiar with the code in this place, and I can see that you changed other parts of the code from the excerpted code. In general, there's a few strategies that one can use to avoid deadlocks.

Perhaps one of these strategies will help you avoid the deadlock in your feature.

a8156268 commented 3 years ago

Sure. Thank you.