atomashpolskiy / bt

BitTorrent library and client with DHT, magnet links, encryption and more
https://atomashpolskiy.github.io/bt/
Apache License 2.0
2.42k stars 381 forks source link

[QUESTION] Code affecting the performance #228

Open pvishnyakov opened 1 year ago

pvishnyakov commented 1 year ago

2 questions about the code that seems to be unnecessary or having strange logic affecting the download performance:

  1. Assignment.java has this method:

    private void claimPiecesIfNeeded() {
        if (pieces.size() < maxSimultaneouslyAssignedPieces) {
            final int numPiecesToAdd = maxSimultaneouslyAssignedPieces - pieces.size();
            PeerBitfield peerBitfield = pieceStatistics.getPeerBitfield(connectionKey).get();
            if (!assignments.isEndgame()) {
                BitSet relevantPieces = peerBitfield.getBitmask(); //returns a copy
                localBitfield.removeVerifiedPiecesFromBitset(relevantPieces);
                selector.getNextPieces(peerBitfield, pieceStatistics)
                        .filter(pieceIndex -> assignments.claim(pieceIndex))
                        .limit(numPiecesToAdd)
                        .forEach(pieceIndex -> pieces.add(pieceIndex));
            } else {
                // randomize order of pieces to keep the number of pieces
                // requested from different peers at the same time to a minimum
                int[] requiredPieces = selector.getNextPieces(peerBitfield, pieceStatistics).toArray();
                ShuffleUtils.shuffle(requiredPieces);
    
                for (int i = 0; i < Math.min(numPiecesToAdd, requiredPieces.length); i++) {
                    int pieceIndex = requiredPieces[i];
                    if (peerBitfield.isVerified(pieceIndex) && assignments.claim(pieceIndex)) {
                        pieces.add(pieceIndex);
                    }
                }
            }
        }
    }

What is the purpose of this part inside the "else" block?:

                // randomize order of pieces to keep the number of pieces
                // requested from different peers at the same time to a minimum
                int[] requiredPieces = selector.getNextPieces(peerBitfield, pieceStatistics).toArray();
                ShuffleUtils.shuffle(requiredPieces);

                for (int i = 0; i < Math.min(numPiecesToAdd, requiredPieces.length); i++) {
                    int pieceIndex = requiredPieces[i];
                    if (peerBitfield.isVerified(pieceIndex) && assignments.claim(pieceIndex)) {
                        pieces.add(pieceIndex);
                    }
                }

If I comment it all, download performance improved because the pieces are distributed between peers more evenly. Why this code is here?

  1. TorrentWorker.java has this method:
    private void processDisconnectedPeers(Assignments assignments, BitfieldBasedStatistics statistics) {
        ConnectionKey disconnectedPeer;
        while ((disconnectedPeer = disconnectedPeers.poll()) != null) {
            if (assignments != null) {
                Assignment assignment = assignments.get(disconnectedPeer);
                if (assignment != null) {
                    assignments.remove(assignment);
                    if (LOGGER.isTraceEnabled()) {
                        LOGGER.trace("Peer assignment removed due to DISCONNECT: peer {}, assignment {}",
                                disconnectedPeer, assignment);
                    }
                }
            }
            timeoutedPeers.remove(disconnectedPeer);
            if (statistics != null) {
                statistics.removeBitfield(disconnectedPeer);
            }
        }
    }

Why do we need timeoutedPeers.remove(disconnectedPeer) here? It's allowing the timeouted peer to bypass the ban by disconnecting and re-connecting again, this behavior can be observed in the logs when the same peer is being disconnected due to slow piece downloading but re-appears just to be banned over and over again. Maybe it's better to keep the peer banned all the time until the ban lifted according to Config settings?

pyckle commented 1 year ago

Hi,

1) This logic implements the bittorrent endgame mode as documented here: http://bittorrent.org/beps/bep_0003.html

I believe that this logic matches what libtorrent documents it does: https://github.com/steeve/libtorrent/blob/master/docs/settings.rst - see strict_end_game_mode.

It's puzzling to me that this reduces torrent performance. Roughly how fast are you downloading torrents (1 MB/s, 10MB/s, 100MB/s)? How many peers are you connected to? Does this affect performance only at the end of the torrent or for the whole download

2) I'm not familiar with this logic and don't have time to look into this.

pvishnyakov commented 1 year ago
  1. I didn't measure the exact speed in MB/s but testing the same file with client.startAsync(state->{...}, 5000) I see that every 5 seconds the number of downloaded pieces is increasing at double speed (like 8-9 new pieces vs. 3-4 every 5 second) after I remove that "else" block. Config.maxSimultaneouslyAssignedPieces was set to 1 all the time. Config.maxConcurrentlyActivePeerConnectionsPerTorrent was set to 50, before and after, no change.

I have no idea why it happens, seems that after "else" block deletion the pieces are distributed to more peers in parallel but I'm not sure. That's why I asked this question.