transmission / transmission

Official Transmission BitTorrent client repository
https://transmissionbt.com
Other
12.02k stars 1.2k forks source link

peers retained in swarm non-randomly, by IP-address numerical order #6895

Open reardonia opened 3 months ago

reardonia commented 3 months ago

What is the issue?

Noticed this when looking at .resume files diagostically. It appears that the logic for retaining peer atoms causing the lowest-numbered (by IP address) peers to be retained over higher-address-number peers. This is triggered whenever the swarm/torrent hits the per-torrent peer cap, typically 20.

Not clear what the preference should be (time?)

Which application of Transmission?

transmission-daemon

Which version of Transmission?

Tr4 mainline

tearfur commented 3 months ago

It appears that the logic for retaining peer atoms causing the lowest-numbered (by IP address) peers to be retained over higher-address-number peers.

It doesn't AFAICT, what makes you think that?

reardonia commented 3 months ago

Gathering more data...stay tuned.

reardonia commented 3 months ago

It appears that the logic for retaining peer atoms causing the lowest-numbered (by IP address) peers to be retained over higher-address-number peers.

It doesn't AFAICT, what makes you think that?

Following up: peers are definitely dropped during peer_info_pulse() based on numeric order... MOSTLY.

I just ran two tests, on torrent (A) with 61 peers and and torrent (B) 89 peers.

For each torrent, during peer_info_pulse, the peer_info list is deduced to 20 (aka max peers for the torrent).

For torrent (A): All of the peers were TR_PEER_FROM_TRACKER. 41 peers were culled. In address order (aka "version sort"), peers 1-19 and 21 were retained, peers 20 and 22-61 were culled.

For torrent (B): 10 peers were TR_PEER_FROM_RESUME and 79 were TR_PEER_FROM_TRACKER. 9 of the TR_PEER_FROM_RESUME peers were culled. and 60 of the TR_PEER_FROM_RESUME were culled. Of the TR_PEER_FROM_TRACKER peers, again only the highest-ordered peers were culled.

So it seems to work list this: 1) if the peer is TR_PEER_FROM_RESUME, it is culled. 2) if the peer is TR_PEER_FROM_TRACKER, it is culled according to it's IP address, presumably in byte-order.

This is very reproducible; just compare the list before and after peer_info_pulse

reardonia commented 3 months ago

Question: should we use BEP42 as a means of prioritizing peers to retain?

reardonia commented 2 months ago

@tearfur any further thoughts? I'm certain the behavior is egregiously wrong today.

tearfur commented 2 months ago

Will get back to this later.

tearfur commented 2 months ago

I'll try my best to guess what is actually happening:

  1. Your tracker is returning peers in IP address order, so they are also stored in IP address order in the pool.
  2. These freshly returned peers initially are all considered the same by peer_info_pulse_helpers::ComparePeerInfo.
  3. std::partial_sort() will only move around elements within the range [first, middle) if the elements in [first, last) are all equal.

    https://github.com/transmission/transmission/blob/ec5296c8dc15c0d51014d67ef93658e5569f354c/libtransmission/peer-mgr.cc#L2365

  4. The peers with bigger IP addresses (a.k.a. elements within [middle, last)) are kept in-place after std::partial_sort(), then got erased.

Also, did you actually mean BEP-40 instead of BEP-42?

reardonia commented 2 months ago

I'll try my best to guess what is actually happening:

  1. Your tracker is returning peers in IP address order, so they are also stored in IP address order in the pool.

No, both private trackers I'm testing with use random address order for peer (based on time-seeding AFAICT)

  1. These freshly returned peers initially are all considered the same by peer_info_pulse_helpers::ComparePeerInfo.
  2. std::partial_sort() will only move around elements within the range [first, middle) if the elements in [first, last) are all equal. https://github.com/transmission/transmission/blob/ec5296c8dc15c0d51014d67ef93658e5569f354c/libtransmission/peer-mgr.cc#L2365
  3. The peers with bigger IP addresses (a.k.a. elements within [middle, last)) are kept in-place after std::partial_sort(), then got erased.

I guess? honestly haven't tried to debug this yet, just reporting the phenomenon. It's 100% reproducible.

Also, did you actually mean BEP-40 instead of BEP-42?

Yes.

tearfur commented 2 months ago

Ah, I see what's happening now: (2) - (4) of my hypothesis is correct, and the reason for the peers in the pool to be sorted is simply because of the container is a sorted map:

https://github.com/transmission/transmission/blob/24f1e15767a1cd621256cf8f32287e9b1592d6d3/libtransmission/peer-mgr.cc#L289

Pentaphon commented 2 months ago

@reardonia I have a feeling you and Tearfur together are going to make Transmission 4.1.x the most optimized version in this client's history thus far.

You're really on a roll spotting these long overlooked issues. Great work.