arvidn / libtorrent

an efficient feature complete C++ bittorrent implementation
http://libtorrent.org

"peer id" reported to tracker isn't used for peer connections #7479

Open glassez opened 1 year ago

glassez commented 1 year ago

According to The BitTorrent Protocol Specification, during the handshake a peer should use the same peer_id that it reported to the tracker:

After the download hash comes the 20-byte peer id which is reported in tracker requests and contained in peer lists in tracker responses. If the receiving side's peer id doesn't match the one the initiating side expects, it severs the connection.

But for some reason libtorrent uses a different peer_id for each peer connection. Moreover, judging by the code comment near the "peer id" field of torrent (https://github.com/arvidn/libtorrent/blob/b82b350b38147ac7ddf6ec41027ebe07dc15f913/include/libtorrent/torrent.hpp#L1569-L1572), this was done on purpose, because "Peers won't use this (they generate their own peer ids)", with no explanation of why peers won't use the peer id that the Specification says they should use.
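For readers unfamiliar with the handshake layout the quoted passage refers to, here is a minimal sketch in Python. It assumes the standard 68-byte handshake (length byte, protocol string, 8 reserved bytes, 20-byte info hash, 20-byte peer id). The function names `build_handshake`, `parse_handshake` and `peer_id_matches` are hypothetical and are not libtorrent APIs.

```python
PSTR = b"BitTorrent protocol"  # 19-byte protocol identifier


def build_handshake(info_hash: bytes, peer_id: bytes) -> bytes:
    """Build a 68-byte handshake: pstrlen, pstr, reserved, info_hash, peer_id."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must each be 20 bytes")
    return bytes([len(PSTR)]) + PSTR + bytes(8) + info_hash + peer_id


def parse_handshake(data: bytes) -> tuple[bytes, bytes]:
    """Return (info_hash, peer_id) from a received handshake."""
    if len(data) != 68 or data[0] != len(PSTR) or data[1:20] != PSTR:
        raise ValueError("not a BitTorrent handshake")
    return data[28:48], data[48:68]


def peer_id_matches(handshake: bytes, expected_peer_id: bytes) -> bool:
    """The check the specification describes: compare the peer id received
    in the handshake with the one the initiating side expected."""
    _, received = parse_handshake(handshake)
    return received == expected_peer_id
```

The comparison in `peer_id_matches` is only possible when the initiating side already knows which peer id to expect, for example from a tracker response that includes peer ids.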

glassez commented 1 year ago

Related qBittorrent Issue.

luzpaz commented 1 year ago

@arvidn would you have a moment to look at this?

glassez commented 1 year ago

@arvidn, welcome back! Could you take a look at this as a priority?

ghost commented 1 year ago

If libtorrent employs the same peer_id for all interfaces during communication with other peers, it will result in those peers not establishing connections with all peers originating from the user. This is because many clients consider peers with identical peer_id values as duplicates.

glassez commented 1 year ago

If libtorrent employs the same peer_id for all interfaces during communication with other peers, it will result in those peers not establishing connections with all peers originating from the user.

I don't know much about the details, but I think it should be something like the following. Announces through different interfaces use different peer_id. Peer connections associated with some interface must use the same peer_id that was announced to the tracker through this interface.

IMO, in order to support something, it is better to do it at the protocol level (i.e. change/supplement the protocol), and not just violate it.
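The per-interface idea described in this comment could be sketched as follows. This is only an illustration of the proposal, not how libtorrent works today: the class `PerInterfacePeerIds`, its method `peer_id_for`, the interface names and the `-XX0000-` prefix are all hypothetical.

```python
import os


class PerInterfacePeerIds:
    """Keep one stable peer_id per local interface (hypothetical helper)."""

    def __init__(self, prefix: bytes = b"-XX0000-"):
        self._prefix = prefix
        self._ids: dict[str, bytes] = {}

    def peer_id_for(self, interface: str) -> bytes:
        # Generate the id on first use, then reuse it for every announce
        # and every peer connection made through the same interface.
        if interface not in self._ids:
            tail = os.urandom(20 - len(self._prefix))
            self._ids[interface] = self._prefix + tail
        return self._ids[interface]


ids = PerInterfacePeerIds()
announce_id = ids.peer_id_for("eth0")    # sent to the tracker via eth0
handshake_id = ids.peer_id_for("eth0")   # sent to peers reached via eth0
other_id = ids.peer_id_for("wlan0")      # a different interface
```

Under this scheme the id in a handshake always equals the id announced through the same interface, while different interfaces still look like different peers.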

SeaHOH commented 1 year ago

If libtorrent employs the same peer_id for all interfaces during communication with other peers, it will result in those peers not establishing connections with all peers originating from the user.

This happens not only per interface, but even per port. I can understand that different interfaces imply different peers (maybe one of them has a good connection), but same torrent and same interface MUST report a coherent peer_id.

ghost commented 1 year ago

If libtorrent employs the same peer_id for all interfaces during communication with other peers, it will result in those peers not establishing connections with all peers originating from the user.

I don't know much about the details, but I think it should be something like the following. Announces through different interfaces use different peer_id. Peer connections associated with some interface must use the same peer_id that was announced to the tracker through this interface.

IMO, in order to support something, it is better to do it at the protocol level (i.e. change/supplement the protocol), and not just violate it.

If announcements employ distinct peer_id values for each interface, it will become challenging for trackers to maintain accurate statistics. This situation is particularly relevant to private trackers. Such behavior might be interpreted as unfair, given that statistics would be calculated per interface and result in multiplication of the recorded data.

ghost commented 1 year ago

If libtorrent employs the same peer_id for all interfaces during communication with other peers, it will result in those peers not establishing connections with all peers originating from the user.

This happens not only per interface, but even per port. I can understand that different interfaces imply different peers (maybe one of them has a good connection), but same torrent and same interface MUST report a coherent peer_id.

In my opinion, it's entirely logical to assign distinct peer_id values for each port. Consider a scenario where a single interface has both an IPv4 and an IPv6 address. Utilizing the same peer_id for both IP versions could result in peers only connecting via one of the IPs. This might not align with your preferences, especially if one of the IP versions or a combined connection offers better speed.

SeaHOH commented 1 year ago

@Outbid9727 So, you prefer to always report the same peer_id to trackers, even for different torrents, and different peer_ids on all other connections?

PS: my view is "same torrent and same interface MUST report a coherent peer_id"; "an IPv4 and an IPv6 address" are different interfaces.

SeaHOH commented 1 year ago

Also, as far as I remember, private trackers do not use peer_id to identify users.

SeaHOH commented 1 year ago

Sorry, maybe what I wrote above was a bit off the point; I mean that we need to follow the BEPs.

arvidn commented 1 year ago

The peer ID in the bittorrent peer protocol has essentially been useless since its inception. There was a brief window, when trackers would still return peer IDs, where there was a slight use for them, as a poor man's authentication that the peer you're talking to was the same one that talked to the tracker. This was removed with compact tracker responses (circa 2008).

Since then, both uTorrent and libtorrent (for well over 10 years by now) have randomized peer_ids to mitigate surveillance of users. This was reported by TorrentFreak.

I think a strong justification would be needed to change this.
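To make the behaviour under discussion concrete, here is a rough Python sketch of one id kept for tracker announces and a fresh random id for each peer connection. It is not libtorrent's actual generator: the function `random_peer_id`, the `-XX0000-` client prefix (Azureus-style layout) and the alphabet are assumptions chosen for illustration.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def random_peer_id(prefix: str = "-XX0000-") -> bytes:
    """Return a 20-byte peer id: a client prefix plus a random tail."""
    tail = "".join(secrets.choice(ALPHABET) for _ in range(20 - len(prefix)))
    return (prefix + tail).encode("ascii")


# One id is kept for announces to the tracker ...
tracker_peer_id = random_peer_id()

# ... while each peer connection gets its own id, so an observer cannot
# link two connections to the same client by comparing peer ids.
connection_peer_ids = [random_peer_id() for _ in range(3)]
```

With this policy the id seen in a handshake never equals the announced one, which is exactly the mismatch the issue title describes.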

SeaHOH commented 1 year ago

I forgot one issue: qBittorrent also disconnects a second incoming connection from the same peer, even through a different IP. Which one does this, qBittorrent or libtorrent?

ghost commented 1 year ago

@Outbid9727 So, you prefer always report a same peer_id to trackers even different torrents, and different peer_ids to all other connections?

No, I prefer the current version. Same peer_id reported for all interfaces/ports on announcement, different peer_id for peer connections. Also it doesn't matter to the tracker if the peer_id sent to it is different from the peer_id used for peer connections. This is because trackers no longer return peer_ids to clients. Most have resorted to compact lists.

Also it seems like the issue you reported in https://github.com/qbittorrent/qBittorrent/issues/19391 is not due to peer id but because of multiple connections to same IP address being disabled in your settings. The example you provided where same IP has different ports, those are most likely different peers using the same external IP address.
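The point about compact lists can be illustrated with a short Python sketch of parsing a compact IPv4 peer list, assuming the usual layout of 6 bytes per peer (4-byte address followed by a 2-byte big-endian port). The function name `parse_compact_peers` and the sample addresses are hypothetical.

```python
import ipaddress
import struct


def parse_compact_peers(blob: bytes) -> list[tuple[str, int]]:
    """Decode a compact peer list into (ip, port) pairs.

    Each entry is 4 bytes of IPv4 address plus 2 bytes of port,
    so there is no room for a peer id in this format.
    """
    if len(blob) % 6 != 0:
        raise ValueError("compact peer list length must be a multiple of 6")
    peers = []
    for offset in range(0, len(blob), 6):
        ip = ipaddress.IPv4Address(blob[offset:offset + 4])
        (port,) = struct.unpack("!H", blob[offset + 4:offset + 6])
        peers.append((str(ip), port))
    return peers


sample = bytes([192, 0, 2, 1]) + (6881).to_bytes(2, "big")
print(parse_compact_peers(sample))
```

Because a client receiving such a list learns only addresses and ports, it has no expected peer id to compare against during the handshake.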

SeaHOH commented 1 year ago

The example you provided where same IP has different ports, those are most likely different peers using the same external IP address.

Good catch! I did not consider that. @arvidn What does libtorrent do in this case (accept or reject), or is it left to the client code?