0xcaff / dht-crawler

Tools to crawl and index the BitTorrent DHT.
40 stars, 5 forks

Network Penetration #12

Closed by 0xcaff 5 years ago

0xcaff commented 6 years ago

We would like to maximize the number of IP addresses and torrent infohashes collected.

We will collect infohashes whenever we see them. For example, whenever we receive a get_peers or announce_peer query, we will record the infohash it carries.
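As a rough illustration of that collection step, here is a minimal Python sketch. It assumes raw KRPC packets (bencoded dictionaries, as described in BEP 5) arrive over UDP together with the sender's IP address. The names `bdecode` and `collect` are hypothetical helpers made up for this example, not part of this repository.

```python
def bdecode(data: bytes, i: int = 0):
    """Decode one bencoded value at offset i; return (value, next_offset)."""
    c = data[i:i + 1]
    if c == b"i":  # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":  # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":  # dictionary: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            value, i = bdecode(data, i)
            result[key] = value
        return result, i + 1
    # byte string: <length>:<bytes>
    colon = data.index(b":", i)
    length = int(data[i:colon])
    start = colon + 1
    return data[start:start + length], start + length


def collect(packet: bytes, sender_ip: str, infohashes: set, ips: set) -> None:
    """Record the infohash and sender IP of a get_peers/announce_peer query."""
    try:
        msg, _ = bdecode(packet)
    except (ValueError, IndexError, TypeError, RecursionError):
        return  # malformed packet, ignore it
    if not isinstance(msg, dict) or msg.get(b"y") != b"q":
        return  # not a query
    if msg.get(b"q") not in (b"get_peers", b"announce_peer"):
        return  # a query type that carries no infohash
    args = msg.get(b"a")
    if not isinstance(args, dict):
        return
    info_hash = args.get(b"info_hash")
    if isinstance(info_hash, bytes) and len(info_hash) == 20:
        infohashes.add(info_hash)
        ips.add(sender_ip)
```

Keeping the two sets separate lets the infohash index and the IP address list grow independently; a real crawler would presumably persist them rather than hold them in memory.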

We will collect IP addresses whenever someone calls get_peers or announce_peer. BEP 42 says that only nodes whose node IDs are valid for their IP address are considered valid storage targets, which makes it difficult to scale up collection of announce_peer traffic without many IP addresses.
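To make the BEP 42 constraint concrete, here is a hedged Python sketch of the IPv4 node ID check as we understand it from the BEP: the first 21 bits of the node ID must match a CRC32C of the masked IP address combined with 3 bits taken from the ID's last byte. The function names are hypothetical, the CRC32C is a slow bitwise version written out only so the example is self-contained, and the exemptions BEP 42 grants to local and private addresses are not handled.

```python
import ipaddress

IPV4_MASK = bytes([0x03, 0x0F, 0x3F, 0xFF])  # IPv4 mask from BEP 42


def crc32c(data: bytes) -> int:
    """Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF


def bep42_prefix(ip: str, rand: int) -> bytes:
    """Return the three ID bytes implied by an IPv4 address.

    Only the top 5 bits of the third byte are significant.
    """
    octets = ipaddress.IPv4Address(ip).packed
    masked = bytearray(o & m for o, m in zip(octets, IPV4_MASK))
    masked[0] |= (rand & 0x7) << 5  # mix in 3 bits of the random byte
    crc = crc32c(bytes(masked))
    return bytes([(crc >> 24) & 0xFF, (crc >> 16) & 0xFF, (crc >> 8) & 0xF8])


def is_valid_node_id(node_id: bytes, ip: str) -> bool:
    """Check the first 21 bits of a 20-byte node ID against its IPv4 address."""
    if len(node_id) != 20:
        return False
    expected = bep42_prefix(ip, node_id[19])  # last byte holds the random value
    return (
        node_id[0] == expected[0]
        and node_id[1] == expected[1]
        and (node_id[2] & 0xF8) == expected[2]
    )
```

Because the random value contributes only 3 bits, a single IPv4 address can claim at most 8 distinct 21-bit prefixes under this scheme, which is why covering more of the ID space takes more IP addresses.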

Prior work in this area creates an abusive client which pretends to be near the node IDs it finds. https://github.com/boramalper/magnetico

We need to figure out a strategy which balances being a well-behaved DHT participant with collecting as much information as possible.

0xcaff commented 6 years ago

I think the best way to do this is to contact as many nodes as possible. This will increase the chance that a node contacts us when it is searching for something. We will need a number of IP addresses for this because of BEP 42.
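One way to reach many nodes is to send find_node queries and queue every node that comes back in the replies. The Python sketch below covers only the bookkeeping, not the UDP I/O: it builds a find_node query and parses the 26-byte compact node entries defined in BEP 5. `Frontier` and the function names are hypothetical, and only IPv4 is handled.

```python
import ipaddress
from collections import deque


def find_node_query(own_id: bytes, target: bytes, txn: bytes = b"aa") -> bytes:
    """Bencode a find_node query by hand (keys are already in sorted order)."""
    if len(own_id) != 20 or len(target) != 20:
        raise ValueError("node IDs must be 20 bytes")
    return (
        b"d1:ad2:id20:" + own_id + b"6:target20:" + target + b"e"
        + b"1:q9:find_node"
        + b"1:t" + str(len(txn)).encode() + b":" + txn
        + b"1:y1:qe"
    )


def parse_compact_nodes(blob: bytes):
    """Split a compact 'nodes' string into (node_id, ip, port) tuples."""
    nodes = []
    usable = len(blob) - len(blob) % 26  # ignore a trailing partial entry
    for off in range(0, usable, 26):
        node_id = blob[off:off + 20]
        ip = str(ipaddress.IPv4Address(blob[off + 20:off + 24]))
        port = int.from_bytes(blob[off + 24:off + 26], "big")
        nodes.append((node_id, ip, port))
    return nodes


class Frontier:
    """Queue of (ip, port) endpoints still to contact, each queued only once."""

    def __init__(self):
        self.seen = set()
        self.queue = deque()

    def add(self, nodes) -> None:
        for _node_id, ip, port in nodes:
            endpoint = (ip, port)
            if port and endpoint not in self.seen:
                self.seen.add(endpoint)
                self.queue.append(endpoint)

    def next(self):
        """Return the next endpoint to query, or None when the queue is empty."""
        return self.queue.popleft() if self.queue else None
```

Deduplicating on the (ip, port) pair rather than the node ID means a node that changes its ID is not contacted twice, which keeps the crawl closer to the polite end of the trade-off discussed above.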

0xcaff commented 5 years ago

Let's focus on collecting IP addresses for the time being. We don't really have to worry about nodes respecting us.