the8472 / mldht

Bittorrent Mainline DHT implementation in java
Mozilla Public License 2.0

Actively looking for N seeds for a given infohash #1

Closed: adelavina closed this issue 9 years ago

adelavina commented 9 years ago

Hey, following up on our exchange on SO.

I followed your advice and tried to queue PeerLookupTasks. So in the launcher class I went for something like this, right before entering the synchronized empty cycle:

    byte[] bytes = new BigInteger(<String INFO_HASH>, 16).toByteArray();
    if (bytes.length > 1 && bytes[0] == 0)
        bytes = Arrays.copyOfRange(bytes, 1, bytes.length);

    for (DHT dht : dhts.values()) {
        PeerLookupTask peerLookup = dht.createPeerLookup(bytes);
        dht.getTaskManager().addTask(peerLookup);
    }

After hitting several NPEs on the last line of that loop, I debugged and found dht.createPeerLookup(bytes) was returning null since it couldn't find any random active server.

Any thoughts on what I'm doing wrong?

Thanks.

the8472 commented 9 years ago

The implementation is designed for steady-state, daemon-like operation and assumes it may lose contact with the DHT (due to network connectivity) at any point in time. So attempting to issue a lookup task immediately, while it's still starting up (or has no network contact), will fail (by returning null) since there won't be any RPCServer instances that consider themselves active.

So if createPeerLookup returns null you basically have to wait a little while and try again. There currently are no callbacks to determine activeness.
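The wait-and-retry approach can be sketched generically. In the sketch below the Supplier stands in for `() -> dht.createPeerLookup(bytes)`; the poll interval and retry budget are illustrative choices, not mldht defaults:

```java
import java.util.function.Supplier;

// Generic retry-until-non-null helper. In the launcher this might be called
// as awaitNonNull(() -> dht.createPeerLookup(bytes), 1000, 60) -- the
// interval and retry count here are assumptions, not values from mldht.
public class AwaitActive {
    public static <T> T awaitNonNull(Supplier<T> factory, long pollMillis, int maxTries) {
        for (int i = 0; i < maxTries; i++) {
            T value = factory.get();
            if (value != null)
                return value; // an active RPCServer was available
            try {
                Thread.sleep(pollMillis); // still bootstrapping; wait and retry
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
        return null; // never became active within the retry budget
    }
}
```

The caller still has to handle the all-retries-exhausted null, e.g. by skipping that DHT instance.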

Alternatively you can instantiate the task directly and just assign it one of the servers even if they aren't active, with the downside that it may or may not return any results.

Another caveat is IPv6 support: if you don't have any v6 address at all, then there might be 0 servers. You can turn off v6 in the config.xml, and skip DHT instances where .isRunning() is false if that's an issue.

As a side note, Key should make infohash handling a little easier; no need to fiddle with BigIntegers.
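For illustration, the hex string can also be decoded directly into a fixed 20-byte array, which sidesteps the sign-byte and stripped-leading-zero quirks of BigInteger.toByteArray() entirely. This is a plain-Java sketch; Key's own constructors aren't shown in this thread:

```java
// Decode a 40-character hex infohash straight into 20 bytes, avoiding the
// BigInteger round-trip. Plain-Java illustration, not the Key API.
public class InfoHash {
    public static byte[] fromHex(String hex) {
        if (hex.length() != 40)
            throw new IllegalArgumentException("infohash must be 40 hex chars");
        byte[] out = new byte[20];
        for (int i = 0; i < 20; i++)
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        return out; // always exactly 20 bytes, leading zeros preserved
    }
}
```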


I do intend to add some sort of remote CLI that can issue commands to a running instance. Most likely DHT-ping IP or ID, get peer lists/scrape, get .torrent.

adelavina commented 9 years ago

Genius.

I got it running as you suggested. I put some stdout flags on task finish, and a bit over an hour later it is still running. These are the diagnostics:

==========================
DHT Diagnostics. Type IPV4_DHT
# of active servers / all servers: 1/1
-----------------------
Stats
Reachable node estimate: 1634426 (0.35929552596820646)
DB Keys: 33
DB Items: 63
TX sum: 12457 RX sum: 9736
avg task time/avg 1st result time (ms): 10836/9900
Uptime: PT1H14M12.507Ss
RPC stats
### local RPCs
REQ | RSP / Error / Timeout
PING    728|670/0/58
FIND_NODE   6820|4374/6/2471
GET_PEERS   542|274/0/269
ANNOUNCE_PEER   0|0/0/0
UNKNOWN 0|160/7/0
### remote RPCs
REQ / RSP
PING    192/192
FIND_NODE   2879/2868
GET_PEERS   849/849
ANNOUNCE_PEER   127/117
UNKNOWN 161/161
### non-associated errors
RX / TX 7/181
-----------------------
Routing table
buckets: 25 / entries: 184
000...   num:8 rep:8
0010...   num:8 rep:8
00110000...   num:8 rep:8
001100010...   num:8 rep:8
00110001100...   num:8 rep:8
00110001101000000...   num:8 rep:8
001100011010000010...   num:8 rep:8
0011000110100000110...   num:8 rep:8
00110001101000001110...   num:8 rep:8
001100011010000011110...   num:8 rep:4
0011000110100000111110...   num:5 rep:0
001100011010000011111100...   num:1 rep:0
001100011010000011111101...   num:8 rep:0 [Home]
00110001101000001111111...   num:2 rep:0
0011000110100001...   num:8 rep:8
001100011010001...   num:8 rep:8
00110001101001...   num:8 rep:8
0011000110101...   num:8 rep:8
001100011011...   num:8 rep:8
0011000111...   num:8 rep:8
0011001...   num:8 rep:8
001101...   num:8 rep:8
00111...   num:8 rep:8
01...   num:8 rep:8
1...   num:8 rep:8
-----------------------
RPC Servers
31A0FD21 09491509 3EFD4EB1 2AE99D76 9B2D1105    bind: /104.237.143.114 consensus: /104.237.143.114:49001
rx: 9738 tx:12458 active:0 baseRTT:713 uptime:PT1H13M59.542S
-----------------------
Lookup Cache
anchors (0):
buckets (1) / entries (0):

all entries: 0
-----------------------
Tasks
next id: 20
#### active: 
#### queued: 

==========================
DHT Diagnostics. Type IPV6_DHT
# of active servers / all servers: 1/1
-----------------------
Stats
Reachable node estimate: 8234 (0.3917297321360904)
DB Keys: 62
DB Items: 116
TX sum: 9784 RX sum: 8880
avg task time/avg 1st result time (ms): 10168/9901
Uptime: PT1H13M59.502Ss
RPC stats
### local RPCs
REQ | RSP / Error / Timeout
PING    513|498/0/15
FIND_NODE   6887|6018/0/873
GET_PEERS   123|103/0/20
ANNOUNCE_PEER   0|0/0/0
UNKNOWN 0|7/0/0
### remote RPCs
REQ / RSP
PING    237/237
FIND_NODE   474/474
GET_PEERS   1109/1109
ANNOUNCE_PEER   433/433
UNKNOWN 0/0
### non-associated errors
RX / TX 0/8
-----------------------
Routing table
buckets: 14 / entries: 106
000...   num:8 rep:8
0010...   num:8 rep:8
00110000...   num:8 rep:8
001100010...   num:8 rep:8
00110001100...   num:8 rep:7
0011000110100...   num:4 rep:0 [Home]
0011000110101...   num:6 rep:0
001100011011...   num:8 rep:0
0011000111...   num:8 rep:8
0011001...   num:8 rep:8
001101...   num:8 rep:8
00111...   num:8 rep:8
01...   num:8 rep:8
1...   num:8 rep:8
-----------------------
RPC Servers
31A0FD21 09491509 3EFD4EB1 2AE99D76 9B2D1105    bind: /2600:3c00:0:0:f03c:91ff:fe84:6e9c%eth0 consensus: /2600:3c00:0:0:f03c:91ff:fe84:6e9c:49001
rx: 8880 tx:9784 active:0 baseRTT:237 uptime:PT1H13M59.488S
-----------------------
Lookup Cache
anchors (0):
buckets (1) / entries (0):

all entries: 0
-----------------------
Tasks
next id: 21
#### active: 
#### queued: 
the8472 commented 9 years ago

and after a bit over an hour later it is still running.

The DHT itself never terminates. Like I said, it's meant for daemon-like operation.

If you mean the task: the diagnostics don't show any active tasks, so either the task never started, or it did indeed finish, or it encountered some error. Check the exceptions.log file and/or raise the log level to Debug to track task activity/termination (grep the logfile for "Task"). Also make sure to register the task listener before enqueueing the task with the task manager.

Beyond that the diagnostics look fine.

### local RPCs
GET_PEERS   542|274/0/269

This indicates that a peer lookup task (or several) were at least started.

Do note that I haven't exercised active lookups in a while, so it is possible that there are some bugs.

adelavina commented 9 years ago

I got it to work. It seems m2e/mvn were having problems with the encoding of some of the hardcoded non-ASCII strings in the bencode.Utils class!

So in the callFinish override for the PeerLookupTask I'm getting the DBItems, which I assumed earlier today are the actual peers sharing the torrent corresponding to my infohash. In the background I still have the DHT daemons running, which I think I understand a bit; at least the zsets on redis are pretty clear.

Does the efficiency of my PeerLookupTask increase with the amount of discovery the DHT has performed before actually launching it? I'm thinking it would, since the buckets would contain more entries beforehand. Looking at some of the classes it seems measuring stuff has been in your scope; any tips?

I gave some thought to what you mentioned regarding a CLI. I'm thinking that pushing the Launcher you have into an embedded Jetty with a few controllers to interact with the DHT could make it really easy.

the8472 commented 9 years ago

I got it to work. It seems m2e/mvn were having problems with the encoding of some of the hardcoded non-ASCII strings in the bencode.Utils class!

Hum, I added UTF-8 configuration to Maven; I hope that solves it.

So in the callFinish override for the PeerLookupTask I'm getting the DBItems, which I assumed earlier today are the actual peers sharing the torrent corresponding to my infohash.

You should use TaskListener to await completion and then use getReturnedItems().
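One way to await completion is to bridge the listener callback into a CompletableFuture. In the sketch below, FakeTask stands in for the real task purely so the pattern is self-contained; addListener and getReturnedItems mirror the names used in this thread, but the real mldht signatures may differ:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// FakeTask is a stand-in for PeerLookupTask so this sketch compiles on its
// own; the method names are borrowed from the thread, signatures assumed.
class FakeTask {
    private Runnable listener;
    private List<String> items;

    void addListener(Runnable r) { listener = r; }
    List<String> getReturnedItems() { return items; }

    // Simulates the task manager completing the task later.
    void finish(List<String> results) {
        items = results;
        if (listener != null) listener.run();
    }
}

public class AwaitCompletion {
    public static void main(String[] args) {
        FakeTask task = new FakeTask();
        CompletableFuture<List<String>> done = new CompletableFuture<>();
        // Register BEFORE enqueueing, or a fast completion could be missed.
        task.addListener(() -> done.complete(task.getReturnedItems()));
        task.finish(Arrays.asList("1.2.3.4:6881")); // stand-in for the task manager
        System.out.println(done.join());
    }
}
```

With the real task, the caller would block on done.join() (or use a timeout via done.get(...)) after handing the task to the task manager.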

Does the efficiency of my PeerLookupTask increase with the amount of previous discovery that the DHT has performed before actually launching them?

The redis stuff is not relevant; that's an export-only component for statistics-gathering. You can turn it off if you don't need it.

Uptime and activity stabilizes the routing table and RTT estimates, which makes lookups more efficient. Recent activity also populates a cache that may speed up lookups in some cases.

Looking at some of the classes it seems measuring stuff has been in your scope, any tip?

The PeerLookupTask has various tuning knobs to optimize it for speed or for reduced traffic. I don't know what you intend to do, so the only guidance I can give is what I've said before: ideally, tasks should be issued from an already-running instance. Starting the DHT takes considerable warmup time (other implementations might be more tuned for fast startup, but that's not one of my goals). If everything works well, a lookup task should only take seconds to complete.

I'm thinking that pushing the Launcher you have into an embedded jetty with a few controllers to interact with the DHT could do it really easy.

That certainly would be possible, but a webserver seems a bit heavyweight just for issuing a few commands. I think I'll roll my own.

By the way, you can implement your own Component and add it to the config.xml as a <component> tag; that should make interacting with the launcher easier.

the8472 commented 9 years ago

In the background I still have the DHT daemons running, which I think I understand a bit; at least the zsets on redis are pretty clear.

Daemons? Plural? There should only be one DHT process running at a time, otherwise they will interfere with each other.

adelavina commented 9 years ago

Sorry, I was thinking about the two DHT instances (v4/v6) launched in parallel.

To be honest, until you pointed me to your implementation I was looking to build a continuously running thread that would monitor a particular torrent so that I could eventually understand the torrent lifecycle. I'm not sure how to adapt the task concept to the monitoring idea. I guess launching tasks constantly, one after the other, could theoretically work, although I would be revisiting the same nodes/peers every run.

In terms of hardware I have virtually anything I could need.

the8472 commented 9 years ago

I was looking to build a continuously running thread that would monitor a particular torrent

Monitoring the DHT is a fairly fuzzy approach for that purpose. It might be better to simply join the swarm with bittorrent connections and monitor PEX instead; it should provide a more accurate view.

although I would be revisiting the same nodes/peers every run.

You would have to do that with the DHT anyway; it's request-response based after all, with no continuously open connections that keep you updated. You should also keep in mind that nodes might rate-limit you if you hammer them with requests.

the8472 commented 9 years ago

And of course you're not required to use tasks. mldht also exposes the lower-level concepts of the DHT if you wish to roll your own.

adelavina commented 9 years ago

I'm working on one of these components for my app and so far it's going great. I tried, however, setting it all up on AWS.

When launching, this shows up in the exception log:

    [2015-04-12T20:40:37.456Z][Error] java.io.IOException: /fe80:0:0:0:cc9:84ff:fe7e:4c9%eth0 -> dht.transmissionbt.com/2001:41d0:c:5ac:2:0:0:1:6881
        at lbms.plugins.mldht.kad.RPCServer$SocketHandler.writeEvent(RPCServer.java:603)
        at lbms.plugins.mldht.kad.RPCServer.fillPipe(RPCServer.java:415)
        at lbms.plugins.mldht.kad.RPCServer.dispatchCall(RPCServer.java:427)
        at lbms.plugins.mldht.kad.RPCServer.doCall(RPCServer.java:185)
        at lbms.plugins.mldht.kad.tasks.Task.lambda$rpcCall$42(Task.java:236)
        at lbms.plugins.mldht.kad.tasks.Task$$Lambda$55/204653054.run(Unknown Source)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    Caused by: java.net.SocketException: Network is unreachable
        at sun.nio.ch.DatagramChannelImpl.send0(Native Method)
        at sun.nio.ch.DatagramChannelImpl.sendFromNativeBuffer(DatagramChannelImpl.java:536)
        at sun.nio.ch.DatagramChannelImpl.send(DatagramChannelImpl.java:513)
        at sun.nio.ch.DatagramChannelImpl.send(DatagramChannelImpl.java:477)
        at lbms.plugins.mldht.kad.RPCServer$SocketHandler.writeEvent(RPCServer.java:578)
        ... 12 more

And then the RPCServers never come up. The instance has one public IP, but it's NATed through the private IP:

    $ ifconfig
    eth0      Link encap:Ethernet  HWaddr 0e:c9:84:7e:04:c9
              inet addr:10.0.3.15  Bcast:10.0.3.127  Mask:255.255.255.128
              inet6 addr: fe80::cc9:84ff:fe7e:4c9/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:9001  Metric:1
              RX packets:380511 errors:0 dropped:0 overruns:0 frame:0
              TX packets:233027 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:474953668 (474.9 MB)  TX bytes:22947638 (22.9 MB)

    lo        Link encap:Local Loopback
              inet addr:127.0.0.1  Mask:255.0.0.0
              inet6 addr: ::1/128 Scope:Host
              UP LOOPBACK RUNNING  MTU:65536  Metric:1
              RX packets:60 errors:0 dropped:0 overruns:0 frame:0
              TX packets:60 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:3264 (3.2 KB)  TX bytes:3264 (3.2 KB)

Have you tried something like this?

Thanks.

the8472 commented 9 years ago

Looks like the v6 server is trying to send from the link-local (fe80::) address, i.e. the instance has no global IPv6 connectivity, hence "Network is unreachable". As mentioned earlier, you can turn off v6 in the config.xml in that case.

adelavina commented 9 years ago

Of course... Thanks again!