adelavina closed this issue 9 years ago
The implementation is designed for steady-state/daemon-like operation and the notion that it may lose contact with the DHT (due to network connectivity) at any point in time. So attempting to immediately issue a lookup task while it's still starting up (or has no network contact) will fail (by returning null), since there won't be any RPCServer instances that consider themselves active.
So if createPeerLookup returns null you basically have to wait a little while and try again. There currently are no callbacks to determine activeness.
Alternatively you can instantiate the task directly and just assign it one of the servers even if they aren't active, with the downside that it may or may not return any results.
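The wait-and-retry could be expressed as a small helper. This is a pure-Java sketch; the only mldht-specific assumption is that the factory lambda wraps `dht.createPeerLookup(infohash)`, which (as described above) returns null until a server considers itself active:

```java
import java.util.function.Supplier;

// Sketch of the "wait a little while and try again" advice.
// In this context the factory would be () -> dht.createPeerLookup(infohash).
public class LookupRetry {
    public static <T> T awaitNonNull(Supplier<T> factory, long sleepMillis)
            throws InterruptedException {
        T value;
        // keep retrying until the factory yields a task; there is no
        // activeness callback, so polling is the only option
        while ((value = factory.get()) == null) {
            Thread.sleep(sleepMillis);
        }
        return value;
    }
}
```

During warmup this just polls every few seconds; once a server is active the first non-null task is returned.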
Another caveat is IPv6 support: if you don't have any v6 address at all then there might be 0 servers. You can turn off v6 in the config.xml and skip DHT instances where .isRunning() is false if that's an issue.
As a side note, Key should make infohash handling a little easier; no need to fiddle with BigInts.
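For example, turning a 40-character hex infohash into the 20 raw bytes a lookup needs is plain Java; the final comment assumes Key has a byte[] constructor, which I haven't verified against mldht:

```java
public class InfohashUtil {
    // Decode a 40-char hex infohash string into its 20 raw bytes
    public static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] infohash = fromHex("31A0FD21094915093EFD4EB12AE99D769B2D1105");
        // assuming (unverified) that Key wraps raw bytes: new Key(infohash)
        System.out.println(infohash.length); // 20
    }
}
```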
I do intend to add some sort of remote CLI that can issue commands to a running instance. Most likely DHT-ping IP or ID, get peer lists/scrape, get .torrent.
Genius.
I got it running as you suggested. I put some stdout flags on task finish, and a bit over an hour later it is still running. These are the diagnostics:
==========================
DHT Diagnostics. Type IPV4_DHT
# of active servers / all servers: 1/1
-----------------------
Stats
Reachable node estimate: 1634426 (0.35929552596820646)
DB Keys: 33
DB Items: 63
TX sum: 12457 RX sum: 9736
avg task time/avg 1st result time (ms): 10836/9900
Uptime: PT1H14M12.507Ss
RPC stats
### local RPCs
REQ | RSP / Error / Timeout
PING 728|670/0/58
FIND_NODE 6820|4374/6/2471
GET_PEERS 542|274/0/269
ANNOUNCE_PEER 0|0/0/0
UNKNOWN 0|160/7/0
### remote RPCs
REQ / RSP
PING 192/192
FIND_NODE 2879/2868
GET_PEERS 849/849
ANNOUNCE_PEER 127/117
UNKNOWN 161/161
### non-associated errors
RX / TX 7/181
-----------------------
Routing table
buckets: 25 / entries: 184
000... num:8 rep:8
0010... num:8 rep:8
00110000... num:8 rep:8
001100010... num:8 rep:8
00110001100... num:8 rep:8
00110001101000000... num:8 rep:8
001100011010000010... num:8 rep:8
0011000110100000110... num:8 rep:8
00110001101000001110... num:8 rep:8
001100011010000011110... num:8 rep:4
0011000110100000111110... num:5 rep:0
001100011010000011111100... num:1 rep:0
001100011010000011111101... num:8 rep:0 [Home]
00110001101000001111111... num:2 rep:0
0011000110100001... num:8 rep:8
001100011010001... num:8 rep:8
00110001101001... num:8 rep:8
0011000110101... num:8 rep:8
001100011011... num:8 rep:8
0011000111... num:8 rep:8
0011001... num:8 rep:8
001101... num:8 rep:8
00111... num:8 rep:8
01... num:8 rep:8
1... num:8 rep:8
-----------------------
RPC Servers
31A0FD21 09491509 3EFD4EB1 2AE99D76 9B2D1105 bind: /104.237.143.114 consensus: /104.237.143.114:49001
rx: 9738 tx:12458 active:0 baseRTT:713 uptime:PT1H13M59.542S
-----------------------
Lookup Cache
anchors (0):
buckets (1) / entries (0):
all entries: 0
-----------------------
Tasks
next id: 20
#### active:
#### queued:
==========================
DHT Diagnostics. Type IPV6_DHT
# of active servers / all servers: 1/1
-----------------------
Stats
Reachable node estimate: 8234 (0.3917297321360904)
DB Keys: 62
DB Items: 116
TX sum: 9784 RX sum: 8880
avg task time/avg 1st result time (ms): 10168/9901
Uptime: PT1H13M59.502Ss
RPC stats
### local RPCs
REQ | RSP / Error / Timeout
PING 513|498/0/15
FIND_NODE 6887|6018/0/873
GET_PEERS 123|103/0/20
ANNOUNCE_PEER 0|0/0/0
UNKNOWN 0|7/0/0
### remote RPCs
REQ / RSP
PING 237/237
FIND_NODE 474/474
GET_PEERS 1109/1109
ANNOUNCE_PEER 433/433
UNKNOWN 0/0
### non-associated errors
RX / TX 0/8
-----------------------
Routing table
buckets: 14 / entries: 106
000... num:8 rep:8
0010... num:8 rep:8
00110000... num:8 rep:8
001100010... num:8 rep:8
00110001100... num:8 rep:7
0011000110100... num:4 rep:0 [Home]
0011000110101... num:6 rep:0
001100011011... num:8 rep:0
0011000111... num:8 rep:8
0011001... num:8 rep:8
001101... num:8 rep:8
00111... num:8 rep:8
01... num:8 rep:8
1... num:8 rep:8
-----------------------
RPC Servers
31A0FD21 09491509 3EFD4EB1 2AE99D76 9B2D1105 bind: /2600:3c00:0:0:f03c:91ff:fe84:6e9c%eth0 consensus: /2600:3c00:0:0:f03c:91ff:fe84:6e9c:49001
rx: 8880 tx:9784 active:0 baseRTT:237 uptime:PT1H13M59.488S
-----------------------
Lookup Cache
anchors (0):
buckets (1) / entries (0):
all entries: 0
-----------------------
Tasks
next id: 21
#### active:
#### queued:
a bit over an hour later it is still running.
The DHT itself never terminates. Like I said, it's meant for daemon-like operation.
If you mean the task, the diagnostics don't show any active tasks, so either the task never started, or it did indeed finish, or it encountered some error. Check the exceptions.log file and/or raise the log level to Debug to track task activity/termination (grep the logfile for "Task"). Also make sure to register the task listener before enqueuing the task with the taskmanager.
Beyond that the diagnostics look fine.
### local RPCs
GET_PEERS 542|274/0/269
This indicates that a peer lookup task (or several) was at least started.
Do note that I haven't exercised active lookups in a while, so it is possible that there are some bugs.
I got it to work. It seems m2e/mvn were having problems with the encoding of some of the hardcoded non-ASCII strings in the bencode.Utils class!
So in the callFinish override for the PeerLookupTask I'm getting the DBItems, which I assumed earlier today are the actual peers sharing the torrent corresponding to my infohash. In the background I still have the DHT daemons running, which I think I understand a bit; at least the zsets on redis are pretty clear.
Does the efficiency of my PeerLookupTask increase with the amount of previous discovery the DHT has performed before actually launching it? I'm thinking it would, since the buckets would contain more entries beforehand. Looking at some of the classes it seems measuring stuff has been in your scope; any tips?
I gave some thought to what you mentioned regarding a CLI. I'm thinking that pushing the Launcher you have into an embedded Jetty with a few controllers to interact with the DHT could make it really easy.
I got it to work. It seems m2e/mvn were having problems with the encoding of some of the hardcoded non-ASCII strings in the bencode.Utils class!
Hum, added a UTF-8 configuration to Maven; I hope that solves it.
So in the callFinish override for the PeerLookupTask I'm getting the DBItems which I assumed earlier today are the actual peers that are sharing the torrent corresponding to my infohash.
You should use TaskListener to await completion and then use getReturnedItems().
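The blocking-until-finished pattern can be sketched with a latch. This is pure Java; in mldht terms the commented calls use assumed method names (addListener/addTask are not verified), while TaskListener and getReturnedItems() are named above:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Register a finished-callback that releases a latch, enqueue the task,
// then block on the latch until the lookup completes (or times out).
public class LookupAwait {
    public static boolean runAndAwait(Runnable startTask, CountDownLatch done,
                                      long timeoutSec) throws InterruptedException {
        // In mldht terms (assumed names), startTask would do:
        //   task.addListener(t -> done.countDown()); // register FIRST
        //   dht.getTaskManager().addTask(task);      // THEN enqueue
        //   // afterwards, results are in task.getReturnedItems()
        startTask.run();
        return done.await(timeoutSec, TimeUnit.SECONDS);
    }
}
```

Registering the listener before enqueuing matters: a fast lookup could finish before a late-registered listener is attached, and the latch would never be released.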
Does the efficiency of my PeerLookupTask increase with the amount of previous discovery that the DHT has performed before actually launching them?
The redis stuff is not relevant; that's export-only for statistics gathering. You can turn that component off if you don't need it.
Uptime and activity stabilizes the routing table and RTT estimates, which makes lookups more efficient. Recent activity also populates a cache that may speed up lookups in some cases.
Looking at some of the classes it seems measuring stuff has been in your scope, any tip?
The PeerLookupTask has various tuning knobs to optimize it for speed or for reduced traffic. I don't know what you intend to do, so all the guidance I can provide is what I've said before: ideally they should be issued from an already-running instance. Starting the DHT takes considerable warmup time (other implementations might be more tuned for fast startup, but that's not one of my goals).
If everything works well it should only take seconds to complete a lookup task.
I'm thinking that pushing the Launcher you have into an embedded jetty with a few controllers to interact with the DHT could do it really easy.
That certainly would be possible, but a webserver seems a bit heavyweight just for issuing a few commands. I think I'll roll my own.
By the way, you can implement your own Component and add it to the config.xml as a <component> tag; that should make interacting with the launcher easier.
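A config entry for that might look like the fragment below. Only the `<component>` tag itself is from the comment above; the class name is a hypothetical placeholder and the exact schema is unverified:

```xml
<!-- illustrative only: <component> is from the comment above,
     the class name is a made-up placeholder -->
<component>my.app.MonitoringComponent</component>
```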
In the background I still have the DHT daemons running which I think I understand a bit, at least the zsets on redis are pretty clear.
Daemons? Plural? There should only be one DHT process running at a time, otherwise they will interfere with each other.
Sorry, I was thinking about the two DHT instances (v4/v6) launched in parallel.
To be honest, until you pointed me to your implementation I was looking to build a continuously running thread that would monitor a particular torrent so that I could eventually understand the torrent lifecycle. I'm not sure how I can adapt the task concept to the monitoring idea. I guess launching tasks constantly, one after the other, could theoretically work, although I would be revisiting the same nodes/peers every run.
In terms of hardware I have virtually anything I could need.
I was looking to build a continuously running thread that would monitor a particular torrent
Monitoring the DHT is a fairly fuzzy approach for that purpose. It might be better to simply join the swarm with bittorrent connections and monitor PEX instead; it should provide a more accurate view.
although I would be revisiting the same nodes/peers every run.
You would have to do that with the DHT anyway; it's request-response based after all, with no continuously open connections that keep you updated. You should also keep in mind that nodes might rate-limit you if you hammer them with requests.
And of course you're not required to use tasks. mldht also provides all lower-level concepts of the DHT if you wish to roll your own.
I'm working on one of these components for my app and so far it's going great. I tried, however, setting it all up on AWS.
When launching I bump in the exception log with:
[2015-04-12T20:40:37.456Z][Error] java.io.IOException: /fe80:0:0:0:cc9:84ff:fe7e:4c9%eth0 -> dht.transmissionbt.com/2001:41d0:c:5ac:2:0:0:1:6881
	at lbms.plugins.mldht.kad.RPCServer$SocketHandler.writeEvent(RPCServer.java:603)
	at lbms.plugins.mldht.kad.RPCServer.fillPipe(RPCServer.java:415)
	at lbms.plugins.mldht.kad.RPCServer.dispatchCall(RPCServer.java:427)
	at lbms.plugins.mldht.kad.RPCServer.doCall(RPCServer.java:185)
	at lbms.plugins.mldht.kad.tasks.Task.lambda$rpcCall$42(Task.java:236)
	at lbms.plugins.mldht.kad.tasks.Task$$Lambda$55/204653054.run(Unknown Source)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketException: Network is unreachable
	at sun.nio.ch.DatagramChannelImpl.send0(Native Method)
	at sun.nio.ch.DatagramChannelImpl.sendFromNativeBuffer(DatagramChannelImpl.java:536)
	at sun.nio.ch.DatagramChannelImpl.send(DatagramChannelImpl.java:513)
	at sun.nio.ch.DatagramChannelImpl.send(DatagramChannelImpl.java:477)
	at lbms.plugins.mldht.kad.RPCServer$SocketHandler.writeEvent(RPCServer.java:578)
	... 12 more
And then the RPCServers never come up. The instance has one public IP but it's NATed through the private IP:
$ ifconfig
eth0      Link encap:Ethernet  HWaddr 0e:c9:84:7e:04:c9
          inet addr:10.0.3.15  Bcast:10.0.3.127  Mask:255.255.255.128
          inet6 addr: fe80::cc9:84ff:fe7e:4c9/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9001  Metric:1
          RX packets:380511 errors:0 dropped:0 overruns:0 frame:0
          TX packets:233027 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:474953668 (474.9 MB)  TX bytes:22947638 (22.9 MB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:60 errors:0 dropped:0 overruns:0 frame:0
          TX packets:60 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:3264 (3.2 KB)  TX bytes:3264 (3.2 KB)
Have you tried something like this?
Thanks.
Looks like
Of course... Thanks again!
Hey, following our exchange on SO.
I followed your advice and tried to queue PeerLookupTasks. So in the launcher class I went for something like this, right before entering the empty synchronized loop:
After hitting several NPEs on the last line of that loop, I debugged and found that dht.createPeerLookup(bytes) was coming back null since it can't find any randomActive server.
Any thoughts on what I'm doing wrong?
Thanks.