Closed: atomashpolskiy closed this issue 7 years ago
The problem is that when I create a PeerLookupTask via lbms.plugins.mldht.kad.DHT#createPeerLookup, I'm receiving null as a result. I've done some debugging, and it seems that it's because there are no active RPC servers (lbms.plugins.mldht.kad.RPCServerManager#activeServers is empty).
createPeerLookup does a fallback to non-active servers if no active one can be found. So it sounds like there is no server at all, i.e. DHT initialization is not done yet.
You can use lbms.plugins.mldht.kad.DHT.addStatusListener(DHTStatusListener) to get notified when initialization is done. But maybe I should add a CompletionStage that gets resolved once an active server becomes available.
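A minimal sketch of the listener approach, bridging a status callback into a CompletableFuture. Status, StatusListener and FakeDht are hypothetical stand-ins so that the example runs on its own; the real types are DHT and DHTStatusListener, and their exact signatures should be checked against the library.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.TimeUnit;

public class StatusListenerSketch {

    // Hypothetical stand-in for the library's status values.
    enum Status { STOPPED, INITIALIZING, RUNNING }

    // Hypothetical stand-in for DHTStatusListener.
    interface StatusListener {
        void statusChanged(Status newStatus, Status oldStatus);
    }

    // Hypothetical stand-in for the DHT: it only tracks a status and notifies listeners.
    static class FakeDht {
        private final List<StatusListener> listeners = new CopyOnWriteArrayList<>();
        private volatile Status status = Status.STOPPED;

        void addStatusListener(StatusListener listener) {
            listeners.add(listener);
        }

        Status status() {
            return status;
        }

        void setStatus(Status next) {
            Status previous = status;
            status = next;
            for (StatusListener listener : listeners) {
                listener.statusChanged(next, previous);
            }
        }
    }

    // Returns a future that completes once the DHT reports RUNNING.
    static CompletableFuture<Void> whenRunning(FakeDht dht) {
        CompletableFuture<Void> ready = new CompletableFuture<>();
        dht.addStatusListener((newStatus, oldStatus) -> {
            if (newStatus == Status.RUNNING) {
                ready.complete(null);
            }
        });
        // Cover the case where the DHT was already running before we registered.
        if (dht.status() == Status.RUNNING) {
            ready.complete(null);
        }
        return ready;
    }

    public static void main(String[] args) throws Exception {
        FakeDht dht = new FakeDht();
        CompletableFuture<Void> ready = whenRunning(dht);
        dht.setStatus(Status.INITIALIZING);
        System.out.println("done after INITIALIZING: " + ready.isDone());
        dht.setStatus(Status.RUNNING);
        ready.get(1, TimeUnit.SECONDS);
        System.out.println("done after RUNNING: " + ready.isDone());
    }
}
```

Note that "initialization is done" and "an active server is available" are not the same event, as the rest of the thread shows, so a future like this only covers the first condition.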
Also it appears to me that the call to lbms.plugins.mldht.kad.DHT#resolveBootstrapAddresses should be omitted when router bootstrap is disabled in config
Good point. But if you disable router bootstrap you will have to seed the DHT in some other way, e.g. from peers sending the PORT message by calling lbms.plugins.mldht.kad.DHT.addDHTNode(String, int)
I'm pretty sure that the fallback is not used: https://github.com/the8472/mldht/blob/master/src/lbms/plugins/mldht/kad/DHT.java#L551
Yeah, feeding peers received from other sources into DHT is what I'm up to right now. Thanks for pointing out the API to use, this was going to be my next question :)
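For illustration, seeding from PORT messages could look roughly like the sketch below. The payload decoding follows BEP 5 (a two-byte UDP port in network byte order); NodeSink is a hypothetical stand-in for the DHT so that the example is self-contained, with only the addDHTNode(String, int) shape taken from the comment above.

```java
import java.util.ArrayList;
import java.util.List;

public class PortMessageSeeding {

    // Hypothetical stand-in for the part of the DHT API used here;
    // the thread names the real method as DHT.addDHTNode(String, int).
    interface NodeSink {
        void addDHTNode(String host, int port);
    }

    // Decodes the 2-byte payload of a BitTorrent PORT message (BEP 5):
    // an unsigned 16-bit UDP port in network byte order.
    static int decodePort(byte[] payload) {
        if (payload.length != 2) {
            throw new IllegalArgumentException("PORT payload must be 2 bytes, got " + payload.length);
        }
        return ((payload[0] & 0xFF) << 8) | (payload[1] & 0xFF);
    }

    // Feeds the peer's DHT endpoint into the DHT: the host comes from the
    // peer connection, the port from the PORT message. Port 0 is ignored.
    static void onPortMessage(String peerHost, byte[] payload, NodeSink dht) {
        int port = decodePort(payload);
        if (port != 0) {
            dht.addDHTNode(peerHost, port);
        }
    }

    public static void main(String[] args) {
        List<String> seeded = new ArrayList<>();
        NodeSink sink = (host, port) -> seeded.add(host + ":" + port);
        // 203.0.113.7 is a documentation address, used here as a placeholder peer.
        onPortMessage("203.0.113.7", new byte[] {(byte) 0x1A, (byte) 0xE1}, sink);
        System.out.println(seeded);
    }
}
```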
Ah right, the fallback is only used for maintenance tasks such as adding new nodes. I'll add a callback then to notify when an active server becomes available.
Just to be clear, I have a thread that periodically calls createPeerLookup, and each time it returns null. So I'm not sure that this is a premature calling problem; it rather seems like something broke or went out of sync completely during DHT startup... As I've said, this problem didn't appear until some of the bootstrap nodes went down and lbms.plugins.mldht.kad.DHT#resolveBootstrapAddresses began to take longer than usual to execute.
lbms.plugins.mldht.kad.DHT.printDiagnostics(PrintWriter) can provide a lot of diagnostic output which might help.
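One way to capture that output as a string, e.g. for pasting into an issue, is to wrap a StringWriter. This is only a sketch: DiagnosticsSource is a hypothetical functional interface that mirrors the printDiagnostics(PrintWriter) shape, so with the real library a method reference to the DHT instance's printDiagnostics would presumably be passed instead.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class DiagnosticsDump {

    // Hypothetical interface mirroring the printDiagnostics(PrintWriter) shape.
    interface DiagnosticsSource {
        void printDiagnostics(PrintWriter out);
    }

    // Runs the diagnostics printer against an in-memory buffer and returns the text.
    static String capture(DiagnosticsSource source) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        source.printDiagnostics(out);
        out.flush(); // make sure buffered output reaches the StringWriter
        return buffer.toString();
    }

    public static void main(String[] args) {
        // Placeholder source standing in for a real DHT instance.
        String dump = capture(out -> out.println("DHT Diagnostics (placeholder)"));
        System.out.print(dump);
    }
}
```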
Running with diagnostics, here's the result. Do you think there's any clue here?
==========================
DHT Diagnostics. Type IPV4_DHT
# of active servers / all servers: 0/1
-----------------------
Stats
Reachable node estimate: 2 (0.5)
DB Keys: 0
DB Items: 0
TX sum: 0 RX sum: 0
avg task time/avg 1st result time (ms): 10000/10000
Uptime: PT3M55.397Ss
RPC stats
### local RPCs
Method REQ | RSP Error Timeout
PING 0 | 0 0 0
FIND_NODE 0 | 0 0 0
GET_PEERS 0 | 0 0 0
ANNOUNCE_PEER 0 | 0 0 0
GET 0 | 0 0 0
PUT 0 | 0 0 0
SAMPLE_INFOHASHES 0 | 0 0 0
UNKNOWN 0 | 0 0 0
### remote RPCs
Method REQ | RSP Errors
PING 0 | 0 0
FIND_NODE 0 | 0 0
GET_PEERS 0 | 0 0
ANNOUNCE_PEER 0 | 0 0
GET 0 | 0 0
PUT 0 | 0 0
SAMPLE_INFOHASHES 0 | 0 0
UNKNOWN 0 | 0 0
-----------------------
Routing table
buckets: 1 / entries: 0
all num:0 rep:0 [Home]
-----------------------
RPC Servers
D1DBC867 01EA22A7 2202C67D FCC5AF84 13AD3EEF bind: /192.168.1.2 consensus: null
rx: 0 tx: 0 active: 0 baseRTT: 10000 loss: 0,500000 loss (verified): 0,500000 uptime: PT3M23.101S
RTT stats (0samples) mean:9975.0 median:9975.0 mode:9975.0 10tile:9975.0 90tile:9975.0
9950 |
100% |
-----------------------
Blacklist
{}
-----------------------
Lookup Cache
anchors (0):
buckets (1) / entries (0):
all entries: 0
-----------------------
Tasks
next id: 1
#### active:
#### queued:
DHT has not been seeded with contacts yet -> 0 routing table entries -> no traffic -> server can't be active
Yeah, right. This return statement is preventing DHT from bootstrapping normally when one of the router addresses can't be resolved: https://github.com/the8472/mldht/blob/master/src/lbms/plugins/mldht/kad/DHT.java#L966
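To make the failure mode concrete, here is a sketch of resolution that skips an unresolvable router instead of returning early. It is not the library's code: Resolver is a hypothetical abstraction that lets the example run without DNS, and the host names are placeholders.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BootstrapResolution {

    // Hypothetical resolver abstraction; production code could pass InetAddress::getByName.
    interface Resolver {
        InetAddress resolve(String host) throws UnknownHostException;
    }

    // Resolves every router that can be resolved and skips the ones that fail.
    static List<InetSocketAddress> resolveAll(List<InetSocketAddress> routers, Resolver resolver) {
        List<InetSocketAddress> resolved = new ArrayList<>();
        for (InetSocketAddress router : routers) {
            try {
                InetAddress address = resolver.resolve(router.getHostString());
                resolved.add(new InetSocketAddress(address, router.getPort()));
            } catch (UnknownHostException e) {
                // One dead router must not abort the whole bootstrap:
                // log and move on to the next candidate instead of returning.
                System.err.println("skipping unresolvable router: " + router.getHostString());
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        // Placeholder host names, not the library's actual bootstrap routers.
        List<InetSocketAddress> routers = Arrays.asList(
                InetSocketAddress.createUnresolved("router-down.example", 6881),
                InetSocketAddress.createUnresolved("router-up.example", 6881));

        // Fake resolver: the first router fails, the second maps to a documentation address.
        Resolver fake = host -> {
            if (host.startsWith("router-down")) {
                throw new UnknownHostException(host);
            }
            return InetAddress.getByAddress(host, new byte[] {(byte) 192, 0, 2, 1});
        };

        System.out.println(resolveAll(routers, fake));
    }
}
```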
Try with the current revision. You can use the CompletableFuture from the ServerManager to wait until one becomes available.
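A sketch of how waiting on such a future could replace a polling thread that keeps getting null back. FakeServerManager, awaitActiveServer and the string-based lookup are hypothetical stand-ins chosen for the example; the actual accessor on the ServerManager should be looked up in the current revision.

```java
import java.util.concurrent.CompletableFuture;

public class AwaitActiveServer {

    // Hypothetical stand-in combining the server manager and lookup creation.
    static class FakeServerManager {
        private final CompletableFuture<String> activeServer = new CompletableFuture<>();

        // Future that completes once a server has become active.
        CompletableFuture<String> awaitActiveServer() {
            return activeServer;
        }

        // Simulates the first active server appearing.
        void markActive(String serverId) {
            activeServer.complete(serverId);
        }

        // Mirrors the behaviour reported in the thread: null while no server is active.
        String createPeerLookup(String infoHash) {
            return activeServer.isDone() ? "lookup(" + infoHash + ")" : null;
        }
    }

    // Defers lookup creation until a server is active, so the caller never sees null.
    static CompletableFuture<String> lookupWhenReady(FakeServerManager manager, String infoHash) {
        return manager.awaitActiveServer()
                .thenApply(server -> manager.createPeerLookup(infoHash));
    }

    public static void main(String[] args) {
        FakeServerManager manager = new FakeServerManager();
        System.out.println("before: " + manager.createPeerLookup("abc"));

        CompletableFuture<String> lookup = lookupWhenReady(manager, "abc");
        lookup.thenAccept(task -> System.out.println("after: " + task));

        manager.markActive("server-1"); // triggers the deferred lookup
    }
}
```

In a real client it would probably make sense to attach a timeout to the future, since a DHT that was never seeded with contacts may not get an active server at all.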
Working perfectly now, thanks a lot!
You might also want to change the call to bootstrap in 'started' to be async btw
Hi @the8472 !
I'm in the process of integrating mldht into https://github.com/atomashpolskiy/bt/tree/dht-experimental. So far it's been working great, I've been able to download a complete torrent using DHT exclusively (no trackers, no PEX). So in the first place I'd like to thank you for developing this!
Currently I'm facing a weird problem. Coincidentally it appears that one of the bootstrap nodes is down at the moment, so this might be a part of the problem.
The problem is that when I create a PeerLookupTask via lbms.plugins.mldht.kad.DHT#createPeerLookup, I'm receiving null as a result. I've done some debugging, and it seems that it's because there are no active RPC servers (lbms.plugins.mldht.kad.RPCServerManager#activeServers is empty). There is one server, created via AddressUtils#getDefaultRoute, but it's considered unreachable due to nothing being received and timeOfLastReceiveCountChange being 0.
I wonder if there might be some kind of a race condition in DHT/RPC startup, because DHT hangs for a while in lbms.plugins.mldht.kad.DHT#resolveBootstrapAddresses, resulting in the exception:
Also it appears to me that the call to lbms.plugins.mldht.kad.DHT#resolveBootstrapAddresses should be omitted when router bootstrap is disabled in config (of course unless it has some important side effects that I'm not aware of).
Thanks again and pardon my poor language! :)