raulk opened 5 years ago
This will be a problem for IPNS, as the original publisher may be offline. cc @Stebalien
Note that IPNS does republish the name record every 4 hours:
But the way I see it, this is to deal with network churn, not because the DHT peers are actively pruning the record. Even if the IPNS record carries its expiration, all the DHT sees is a []byte. Maybe via a Validator it could check that a record is not expired, but this would happen only reactively: when processing a PUT_VALUE or a GET_VALUE request.
I want to make sure I understand the reason that Kademlia requires a node to republish its key/value pairs every hour. Is it so that as peers enter and leave the DHT the key/value is rebalanced onto the nodes closest to the key? If that's the case then it should make IPNS work better, because it will be more likely to find 16 copies of the key/value pair.
@dirkmc in addition to that, it helps keep storage footprint in check -- although by itself it's insufficient to deter DoS attacks exploiting that vector.
I want to make sure I understand the reason that Kademlia requires a node to republish its key/value pairs every hour.
The requirement comes from the fact that the network isn't exactly stable.
We should definitely be throwing away old records, we even make sure to record the time received so we can do this. The main issue is that we simply don't sweep the datastore.
@dirkmc if you're up for making a contribution, we can guide you through it ;-)
Sure, I can do so - I've been working on the Javascript DHT so I will do the same there
Should we also do the same for ADD_PROVIDER? It seems like it will suffer from the same issues if not republished.
Provider records are already collected after 24 hours, and changing that would be a difficult change to deploy.
Closing https://github.com/libp2p/go-libp2p-kad-dht/issues/354 in favor of this:
There are three parts to republishing:
See section 2.5 of https://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia-lncs.pdf.
@Stebalien Do we plan to implement this anytime soon?
I'd like to run a change like this against the real network for a while before merging it. Luckily, some of the test infrastructure we're working on will allow us to do that.
So yeah, go ahead! Just be warned that there may be a long road to merging it.
@Stebalien When you say test infrastructure, are you talking about the testlab network?
Yes.
@Stebalien @raulk
Please can you assign this to me?
There are three parts to republishing:
- The original publisher should republish periodically (4-12 hours)
So, after a successful PutValue call to the network, the DHT instance that is the original publisher will start a goroutine that keeps calling PutValue for that key every 4-12 hours? I see the paper mentions an interval of 24 hours or more. Why do we choose 4-12 hours?
- Records should expire every 8-24 hours.
The Kad paper simply states that they require an expiry interval of 24 hours for file-sharing records, but we ALREADY expire provider records after 24 hours. This issue is for the PutValue records & we currently have a MaxRecordAge of 36 hours. Why do we choose an interval of 8-24 hours?
- Nodes that have received a record should republish hourly.
See section 2.5 of https://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia-lncs.pdf.
If we choose the scan interval randomly (say between 60-75 minutes), the first peer that fires a republish resets the received time on the record on all other peers that are still among the K closest to the key. This should minimise the number of republish calls (this optimisation is mentioned in the paper).
[x] Expire non-provider records in the datastore that are older than 12 hours.
[x] Original publisher should republish every 6 hours.
[x] Nodes that have received a record should republish every 60-75 minutes.
[ ] Test with libp2p-testlab
I see the paper mentions an interval of 24 hours or more. Why do we choose 4-12 hours?
You're right, 24 hours should be fine. The 4-12 hours was necessary because we don't currently rebalance.
The Kad paper simply states that they require an expiry interval of 24 hours for file-sharing records, but we ALREADY expire provider records after 24 hours. This issue is for the PutValue records & we currently have a MaxRecordAge of 36 hours. Why do we choose an interval of 8-24 hours?
That's also a good point. 24 hours makes sense for file-sharing records (really, the client should choose, but we can handle that later), but IPNS records can last longer (36 hours is reasonable).
If we choose the scan interval randomly (say between 60-75 minutes), the first peer that fires a republish resets the received time on the record on all other peers that are still among the K closest to the key. This should minimise the number of republish calls (this optimisation is mentioned in the paper).
SGTM.
@bigs @Stebalien @raulk
This PR needs some love. Please take a look when you can :)
While implementing this feature, we've discovered a chain of dependencies. In reverse order:
https://bafz...ipns.io
@Stebalien Thanks for the summary. If I read it correctly, considering the current state of affairs (at least in go-ipfs), it seems that republishing is blocked by the lack of signatures for provider records. Is this correct?
Is there a current plan/spec/issue to implement signed provider records?
In reverse order, we've now finished steps 6-3. Unfortunately, the current go-ipfs team is tiny. Fortunately, this is now changing. Unfortunately, people currently moving back to the team are working towards other goals at the moment (@aarshkshah1992 will be working on NAT traversal, packet orientation, and connection establishment speed, for example).
We:
See discussion in https://discuss.libp2p.io/t/dht-hourly-key-value-republish/93/4.