Closed Jorropo closed 10 months ago
cc @dennis-tra @guillaumemichel, @BigLep really wants us to have a test here, I don't have time to do it, could you please pick it up ?
I cannot take this up at the moment 👎🏻
@dennis-tra : are you able to take? My understanding is that https://github.com/libp2p/go-libp2p-kad-dht/pull/870 hit a regression. Bummer, but that's life. I like to hold to the standard that we at least add regression tests when we trigger breakage/bugs. (Or is this code that will all be going away in the next month with refactoring?)
@BigLep, @Jorropo and I have had discussions around #870 and we think differently about it. From my point of view it's certainly not a regression but instead puts a spotlight on the missing filters.
The increased pressure on the smart dialler is dwarfed by the speed-up of skipping a second DHT walk to find the addresses.
That being said, I'm afraid I'm not sure what a regression test should test if I don't see a regression here.
I could verify that the literal changes of the code in this PR do what they aim to do - filter addresses when storing/fetching/sending multiaddresses to/from the peerstore or other peers. However, that's only implicitly related to increased pressure on the smart dialler and I wouldn't call it a regression test. Terminology aside, I'm happy to add these tests!
However, this area of the code will and has changed in the v2 codebase. I've already ported the handlers to the v2-develop
branch in #864 where I also already test the address filtering.
One genuine note to the end, I really appreciate that @Jorropo you sharpen my eye for the implications of seemingly minor changes, as in #870.
@dennis-tra I don't understand what you mean, this is not reverting your change (#870), but applying address filters, right now your change cache private and public ips on the wan dht of kubo, instead only public ips will be cached.
@Jorropo I see you're not reverting #870 here and I also understand the significance of not caching private IPs.
Let's get this change in and if we get the chance to chat in sync about it, do it during one of the next co-los.
Terminology aside, I'm happy to add these tests!
The impact of the original change here is expected to be significant, i.e., avoiding a second DHT lookup (that's our hope and we've got all the measurements in place to monitor the result). So, I'd say let's add these tests (I agree they're not regression tests as this PR here doesn't fix a regression, but it's an optimisation on top of https://github.com/libp2p/go-libp2p-kad-dht/pull/870. I'd call them tests on the optimisation proposed by @Jorropo) and make sure this goes into the next libp2p release.
Thanks all. I'm not close to the details and not going to dive in more unless needed. I just want to make sure that we hold the line of making sure the any new code we write or touch has the proper test coverage. Basically, we aren't going to go on a campaign to fix the sins of the past, but I want to make sure our present and future selves does the right thing, and when we get tripped up by decisions of the past that we bring them up to our desired standard. As long as we're following that principle, I'm good. And if we're not following it, I want to make sure we're upfront and intentional about it (where this can mean something like "not adding a backfill test because this code will be replaced in one month and the proper assertions have been added to the new code in commit X").
added two tests where testing infrastructure was already present.
I think the serving part is missing. When we hand out addresses we should also filter them. When the peer connects to us to store provider records we'll store its addresses (private + public) in the peerstore because of the identify exchange. This means in the first 30min (the ttl for recently connected peers) we serve both. The same applies when we're connected to the peer.
Thank you @dennis-tra for the tests!
All calls to add peers to the peerstore go through maybeAddAddrs
The only call that isn't filtered is the following:
We cannot call maybeAddAddrs
because the ProviderManager
doesn't have a reference to the IpfsDHT
. Hence we would need to add a filter attribute to the ProviderManager
and make sure that the mentioned call to peerstore.AddAddrs
provides filtered addresses.
Edit:
The only caller to ProviderStore.AddProvider
filters the multiaddrs before the call so we should be good.
The other callers (IpfsDHT.Provide
and FullRT.Provide
) add content that is being provided by the host, and do not provide any addrs (because we don't want to add self to the peerstore).
@Jorropo can you have a last look before we merge this?
Thx a lot for doing this, LGTM I havn't reviewed the tests in detail, just the non test code. :heart: :pray:
This still does not do the fullrt client but it wasn't doing it before either.
Fixes smart dialler regression from #870.