DNS-based Discovery: Merkle tree implementation

jm-clius commented 3 years ago

Subtask of #452. Created as partial fulfilment of #552.

Problem

vacp2p/rfc#385 proposes a method to discover a bootstrap list of Waku v2 peers via DNS.

The peer list is encoded as a Merkle tree. EIP-1459 specifies the URL scheme to refer to such a DNS node list. The proposal uses the same approach, but with a matree scheme.
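As a rough illustration of the URL scheme, the sketch below parses a tree URL of the EIP-1459 form `<scheme>://<base32 public key>@<domain>`. It assumes a `matree` URL has the same layout as `enrtree` (only the scheme name differs), which is an inference from "the same approach" above. `parse_tree_url`, `TreeURL`, the dummy key and `nodes.example.org` are all hypothetical names chosen for the example.

```python
import base64
from typing import NamedTuple, Tuple


class TreeURL(NamedTuple):
    scheme: str
    public_key: bytes  # compressed secp256k1 key, 33 bytes
    domain: str


def parse_tree_url(url: str, allowed: Tuple[str, ...] = ("enrtree", "matree")) -> TreeURL:
    """Split '<scheme>://<base32 key>@<domain>' into its parts."""
    scheme, sep, rest = url.partition("://")
    if not sep or scheme not in allowed:
        raise ValueError(f"unsupported scheme in {url!r}")
    key_text, at, domain = rest.partition("@")
    if not at or not key_text or not domain:
        raise ValueError(f"expected <key>@<domain> in {url!r}")
    # The key is written without base32 padding; restore it before decoding.
    padded = key_text + "=" * (-len(key_text) % 8)
    public_key = base64.b32decode(padded)
    if len(public_key) != 33:
        raise ValueError("expected a 33-byte compressed public key")
    return TreeURL(scheme, public_key, domain)


# Dummy key bytes for illustration only (not a real public key).
fake_key = bytes([0x02]) + bytes(range(32))
key_text = base64.b32encode(fake_key).decode("ascii").rstrip("=")
parsed = parse_tree_url(f"matree://{key_text}@nodes.example.org")
print(parsed.scheme, parsed.domain, len(parsed.public_key))
```

The parser only checks the shape of the URL. It does not check that the key is a valid curve point, and it performs no DNS lookups.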

The Merkle tree implementation forms the core of the DNS-based discovery mechanism. It must be able to create and access the Merkle tree encoded as above, including:

- parsing each entry type
- traversing a tree and retrieving all leaf nodes
- validating hashes
- etc.
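The operations listed above could be sketched roughly as follows, using the EIP-1459 text formats (`enrtree-root:v1 ...`, `enrtree-branch:...`, `enr:...`, `enrtree://...`). Several things here are assumptions made for the example: an in-memory dict stands in for DNS TXT lookups, the leaf records are placeholders, root signature verification is skipped, and the hash is truncated to 16 bytes. Note in particular that `standin_hash` uses SHA3-256 from the standard library, which is NOT the keccak256 that EIP-1459 requires; a real implementation would plug a keccak256 function into `hash_fn`.

```python
import base64
import hashlib
from typing import Callable, Dict, List, Tuple

HashFn = Callable[[bytes], bytes]


def standin_hash(data: bytes) -> bytes:
    # Stand-in only: SHA3-256 differs from keccak256, which the EIP specifies.
    return hashlib.sha3_256(data).digest()


def subdomain_for(text: str, hash_fn: HashFn = standin_hash, nbytes: int = 16) -> str:
    """Subdomain label of an entry: base32 of its truncated content hash."""
    digest = hash_fn(text.encode("ascii"))[:nbytes]
    return base64.b32encode(digest).decode("ascii").rstrip("=")


def parse_entry(text: str) -> Tuple[str, object]:
    """Classify one TXT record as root, branch, link or leaf."""
    if text.startswith("enrtree-root:v1 "):
        return "root", dict(kv.split("=", 1) for kv in text.split()[1:])
    if text.startswith("enrtree-branch:"):
        children = [h for h in text[len("enrtree-branch:"):].split(",") if h]
        return "branch", children
    if text.startswith("enrtree://"):
        return "link", text
    if text.startswith("enr:"):
        return "leaf", text
    raise ValueError(f"unknown entry type: {text!r}")


def collect_leaves(records: Dict[str, str], subdomain: str,
                   hash_fn: HashFn = standin_hash) -> List[str]:
    """Walk the subtree at `subdomain`, validating each entry's hash."""
    text = records[subdomain]
    if subdomain_for(text, hash_fn) != subdomain:
        raise ValueError(f"hash mismatch for {subdomain}")
    kind, value = parse_entry(text)
    if kind == "leaf":
        return [value]
    if kind == "branch":
        leaves: List[str] = []
        for child in value:
            leaves.extend(collect_leaves(records, child, hash_fn))
        return leaves
    return []  # links to other trees are not followed in this sketch


def publish(records: Dict[str, str], text: str) -> str:
    """Store an entry under its own hash, as a tree builder would."""
    name = subdomain_for(text)
    records[name] = text
    return name


records: Dict[str, str] = {}
leaf_hashes = [publish(records, f"enr:hypothetical-record-{i}") for i in range(3)]
enr_root = publish(records, "enrtree-branch:" + ",".join(leaf_hashes))
link_root = publish(records, "enrtree-branch:")  # empty link subtree
root_text = f"enrtree-root:v1 e={enr_root} l={link_root} seq=1 sig=unsigned-placeholder"
_, root = parse_entry(root_text)
leaves = collect_leaves(records, root["e"])
print(leaves)
```

Because every child is addressed by the hash of its own content, a modified record is detected when its subdomain no longer matches, so only the root entry needs a signature.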

arnetheduck commented 3 years ago

Why not go with ENR like the EIP does? It's already used in eth2 this way.

arnetheduck commented 3 years ago

Also, for inspiration, https://github.com/status-im/nimbus-eth2/blob/stable/beacon_chain/ssz/merkleization.nim might be interesting

jm-clius commented 3 years ago

> Why not go with ENR like the EIP does? It's already used in eth2 this way.

Yeah, you're right, we should probably have had a wider discussion about this. Initially it was because we narrowed the scope of the problem to something libp2p-specific, i.e. discovering a list of libp2p peer multiaddrs without reference to anything outside the libp2p definition space. Waku v2 currently also only uses multiaddr. This seems to have been discussed before, though it's unclear if any consensus was reached.

@oskarth @staheri14 any thoughts/opinions on using ENR rather than multiaddr for discovery?

jm-clius commented 3 years ago

> Also, for inspiration, https://github.com/status-im/nimbus-eth2/blob/stable/beacon_chain/ssz/merkleization.nim might be interesting

Thanks!

arnetheduck commented 3 years ago

Generally, you can turn an ENR into a multiaddr, but not the other way around (due to the mandatory signature). It's also used exclusively in discv5. I suspect life might be easier if we stick to the spec, or support both. In particular, ENR can signal capabilities, which multiaddr can't, AFAIR.
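To make the one-way nature of that conversion concrete, here is a minimal sketch that derives a multiaddr from the key/value pairs of an already-decoded ENR. It assumes the EIP-778 pre-defined keys `ip` (4 raw bytes) and `tcp` (big-endian port). RLP decoding and signature checking are out of scope, `enr_fields_to_multiaddr` is a hypothetical helper, and `example-capability` is a made-up key used only to show what gets lost.

```python
import ipaddress
from typing import Dict


def enr_fields_to_multiaddr(fields: Dict[str, bytes]) -> str:
    """Build '/ip4/<addr>/tcp/<port>' from decoded ENR key/value pairs."""
    ip = ipaddress.IPv4Address(fields["ip"])      # 4 packed bytes
    port = int.from_bytes(fields["tcp"], "big")   # big-endian integer
    return f"/ip4/{ip}/tcp/{port}"


# Decoded ENR content, for illustration only.
fields = {
    "id": b"v4",
    "ip": bytes([192, 0, 2, 1]),
    "tcp": (30303).to_bytes(2, "big"),
    "example-capability": b"\x01",  # hypothetical capability flag
}
addr = enr_fields_to_multiaddr(fields)
dropped = sorted(set(fields) - {"ip", "tcp"})
print(addr)     # /ip4/192.0.2.1/tcp/30303
print(dropped)  # ['example-capability', 'id']
```

The address survives the conversion, but the record's signature, sequence number and any extra keys do not, which is why the reverse direction cannot reconstruct a valid ENR from the multiaddr alone.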

dryajov commented 3 years ago

> Generally, you can turn an ENR into a multiaddr, but not the other way around (due to the mandatory signature). It's also used exclusively in discv5. I suspect life might be easier if we stick to the spec, or support both. In particular, ENR can signal capabilities, which multiaddr can't, AFAIR.

Multiaddrs are recursive, self-contained structures, so they can encapsulate almost any sort of information described by another multiformat. I don't remember if there is anything that ENR does better than multiaddr.

oskarth commented 3 years ago

I haven't looked into this in detail and don't have too strong opinions on it. One point though: ENR seems more Ethereum specific, whereas multiaddr seems more general. The latter seems to make more sense conceptually for Waku as it isn't (only) Ethereum specific.

That said, if there's a clear mapping and we can support both, then starting with ENR might make sense in the interest of code re-use and delivering something useful faster. Then expand/enable multiaddr as an option.

I don't see how capabilities are different in ENR vs multiaddr? From a cursory look they seem similar in terms of k,v pairs.

arnetheduck commented 3 years ago

https://consensys.net/diligence/blog/2020/09/libp2p-multiaddr-enode-enr/

I.e. ENR is signed by default, which has important security implications, and it has facilities for arbitrary key-values. This is why you can create a multiaddr from an ENR but not the other way around. A multiaddr is just an address: it doesn't say "this node supports the light client protocol", at least not in the way it's normally intended to be used, even though I'm sure it's possible to cram something like that into it.

The broader point here is that we already need ENR-based DNS lookup for other reasons. Once we have that, we can come up with yet another standard that I'm sure has lots of important advantages.

jm-clius commented 3 years ago

Thanks for your comments, everyone.

Implementation-wise there is minimal difference, so it makes sense to me to do "pure" EIP-1459 using ENR.

dryajov commented 3 years ago

I just realized that you're talking about ENRs which are conceptually different; a multiaddr is closer to an enode (e.g. enode://00029fb539bbbebc7bcc986bca2b1d3e262a1133901c3cc699f8dd9cba91df51ede5fed9c2c25b74425d64344a9a9d393904c6f0f8bd95cc0c5e2699b6a19ea1@47.94.142.31:53454).
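To show why an enode sits close to a multiaddr, the sketch below maps the address part of the enode URL above onto a multiaddr string. `enode_to_multiaddr` is a hypothetical helper. It assumes the port after the colon is the TCP listening port and ignores any `?discport=` query. Turning the 64-byte node key into a `/p2p/<peer-id>` component would need libp2p's peer ID derivation, which is not attempted here.

```python
import ipaddress


def enode_to_multiaddr(enode: str) -> str:
    """Map 'enode://<128 hex chars>@<ip>:<port>' to '/ip4|ip6/<ip>/tcp/<port>'."""
    prefix = "enode://"
    if not enode.startswith(prefix):
        raise ValueError("not an enode URL")
    node_id, at, endpoint = enode[len(prefix):].partition("@")
    if not at or len(node_id) != 128:
        raise ValueError("expected a 128-character hex node ID followed by '@'")
    bytes.fromhex(node_id)  # raises ValueError if the ID is not valid hex
    endpoint = endpoint.split("?", 1)[0]  # drop an optional ?discport=... query
    host, _, port_text = endpoint.rpartition(":")
    ip = ipaddress.ip_address(host.strip("[]"))
    proto = "ip4" if ip.version == 4 else "ip6"
    return f"/{proto}/{ip}/tcp/{int(port_text)}"


enode = (
    "enode://00029fb539bbbebc7bcc986bca2b1d3e262a1133901c3cc699f8dd9cba91df51"
    "ede5fed9c2c25b74425d64344a9a9d393904c6f0f8bd95cc0c5e2699b6a19ea1"
    "@47.94.142.31:53454"
)
print(enode_to_multiaddr(enode))  # /ip4/47.94.142.31/tcp/53454
```

Both forms carry only an identity and a network endpoint. Neither has a signature or a sequence number, which is what separates them from an ENR.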

Use https://github.com/libp2p/specs/blob/master/RFC/0003-routing-records.md if you want an ENR-like structure in the libp2p world. For the spec, you probably want to stick with ENR since that's what the spec mandates and I don't see any benefit in deviating from it.

waku-org / nwaku

DNS-based Discovery: Merkle tree implementation #616
