waku-org / research

Waku Protocol Research

Decentralized Bootstrapping #69

Open SionoiS opened 7 months ago

SionoiS commented 7 months ago

Bootstrapping

Who do I connect to if I don't know anyone already in the network?

IPFS (Amino DHT)

A hard-coded list of nodes in multiaddress format, with DNS used to resolve the addresses.

Ethereum (DNSDISC)

Instead of a hard-coded list, build a Merkle tree of all the node records (ENRs), sign it, then use DNS to make the list available.

For the technical details see here.
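To make the Merkle tree idea concrete, here is a minimal sketch of building and walking such a tree. It is a simplification, not the spec: SHA-256 stands in for the hash the real scheme uses, the root signature is left out, a plain dict stands in for DNS TXT records, and the names `entry_name`, `build_tree` and `walk` are made up for illustration.

```python
import base64
import hashlib


def entry_name(text: str) -> str:
    """Name an entry by a truncated hash of its content (SHA-256 is a stand-in)."""
    digest = hashlib.sha256(text.encode()).digest()[:16]
    return base64.b32encode(digest).decode().rstrip("=")


def build_tree(enrs: list[str], fanout: int = 4) -> tuple[str, dict[str, str]]:
    """Return (root_name, records); records maps entry name -> record text."""
    if not enrs:
        raise ValueError("need at least one ENR")
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    records: dict[str, str] = {}
    level = []
    for enr in enrs:
        name = entry_name(enr)
        records[name] = enr
        level.append(name)
    # Group names into branch entries until a single root remains.
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), fanout):
            branch = "enrtree-branch:" + ",".join(level[i:i + fanout])
            name = entry_name(branch)
            records[name] = branch
            parents.append(name)
        level = parents
    return level[0], records


def walk(root: str, records: dict[str, str]) -> list[str]:
    """Collect all leaves from the root, checking every entry against its name."""
    found, stack = [], [root]
    while stack:
        name = stack.pop()
        text = records[name]
        if entry_name(text) != name:
            raise ValueError(f"entry {name} does not match its hash")
        if text.startswith("enrtree-branch:"):
            stack.extend(text[len("enrtree-branch:"):].split(","))
        else:
            found.append(text)
    return found
```

Because every entry is named after the hash of its content, a client only needs to trust the (signed) root: any record swapped out along the way fails the hash check in `walk`.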

Attack vectors

These attacks are not really common, but they are possible nonetheless.

Denial of Service

Make the bootstrapping nodes inaccessible so that the network can't grow.

This is a simple and effective attack on small or unprepared networks. Fortunately, mitigation is also easy to implement: have nodes remember other nodes they have connected to. This reduces the reliance on bootstrap nodes to brand-new nodes that have never connected before. Furthermore, making it as user-friendly as possible to connect to any node can help. Using node addresses that are readable, shareable, error-correcting and usable on their own is advised.
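A minimal sketch of the "remember peers" mitigation, assuming a JSON file as the store. The file layout, the 200-entry limit and the function names are all assumptions made for illustration, not an existing Waku API.

```python
import json
from pathlib import Path


def load_known_peers(path: Path) -> list[str]:
    """Return remembered peer addresses, most recently seen first."""
    if not path.exists():
        return []
    return json.loads(path.read_text())


def remember_peer(path: Path, addr: str, limit: int = 200) -> None:
    """Record a successful connection, keeping at most `limit` addresses."""
    peers = [p for p in load_known_peers(path) if p != addr]
    peers.insert(0, addr)
    path.write_text(json.dumps(peers[:limit]))


def dial_candidates(path: Path, bootstrap: list[str]) -> list[str]:
    """Try remembered peers first; bootstrap nodes are only the fallback."""
    known = load_known_peers(path)
    return known + [b for b in bootstrap if b not in known]
```

With this ordering, a node that has connected before never needs the bootstrap nodes unless every remembered peer is unreachable.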

Sybil Attack

Connecting to malicious peers can allow an attacker to gain power over your own nodes or gather more information than intended.

This is a bit trickier to defend against. The question becomes: do you trust the bootstrapping nodes you're using? Should you use a web of trust or a federation? The maintainers of the software could just use a centralized system, but that system becomes an attack vector in itself, which brings me to decentralized bootstrapping.

Decentralized Bootstrapping

Decentralized bootstrapping is still an unsolved problem in p2p networks. Most networks use much the same system: maintain a list of nodes to bootstrap from (and run those nodes as well). Can we build a practical, trustless and decentralized system? No! But we can design a good enough system, inspired by what is currently in use.

The Ethereum DNSDISC spec already hints at what's possible. Aggregating multiple bootstrapping lists reduces the reliance on a central list. Multiple aggregators can each keep their own view of which boot nodes should be trusted. By combining the two we create a resilient system. There is also a social aspect, but it's out of scope.
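One way to picture the aggregation is a weighted vote, sketched below. Each publisher's list counts with the weight the local aggregator assigns to that publisher, and a node is accepted once enough trusted lists vouch for it. The trust weights, the threshold and the `aggregate` function are assumptions for illustration; nothing here is prescribed by DNSDISC.

```python
from collections import defaultdict


def aggregate(
    lists: dict[str, set[str]],
    trust: dict[str, float],
    threshold: float,
) -> list[str]:
    """Merge bootstrap lists, keeping nodes whose summed publisher trust meets the threshold."""
    score: dict[str, float] = defaultdict(float)
    for publisher, nodes in lists.items():
        weight = trust.get(publisher, 0.0)  # unknown publishers carry no weight
        for node in nodes:
            score[node] += weight
    accepted = [node for node, total in score.items() if total >= threshold]
    # Highest score first; ties broken by name for a stable order.
    return sorted(accepted, key=lambda node: (-score[node], node))
```

Two aggregators running the same code with different `trust` tables end up with different views of the network, which is the point: no single list decides who is trusted.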

The Plan

How do we build this?

Phase 1

It all starts with each node having a DB of peer information, so it can make an informed decision on who to connect to first. The minimum to store is the multiaddresses, but adding latency and "uptime" is even better. A synthetic "reliability" metric could be devised but is not necessary. The default should be to store all peer data. Boot nodes should also reject repeated calls from the same IP address.
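A rough sketch of what Phase 1 could look like, under a few assumptions: "uptime" is approximated as the fraction of successful dial attempts, peers are ranked by uptime and then latency, and boot nodes use a sliding-window limit per IP. `PeerRecord`, `rank` and `IpRateLimiter` are hypothetical names, not existing Waku types.

```python
from dataclasses import dataclass


@dataclass
class PeerRecord:
    multiaddr: str
    latency_ms: float
    successes: int = 0
    attempts: int = 0

    @property
    def uptime(self) -> float:
        """Fraction of dial attempts that succeeded (0.0 if never tried)."""
        return self.successes / self.attempts if self.attempts else 0.0


def rank(peers: list[PeerRecord]) -> list[PeerRecord]:
    """Order peers by highest uptime, then lowest latency."""
    return sorted(peers, key=lambda p: (-p.uptime, p.latency_ms))


class IpRateLimiter:
    """Allow at most `max_calls` per IP within a sliding window of `window_s` seconds."""

    def __init__(self, max_calls: int, window_s: float) -> None:
        self.max_calls = max_calls
        self.window_s = window_s
        self._calls: dict[str, list[float]] = {}

    def allow(self, ip: str, now: float) -> bool:
        # Drop timestamps that have fallen out of the window.
        recent = [t for t in self._calls.get(ip, []) if now - t < self.window_s]
        allowed = len(recent) < self.max_calls
        if allowed:
            recent.append(now)
        self._calls[ip] = recent
        return allowed
```

The caller passes `now` explicitly (for example from `time.monotonic()`), which keeps the limiter easy to test. A real boot node would also need to evict idle IPs so the table does not grow without bound.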

Phase 2

Any system that allows sharing peer information can work, but ensuring the integrity and authenticity of each node list is important. Each list creator should sign their list and use the same identity over time to facilitate social reputation building. Building tools for list creation and management would make the process safe and easy to use.

In the short term, using the already designed DNSDISC is advised but we should start using link subtrees as soon as we hear of another list created by the community.

We should revisit this issue if we stop using Discv5.

jm-clius commented 6 months ago

Thanks for this. I think this should be a prioritised focus for Waku Research in 2024. If we can at least create and open-source something like a crawler-publisher that crawls the DHT, creates and updates Merkle tree node lists, signs the list and publishes it to a domain, we'll be able to: 1) extend our own published bootstrap node list with community nodes, and 2) allow multiple "competing" (or alternative) lists to be published quite easily.

SionoiS commented 6 months ago
  1. extend our own published bootstrap node list with community nodes

Is it wise to add our "approval" to random nodes found in the DHT? I feel like the list should be highly curated.

  2. allow multiple "competing" (or alternative) lists to be published quite easily

:100: a must for decentralization.

SionoiS commented 5 months ago

I just came across DGAs (domain generation algorithms), might be useful.