Use libp2p for p2p communication

Overview

Libp2p is a modular stack for writing p2p applications. The lowest level is composed from raw connections, security extensions and multiplexing. On top of that multiple protocols, that are commonly used in p2p applications, were developed.

Gossipsub is a broadcast protocol, that is using combination of eager push/lazy pull model and supports security extensions. For short introduction consider watching video. Or read research article.

Kademlia DHT is used in IPFS for routing, and in multiple projects as a discovery method.

Several extensions that natively support NAT traversal that go beyond port mapping already exist and more are planned in the near future.

Goals and motivation

use scalable and optimized protocol

libp2p and gossipsub in particular is used in polkadot and filecoin mainnets, multiple eth2 testnets, ipfs, ethereum swarm and other decentralized software projects.

in a technical evaluation report you can learn more about how it deals with adversarial behaviour, while also keeping the latency low.

lower maintenance cost and lower barrier for new spacemesh clients

high quality implementations are available in rust, js, nim and go. java and python implementation are being developed. beside implementations the protocol is well documented and specced. the developments in other languages are supported by protocol labs.

for existing client switching to go-libp2p reduces technical debt and lowers maintenace effort.

also libp2p is nicely modularized. replacing parts of the protocol with different implementations is trivial if it will be necessary. e.g replacing low level transport from tcp to quic, replacing one broadcast implementation with another, or running different broadcast protocols for different topics.

improve decentalization due to the lower bandwidth requirements

gossipsub is parametrized to support latency/bandwidth tuning. Note that in exteme case, it is possible to eliminate almost all duplicates by disabling eager push.

in the appendix of the research article you can see comparison for eth1, bitcoin and gossipsub bandwidth cost using simulations. Note that gossipsub is using high degree for eager push mesh connectivity D=12, which is excessive based on recomendations.

Proposed implementation

This section goes over the protocols that will be implemented on top of libp2p stack. Most of them exist currently.

syncfetcher/1.0.0

Each request will be a separate libp2p stream. To simplify integration with go-spacemesh exising interface with callbacks will be provided.

Later can be refactored into multiple distinct protocols, e.g fetch_block, fetch_atx and use default stream dispatch without including type of the request in the payload.

Implementation

To work with direct messages libp2p provides a stream that implements io.ReadWriter interface with an extension to set a read/write deadlines. The stream is multiplexed over a secure connection and is cheap to create.

Each request will be encoded using ssz and prefixed with uvarint encoding. So the whole body for the request is uvarint(request_lth) || request_bytes.

For (u)varint encoding we are using LEB128 that is also implemented in go std library.

timesync/1.0.0

Protocol to exchange local timestamps. Will use the same interface created for syncfetcher/1.0.0.

discovery/1.0.0

Replacement for current discovery. Will use Kademlia DHT.

In general we will keep the existing logic for the bootsrap and discovery. At the start node connects with bootstrap nodes. From them it learns about new peers and tries to connect with targeted outbound number of peers. Once outbound target is reached connections with bootstrap nodes will be closed.

Kademlia bootstrap process follows a common practice to generate a random node id and search for it, peers that are encounted during the search are inserted into the local table.

handshake/1.0.0

New protocol.

Application specific handshake. Executed after connection was established with a peer through discovery or direct method. We will validate that the peer is from the same network, supports required protocols, exchange any other useful information.

Implementation

Executed after libp2p emits event EvtPeerIdentificationCompleted. Once event is received every party initializes a p2p stream. Encodes a Request message using ssz, and writes uvarint(handshake_lth) || handshake_bytes to the stream. Protocols are already exchanged once identification event was received and we can validate as a part of the handshake validation. In existing implementation we consider syncfetcher, timesync and gossipsub as mandatory and they should be implemented by every node in the network.

Other side receives data from stream and responds with serialized Response. Response is non-optional only if handshake is succesfull. If handshake is not succesful connection must be closed, with optionally writing a reason for closing a connection.

After handshake is completed implementation must emit event EvtSpacemeshPeerAdded with new peer. Peers tracker will listen for this event.

type Request struct {
    Network uint32
    GenesisTime uint64
    // TODO (other fields to check)
}

type Response struct {
    Error string
}

spacegossip/1.0.0

spacegossip is a broadcast protocol for topics:

blocks
transactions
atxs
poet proofs
hare messages
tortoise beacon and weakcoin messages

Internally it will use gossipsub v1.1.

Validation

Gossipsub supports validation before broadcasting message. Existing interface will be changed only slightly. Validation needs to be executed as a separate function, before processing the message. It will require minor refactoring in the application.

Validation functions can be chained, so what we do currently with a check that the node is synced will be naturally supported using chained validation.

Priority-based delivery

Priority-based delivery is not supported in gossipsub. It won't be very important due to:

lower bandwidth requirements (3-5x if eager push mesh connectivity (D) is inbetween 8-4)
and fine grained control over traffic in the network using scoring function and rate limiting

If we notice that our bandwidth requirement is too high for smeshers that are running in home environments we still can do several things:

set D to a lower value, it will increase latency but decrease bandwidth.
time box delivery of the specific messages. during the window certain message type is supposed to be broadcasted. it will be ignored unless it was broadcasted during specific window. One example is about blocks and hare messages, 30s after layer start (hare wakeup parameter) we can accept only blocks, next window is hare messages and everything else can be broadcasted after that.
priority based delivery in the gossipsub. it is not a complex change, but will take time if we will want to do it properly and upstream it. We will need to change this part and eager push to order messages accordingly to the topic priorities.

Rate limiting and peers blacklisting

Gossipsub relies on scoring function to keep quality of peers high in the mesh. Scoring function in gossipsub v1.1 is composed from different factors:

peer time in mesh (longer connections are prioritized)
first message delivery counter
delivery rate (for example 50 blocks per 200s/layer)
message validity
ip address collocation factor (negative factor if multiple peers are using same ip)
application specific factors

Not all factors must be enabled, in fact all can disabled. But we will make use of them for https://github.com/spacemeshos/SMIPS/issues/56.

Secure connection

Libp2p supports several protocols for securing connection, two that are recommended:

noise or on website
mTLS

We can support both with priority towards noise and leave an option for a client to use whichever, or use only noise for simplicity.

Implementation plan

Replacement

The goal of this part is to replace current p2p module using tools and methods discussed above. This is mostly about writing integration and initialization code for libp2p following best practice, sub-1000 LOC with tests.

5-8 days, most of this time to test it on a live network with destructive experiments.

Instrumentation

Metrics to continuously analyze per topic broadcast requirements, performance and general bandwidth requirements.

We already have several metrics to track peer count and traffic. Libp2p interface provides support for plugable tracers that will be used to collect information for observability. 2-3 days.

Safety

Integration of the scoring function and peers blacklisting. Can be done as a part of blacklisting SMIP.

Will require correct estimation of topic weight and message delivery rates. 2-3 days.

Questions

Dependencies and interactions

Simplifies work for https://github.com/spacemeshos/SMIPS/issues/56. Might have some interactions with https://github.com/spacemeshos/SMIPS/issues/60.

Stakeholders

Testing and performance

Testing will be done as a part of initial implementation. For destructive experiments chaos mesh tooling will be used. It can be deployed on any k8s cluster with one line.

Also note that libp2p is a stable software and during the integration there will be no problems, that weren't solved before.

Links

@y0sher @moshababo @lrettig

Everything seems good, this was out plan initially but it didn't went well and we created our own impl, I had this idea once I realized libp2p had matured (it wasn't when we started..) but we didn't allocate time for it.

some points that might be a problem:

@iddo333 eagerly resists using Kademlia DHT discovery ( I think you can use other discovery method with libp2p but none is implemented)
I'm still concerned with Priority-based delivery this might be critical for us, not sure about time-boxed messages as clocks might be lightly drifted.
Validation and blacklisting based on application-specific reason are a must for us and we should make sure it is supported.

@iddo333 eagerly resists using Kademlia DHT discovery ( I think you can use other discovery method with libp2p but none is implemented)

what were the reasons? anything that can find connections can be used really. but afaik dht based discovery is running in mature networks (e.g. ethereum discovery v4, discovery v5 that will be used in eth2 are also based on kademlia), filecoin and ipfs obviously.

in the essence, we still requesting peers like it is done in spacemesh or bitcoin, but we are requesting specifically peers close to some random address. would love to know more details about what is wrong with using dht for discovery.

I'm still concerned with Priority-based delivery this might be critical for us, not sure about time-boxed messages as clocks might be lightly drifted.

it can be critical only if protocol will require higher bandwidth than will be available to the common client (maybe even only for the clients that are running from home network?), right? so let's say if we lower the protocol requirement from 3 MB/s to 1MB/s it should do more for network health. and with rate-limiting we can target specific bandwidth limit. I suspect it will be mostly about supported tx rate.

besides that in current implementation prioritization happens before message is written to the large queue (250 msgs if i remember right), so if we have congestion there is no strong guarantee on the specific message order.

as for the timing concerns, we do rely on them everywhere. e.g. if tortoise beacon messages are not delivered on time they are useless, i suspect the same is true for hare. so the window can account for light clock difference, e.g. instead of 30s we can start window earlier by 5s and close it later by 5s, making it 40s in total. but i don't think that it will even come to this based on the data from other networks.

Validation and blacklisting based on application-specific reason are a must for us and we should make sure it is supported.

This is how blocks are validated in lotus . Some things will happen automatically based on the response in Validate , e.g. Ignore for late blocks, Reject for the invalid block. And some things will have to be done manually, like blacklisting IP address for example.

I don't have any blanket objections to using a DHT, but I'd like to hear if there are specific vulnerabilities @iddo333 is worried about.

More generally, the peer discovery algorithm needs to be very carefully vetted, as it has security implications. Our goal is to get a secure bootstrap assuming one honest bootstrap peer. However, I can think of some plausible implementations of DHT bootstrap that fit the description above, but would allow an adversary to potentially eclipse a target node even if all its bootstrap peers are honest.

For example, if the search in the DHT reveals the "random address" that's being sought, it might allow an adversary to quickly "pack" the DHT around that address, and thus eclipse (or nearly eclipse the newly joining node). Whether or not this is possible really depends on the details of the DHT implementation (and I don't know enough about the Kademlia bootstrap process to say if it applies there).

But my point isn't that I'm worried about this particular problem, it's that seemingly innocuous details (from an operational standpoint) can make a big difference from a security standpoint.

The bootstrap process in this dht works as follows: node generates a random peer ID and runs a FIND_NODE query. The query is initialized with K peers from local routing table that are closest to the random peer ID (at the start those are trusted bootstrap nodes from the config). From K peers it selects subset limited by the concurrency factor and sends FIND_NODE(RANDOM_PEER_ID) query to each one concurrently. Each query returns up to K peers from their own local table, which will be used in the following iterations of this run. Single process loops until responses from K closest peers were received, or until no more progress can be made. The process may do multiple queries with distinct random peer IDs. And the process can be repeated either periodically or on demand.

If one of the bootstrap nodes is malicious it may pack identities close to queried peer ID which will definitely affect the discovery process. Maybe there are some details in this dht implementation that i missed, but i definitely see how it can be a concern.

I will do a bit more reading cause the implementation follows some of the security extensions from S/Kademlia: A practicable approach towards secure key-based routing, but in general we don't have to use it and can adapt our current peer exchange protocol to a different transport without too much effort.

If we notice that our bandwidth requirement is too high for smeshers that are running in home environments we still can do several things:

set D to a lower value, it will increase latency but decrease bandwidth.

Can D be set differently per-node or is this a universal parameter?

Secure connection

Libp2p supports several protocols for securing connection, two that are recommended:

Why is this important for us, at this stage?

I'm still concerned with Priority-based delivery this might be critical for us, not sure about time-boxed messages as clocks might be lightly drifted.

it can be critical only if protocol will require higher bandwidth than will be available to the common client (maybe even only for the clients that are running from home network?)... in current implementation prioritization happens before message is written to the large queue (250 msgs if i remember right), so if we have congestion there is no strong guarantee on the specific message order.

To some extent we can mitigate this at the protocol layer but ensuring that our clients prioritize sending the most important messages (e.g., publishing new blocks, and hare votes).

I agree with @y0sher, this is potentially a very serious and important feature - something very similar to this is what caused the recent Solana crash.

spacemeshos / SMIPS