keep-network / keep-core

The smart contracts and reference client behind the Keep network
https://keep.network
MIT License

Tweaks around libp2p pubsub seen messages cache #3773

Closed lukasz-zimnoch closed 8 months ago

lukasz-zimnoch commented 8 months ago

Refs: https://github.com/keep-network/keep-core/issues/3770
Depends on: https://github.com/keep-network/keep-core/pull/3771

Recent libp2p versions (we started to use them in https://github.com/keep-network/keep-core/pull/3771) introduced a way to set the seen messages cache TTL and strategy. Here we leverage those settings to reduce the excessive message flooding effect that sometimes occurs on mainnet. This pull request consists of two steps:

Use longer TTL for pubsub seen messages cache

Once a message is received and validated, pubsub re-broadcasts it to other peers and puts it into the seen messages cache. This way, subsequent arrivals of the same message are not re-broadcast unnecessarily. This mechanism is important for the network to avoid excessive message flooding. The default TTL used by libp2p is 2 minutes. However, Keep client messaging sessions are quite time-consuming, so we use a longer TTL of 5 minutes to reduce the flooding risk even further. It is worth noting that the TTL cannot be too long, as the cache may grow excessively and impact memory consumption.

Use LastSeen as seen messages cache strategy

By default, the libp2p seen messages cache uses the FirstSeen strategy, which expires an entry once TTL elapses from when it was added. This means that if a single message is being received frequently and consistently, pubsub will re-broadcast it once every TTL period instead of never re-broadcasting it.

In the context of the Keep client, which additionally uses app-level retransmissions, this often leads to strong message amplification in the broadcast channel and a significant increase in network load.

As the problem is quite common (see https://github.com/libp2p/go-libp2p-pubsub/issues/502), the libp2p team added a new LastSeen strategy which behaves differently. This strategy expires an entry once TTL elapses from the last time the message was touched by a cache write (Add) or read (Has) operation. That gives the desired behavior of never re-broadcasting a message that was already seen within the last TTL period. This reduces the risk of unintended over-amplification.