waku-org / go-waku

Go implementation of Waku v2 protocol
https://waku.org/

Broadcaster blocked after publishing 1024 messages #185

Closed neekolas closed 2 years ago

neekolas commented 2 years ago

The problem

I've noticed some strange behaviour in the Broadcaster when publishing more than 1056 messages (over any time period). I added some logging, ran a load test, and I believe I've found the root cause.

On node startup, two subscriptions are created and immediately discarded (here and here). In the relay.SubscribeToTopic function that is called as part of subscription creation, the subscription channels are instantiated with a buffer size of 1024 and registered to the broadcaster.

Because nothing reads from these subscription channels, their buffers are completely full after 1024 messages. This blocks broadcasting here. Once broadcasting is blocked, LibP2P backs up in delivering messages, and messages start being silently dropped, with only the warning "subscriber too slow" in the logs.

Steps to reproduce

  1. Start a waku node using the default topic, with relay enabled
  2. Publish >1056 messages (1024 + the default 32 message buffer for the LibP2P subscription)
  3. Check the logs and you will see "subscriber too slow" messages being emitted from LibP2P

Possible Solution

If we unregister these initial subscriptions from the Broadcaster, the problem goes away and the node can handle tens of thousands of messages without congestion. https://github.com/xmtp/go-waku/blob/d8cab18a5be02bda971ea3c2f1d155b74aeff04a/waku/node.go#L238-L239

richard-ramos commented 2 years ago

Thank you very much for reporting this!