koinos / koinos-mempool

The mempool microservice stores pending transactions for inclusion in blocks by the block producer.
MIT License

[BUG]: Old transactions can cause p2p to get stuck #64

Closed mvandeberg closed 2 years ago

mvandeberg commented 2 years ago

Is there an existing issue for this?

Current behavior

Over the weekend, seed.koinos.io got stuck. Why it got stuck is unclear, but why it was unable to recover is clear.

The p2p node had a few other peers that were, presumably, also stuck. The node would reconnect to a peer that was ahead of it, but because the majority of its connected peers were also stuck, gossip remained enabled. The node was then spammed with pending transactions, all of which failed because the mempool was full and, according to the mempool, those addresses had no mana remaining. This caused the node to disconnect before it could receive even a single sync block. The pattern repeated, with the node continually disconnecting from all of its peers.

Although this is a fairly niche issue, the node recovered quickly once the p2p and mempool microservices were restarted. It was then able to connect to enough peers to disable gossip and begin syncing again.

Expected behavior

The node should be able to recover, and a failure in the node should not leave it in an unrecoverable state.

Possible solutions are:

- Expire old transactions in the mempool by wall clock time.
- Turn off p2p gossip when the head block is older than a certain age.
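Both ideas can be sketched in a few lines. The Python below is only an illustration of the logic, not the koinos-mempool or koinos-p2p code: `ExpiringMempool`, `should_enable_gossip`, `max_age_seconds`, and `max_head_age_seconds` are hypothetical names, and the real services would need to pick their own thresholds and hook this into their existing data structures. The sketch also assumes the clock does not jump backwards.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ExpiringMempool:
    """Toy pending-transaction pool that drops entries by wall clock age (hypothetical)."""

    max_age_seconds: float                    # assumed expiry window, not a Koinos setting
    clock: Callable[[], float] = time.time    # injectable so the logic can be tested
    _pending: Dict[str, float] = field(default_factory=dict, init=False, repr=False)

    def add(self, tx_id: str) -> None:
        """Record a pending transaction, pruning expired ones first."""
        self.prune()
        # Keep the first-seen time if the transaction is already pending.
        self._pending.setdefault(tx_id, self.clock())

    def prune(self) -> int:
        """Remove transactions at least max_age_seconds old; return how many were removed."""
        cutoff = self.clock() - self.max_age_seconds
        removed = 0
        # dicts preserve insertion order, so the oldest entry is always first.
        while self._pending:
            tx_id, received_at = next(iter(self._pending.items()))
            if received_at > cutoff:
                break
            del self._pending[tx_id]
            removed += 1
        return removed

    def __len__(self) -> int:
        return len(self._pending)

    def __contains__(self, tx_id: object) -> bool:
        return tx_id in self._pending


def should_enable_gossip(head_block_time: float, now: float,
                         max_head_age_seconds: float) -> bool:
    """Allow gossip only while the head block is recent enough to suggest the node is synced."""
    return (now - head_block_time) <= max_head_age_seconds
```

With something like this in place, a node whose head block is hours old would stop accepting gossiped transactions regardless of what its peers report, and stale transactions would eventually age out of the mempool without a restart.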

Steps to reproduce

No response

Environment

- OS:

Anything else?

No response