Replicas In Stackgres Cluster Randomly Stop Replicating

hashgraph / hedera-mirror-node

Hedera Mirror Node archives data from consensus nodes and serves it via an API

Apache License 2.0


Description

On several occasions the replicas in the cluster have stopped replicating. It reliably happens when the replicas are left down while inserts/updates are performed and are then brought back up. I have also seen the same error occur at least once when the replicas were never taken down.

We may be able to fix this issue by configuring wal_keep_segments so that the primary retains enough WAL for the replicas to catch up, but there may be additional issues at the stackgres/patroni layer.
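As a rough aid for choosing a wal_keep_segments value, the sketch below estimates how many WAL segments the primary would need to retain to cover a replica outage. It assumes the default 16 MiB WAL segment size; the function name and the example WAL rate are hypothetical and not taken from this cluster. Note that PostgreSQL 13 and later replace wal_keep_segments with wal_keep_size, which is specified in megabytes.

```python
import math

# PostgreSQL's default WAL segment size (16 MiB); assumed, not read from the cluster.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def segments_to_keep(wal_bytes_per_second: float,
                     downtime_seconds: float,
                     segment_bytes: int = WAL_SEGMENT_BYTES) -> int:
    """Estimate the number of WAL segments generated while replicas are down.

    Hypothetical helper: the real WAL rate must be measured on the primary.
    """
    if wal_bytes_per_second < 0 or downtime_seconds < 0:
        raise ValueError("rate and downtime must be non-negative")
    total_wal_bytes = wal_bytes_per_second * downtime_seconds
    # Round up: a partially filled segment still has to be retained.
    return math.ceil(total_wal_bytes / segment_bytes)


if __name__ == "__main__":
    # Illustrative numbers only: 1 MiB/s of WAL during a 30 minute outage.
    print(segments_to_keep(1024 * 1024, 30 * 60))
```

The estimate is only a lower bound for the setting; it would not address any separate problem at the stackgres/patroni layer.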

Steps to reproduce

Bring the replicas down and perform inserts.
Bring the replicas back up after a reasonable amount of time.
Notice the errors in the patroni container.
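To check whether a replica is actually catching up after it is brought back up, one option is to compare WAL positions (LSNs) on the primary and the replica. The sketch below converts PostgreSQL's textual LSN format (two hexadecimal numbers separated by a slash) into a byte offset and computes the difference. The helper names are hypothetical, and the LSN values would have to be obtained separately, for example from pg_current_wal_lsn() on the primary and replay_lsn in pg_stat_replication.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to an absolute byte position."""
    high, low = lsn.split("/")
    # The part before the slash is the upper 32 bits, the part after is the lower 32 bits.
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(primary_lsn: str, replica_lsn: str) -> int:
    """Return how many bytes of WAL the replica is behind the primary."""
    return lsn_to_int(primary_lsn) - lsn_to_int(replica_lsn)


if __name__ == "__main__":
    # Illustrative values only, not taken from the affected cluster.
    print(lag_bytes("0/3000000", "0/1000000"))
```

A lag that keeps growing, or never shrinks after the replica restarts, would be consistent with the replica having stopped replicating.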

Additional context

No response

Hedera network

other

Version

0.90-SNAPSHOT

Operating system

None
