Closed jnels124 closed 4 months ago
This is not expected to be a problem in practice. The issue only occurs if coordinator replicas are taken offline long enough that the WAL files for the replicated timeline have already been removed from the primary. We should expect to rebuild replicas whenever they are taken down for an extended period of time.
Description
On several occasions the replicas in the cluster have stopped replicating. It happens reliably when the replicas are left down, inserts/updates are performed, and the replicas are then brought back up. I have also seen the same error occur at least once when the replicas were never taken down.
We may be able to fix this issue by configuring wal_keep_segments, but there may be additional issues at the StackGres/Patroni layer.
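To illustrate why wal_keep_segments matters here, the sketch below estimates how many WAL segments a replica is behind the primary and compares that to the configured value. It is not taken from this cluster's tooling: the function names are hypothetical, the 16 MB segment size is the PostgreSQL default and is only assumed to apply, and the LSN values would have to be collected separately (for example from `pg_current_wal_lsn()` on the primary and `pg_last_wal_replay_lsn()` on the replica). Note also that PostgreSQL 13 replaced wal_keep_segments with wal_keep_size, so the setting name depends on the server version.

```python
# Assumption: default PostgreSQL WAL segment size (16 MB). Adjust if the
# cluster was initialized with a different --wal-segsize.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def lsn_to_bytes(lsn: str) -> int:
    """Convert an LSN such as '16/B374D848' to an absolute byte position.

    An LSN is printed as two hexadecimal numbers: the high 32 bits and the
    low 32 bits of a 64-bit WAL position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def segments_behind(primary_lsn: str, replica_lsn: str,
                    segment_bytes: int = WAL_SEGMENT_BYTES) -> int:
    """Return how many WAL segments separate the replica from the primary."""
    primary_segment = lsn_to_bytes(primary_lsn) // segment_bytes
    replica_segment = lsn_to_bytes(replica_lsn) // segment_bytes
    return primary_segment - replica_segment


def retention_guaranteed(primary_lsn: str, replica_lsn: str,
                         wal_keep_segments: int) -> bool:
    """Hypothetical helper: True if the segment the replica needs falls
    within the minimum number of past segments the primary keeps.

    False does not mean replication will definitely fail (the primary may
    still hold older segments, or an archive or replication slot may cover
    the gap); it only means wal_keep_segments alone does not guarantee it.
    """
    return segments_behind(primary_lsn, replica_lsn) <= wal_keep_segments
```

A replica that was down while a large batch of inserts/updates ran would show a large `segments_behind` value; if that value exceeds wal_keep_segments and nothing else retains the WAL, a rebuild is the likely outcome, which matches the behavior described above.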
Steps to reproduce
Additional context
No response
Hedera network
other
Version
0.90-SNAPSHOT
Operating system
None