If the resolver is unable to serve resolution requests from proxies for an extended period, nothing necessarily triggers a recovery, so the cluster can remain indefinitely in a state where commits fail until a recovery is forced manually. Instead, the resolver should detect that it is unhealthy and trigger a recovery itself. We should also improve availability testing in simulation so that this case is covered.
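
As a rough illustration of the self-detection idea, here is a minimal watchdog sketch. It is not taken from the codebase: the class `ResolverWatchdog`, the `stall_timeout` parameter, and the `trigger_recovery` callback are all hypothetical, and it assumes "unhealthy" means that some resolution request has been pending longer than a fixed timeout. A real health check might use different signals.

```python
import time
from typing import Callable, Dict


class ResolverWatchdog:
    """Hypothetical health monitor; names and policy are assumptions."""

    def __init__(
        self,
        stall_timeout: float,
        trigger_recovery: Callable[[], None],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.stall_timeout = stall_timeout        # seconds a request may stay pending
        self.trigger_recovery = trigger_recovery  # assumed hook that requests a recovery
        self.clock = clock                        # injectable so tests can fake time
        self.pending: Dict[int, float] = {}       # request id -> arrival time
        self.recovery_requested = False

    def request_received(self, request_id: int) -> None:
        # Record when a resolution request arrived from a proxy.
        self.pending[request_id] = self.clock()

    def request_served(self, request_id: int) -> None:
        # Forget a request once the resolver has replied to it.
        self.pending.pop(request_id, None)

    def check(self) -> bool:
        """Return True if the oldest pending request has exceeded the timeout."""
        if not self.pending:
            return False
        oldest_arrival = min(self.pending.values())
        if self.clock() - oldest_arrival < self.stall_timeout:
            return False
        # Ask for a recovery only once per stall, however often check() runs.
        if not self.recovery_requested:
            self.recovery_requested = True
            self.trigger_recovery()
        return True
```

A periodic task would call `check()` at some interval. Because this sketch only tracks requests the resolver has actually received, it would not notice a resolver that proxies cannot reach at all, which is one reason the simulation coverage mentioned above matters.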