Open rmloveland opened 2 years ago
Hi @thtruo @mwang1026, do we consider this more of an Observability or a Disaster Recovery issue?
cc @kathancox
I actually consider this more OX since OX owns network and behavior in light of network issues. cc @piyush-singh :)
Happy to take this offline too, given that so much of the solution is visibility into partitions (and to avoid a hot potato problem).
For the record, this is "resilience" IMO and not "disaster" (my personal opinion).
Agreed, I think this should live with the docs on the network latency matrix or similar where we describe how to identify a network partition. If we want to have a dedicated page on network partitions, we could split that out, but it should still be either Observability or Deployments and Ops (DB Server) in my opinion.
Richard Loveland (rmloveland) commented:
Per this forum conversation, the behavior is as follows:
We should probably update the network partition troubleshooting docs with some of this info. Those docs could also perhaps link to/from the bounded staleness reads docs.
Jira Issue: DOC-1140