cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

kvobs: support a SHOW REPLICAS command #84090

Open andy-kimball opened 2 years ago

andy-kimball commented 2 years ago

Today, it's difficult for SQL-level developers to determine where replicas exist for each of their databases/tables/indexes/row spans. They want to know how many replicas exist and where those replicas are placed. They want to understand which replicas can always serve consistent reads (i.e. the leaseholder replica), which replicas can usually serve consistent reads (i.e. any replica in the case of a GLOBAL table), and which replicas can only serve stale reads. They want to understand which replicas are used for writes (i.e. voting replicas). These concerns are especially important for any developer building a multi-region app.
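
The categories above can be sketched as a small classification, purely as an illustration. Everything here is an assumption made for the example: the `Replica` fields, the `ReadCapability` names, and the `describe` output are hypothetical and do not correspond to any existing CockroachDB schema or command output.

```python
from dataclasses import dataclass
from enum import Enum


class ReadCapability(Enum):
    # Hypothetical labels for the three read categories described above.
    ALWAYS_CONSISTENT = "always consistent"
    USUALLY_CONSISTENT = "usually consistent"
    STALE_ONLY = "stale only"


@dataclass(frozen=True)
class Replica:
    """Hypothetical developer-facing view of one replica (not a real schema)."""
    table: str
    region: str
    is_leaseholder: bool
    is_voter: bool
    table_is_global: bool


def read_capability(r: Replica) -> ReadCapability:
    # The leaseholder can always serve consistent reads.
    if r.is_leaseholder:
        return ReadCapability.ALWAYS_CONSISTENT
    # For a GLOBAL table, any replica can usually serve consistent reads.
    if r.table_is_global:
        return ReadCapability.USUALLY_CONSISTENT
    # Every other replica can only serve stale reads.
    return ReadCapability.STALE_ONLY


def describe(r: Replica) -> dict:
    # One output row, expressed in SQL-level terms (table, region)
    # rather than nodes, stores, or ranges.
    return {
        "table": r.table,
        "region": r.region,
        "reads": read_capability(r).value,
        "used_for_writes": r.is_voter,  # voting replicas take part in writes
    }


if __name__ == "__main__":
    replicas = [
        Replica("orders", "us-east1", True, True, False),
        Replica("orders", "us-west1", False, True, False),
        Replica("orders", "europe-west1", False, False, False),
        Replica("currencies", "europe-west1", False, False, True),
    ]
    for r in replicas:
        print(describe(r))
```

A developer-facing command would presumably return rows along these lines, one per replica, so that placement and read/write behavior can be understood without knowing about nodes, stores, or ranges.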

To get this information today, developers need to drop down to the SHOW RANGES command, which is problematic:

  1. It references concepts like nodes and stores, which are lower-level KV concepts that only operators concern themselves with.
  2. It references concepts like ranges, which are higher-level than nodes and stores, but still unfamiliar to an average SQL developer.
  3. SHOW RANGES is not supported in CRDB Serverless.

I'd suggest that CRDB expose a new SHOW REPLICAS command, which would show information very similar to what SHOW RANGES shows today, except expressed in terms of databases/tables/indexes/row spans rather than nodes/stores/ranges. The name is TBD; SHOW REPLICAS is just a suggestion. In addition, the columns we return would need to be carefully considered so that they make sense to SQL developers who need to care about the location of their data.

Jira issue: CRDB-17456

andy-kimball commented 2 years ago

CC @knz

knz commented 2 years ago

Dependency to this work from INI-213: https://cockroachlabs.atlassian.net/browse/CRDB-14522

andreimatei commented 2 years ago

Consider whether building on the existing replication reports is the direction to go in. That's where I would start.

knz commented 2 years ago

Replication reports do not work under multi-tenancy and are currently being considered for removal by PM.

andy-kimball commented 2 years ago

Replication reports have information that is intended for cluster operators rather than app developers. Information like over- or under-replicated ranges, critical localities, and so on is useful for whoever's monitoring the host cluster. That would be us in the case of Serverless; I don't see any reason to expose this information to guest tenants.

@knz, why would we be considering removing these from the system tenant? Physical host cluster operators should be able to monitor the health/status of all nodes/ranges/etc, across all tenants, in one place.

knz commented 2 years ago

> Physical host cluster operators should be able to monitor the health/status of all nodes/ranges/etc, across all tenants, in one place.

I don't disagree but even this does not work currently.

This problem is being tracked by @mwang1026 currently. See the latest technical notes on this topic here: https://cockroachlabs.atlassian.net/wiki/spaces/MEVT/pages/2700542028/2022-08-24+MTU-WG+Discussion+notes