cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

import: avoid over-splitting schema objects #99274

Open ZhouXing19 opened 1 year ago

ZhouXing19 commented 1 year ago

Related ticket (internal) Related convo (internal)

We should avoid over-splitting schema objects, especially for datasets that are small in storage bytes, even if the splits are short-lived. It’s a stability risk: these short-lived ranges are all non-quiescent, so they consume CPU, as we saw above. A cluster provisioned with a finite number of vCPUs can then easily destabilize itself, even though those vCPUs would have been sufficient in the steady state (post-import + 1h). Once https://github.com/cockroachdb/cockroach/pull/98820 lands in early 23.2, this “oversplitting” will show up more prominently and cause outages.

Jira issue: CRDB-25799

gz#16454

mgartner commented 1 year ago

Next steps: someone on SQL Queries can look into this and see whether it's absolutely necessary for 23.2.

michae2 commented 1 year ago

Foundations team: is this absolutely necessary for 23.2?

rafiss commented 1 year ago

This is not necessary from our side, so I'll leave the prioritization to you.

michae2 commented 10 months ago

[triage meeting] @ZhouXing19 we're having trouble understanding the risk here. Does this mean that we expect imports in 23.2 to use more CPU than they did before?