cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

bulk: audit existing cluster settings and update documentation #87356

Open msbutler opened 2 years ago

msbutler commented 2 years ago

Bulk jobs interact with many tunable cluster settings. Some of these have public and/or internal tuning advice. That documentation may be outdated and should be audited and updated. Further, some cluster settings may need to be made private or removed altogether. Below is an attempt to list all tunable cluster settings the DR team should consider auditing (at least for 22.2):


Notes from Matt:

Cluster settings abound!

I think bool settings are a good place to start. Check these searches:

Rough criteria are:

We do want to keep settings around for new functionality that needs maturing (i.e. feature flags), or known cases of a client needing to do things differently than default.

Let’s call out / debate individual settings as comments.
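
As a rough illustration of the "start with bool settings" idea, here is a minimal Python sketch that filters an exported list of settings down to the bool-typed ones under a given name prefix. It assumes the rows come from something like SHOW ALL CLUSTER SETTINGS and carry a single-letter setting_type where "b" means bool; the column names, that type code, and every setting name in ROWS are assumptions for illustration, so check them against your version before relying on this.

```python
from typing import Iterable, List, NamedTuple, Sequence


class Setting(NamedTuple):
    variable: str       # setting name
    value: str          # current value, as text
    setting_type: str   # assumed single-letter type code; "b" = bool
    public: bool        # whether the setting is publicly documented


def bool_settings(rows: Iterable[Setting],
                  prefixes: Sequence[str] = ()) -> List[Setting]:
    """Return bool-typed settings, optionally limited to the given name prefixes."""
    wanted = tuple(prefixes)
    return [
        r for r in rows
        if r.setting_type == "b" and (not wanted or r.variable.startswith(wanted))
    ]


# Hypothetical rows, for illustration only; these are not real setting names.
ROWS = [
    Setting("example.bulk.feature_a.enabled", "true", "b", True),
    Setting("example.bulk.feature_b.enabled", "false", "b", False),
    Setting("example.bulk.worker_count", "1", "i", True),
    Setting("example.other.flag.enabled", "true", "b", True),
]

if __name__ == "__main__":
    for s in bool_settings(ROWS, ["example.bulk."]):
        visibility = "public" if s.public else "private"
        print(f"{s.variable} = {s.value} ({visibility})")
```

Printing the public/private flag next to each candidate makes it easier to spot settings that might be worth making private or removing.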

Jira issue: CRDB-19292

blathers-crl[bot] commented 2 years ago

cc @cockroachdb/bulk-io

shermanCRL commented 2 years ago

(Out of scope but I’d love to see a Docs page for every setting -- why you would use it, how it interacts with other settings, risks & trade-offs. cc @kathancox)

msbutler commented 1 year ago

fwiw, I just ran our restore tpccInc roachtest on 23.1, i.e.: "RESTORE DATABASE tpcc FROM '/2022/09/07-000000.00' IN 'gs://cockroach-fixtures/tpcc-incrementals-22.2?AUTH=implicit' AS OF SYSTEM TIME '2022-09-07 12:15:00' WITH detached"

The cluster had the following topology: roachprod create $CLUSTER -n 4 --gce-machine-type="n1-standard-8" --gce-pd-volume-size=1000 --local-ssd=false

Increasing kv.bulk_io_write.concurrent_addsstable_requests and kv.bulk_io_write.restore_node_concurrency from 1 to 5 had no measurable effect on throughput.
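
For anyone wanting to repeat this kind of comparison, the following is a minimal Python sketch that generates the SET CLUSTER SETTING statements for each trial value. The two setting names and the values 1 and 5 come from the comment above; the helper name sweep_statements is hypothetical, and the sketch only builds the SQL text. It does not connect to a cluster, run the restore, or measure throughput.

```python
from typing import Iterable, Iterator, List

# The two settings changed in the experiment described above.
SETTINGS = [
    "kv.bulk_io_write.concurrent_addsstable_requests",
    "kv.bulk_io_write.restore_node_concurrency",
]


def sweep_statements(settings: Iterable[str],
                     values: Iterable[int]) -> Iterator[List[str]]:
    """Yield one batch of SET CLUSTER SETTING statements per trial value."""
    names = list(settings)
    for v in values:
        yield [f"SET CLUSTER SETTING {name} = {v};" for name in names]


if __name__ == "__main__":
    # Trial values taken from the comment: 1, then 5.
    for batch in sweep_statements(SETTINGS, [1, 5]):
        print("\n".join(batch))
        print("-- run the RESTORE here and record throughput")
```

Each batch would be applied before a separate restore run so the results can be compared against each other.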