Open awoods187 opened 5 years ago
cc @bdarnell for any additional thoughts
These are emergency settings for a particular customer's idiosyncratic setup; we don't want to recommend these across the board (at least the GOMAXPROCS one). And we're changing the snapshot rate defaults in 19.1 so we may not need any work there.
tl;dr: Can we close this issue?
@johnrk (or Andy or Ben), does this docs issue still need to exist in the v20.1+ world? I notice that 3/4 of the CRDB issues are still open, but I don't know what that means for this docs issue (if anything).
Ben, based on what you said, it seems like things got "better" with snapshot rates in the time since this issue was filed, so maybe docs for this are no longer needed?
> I notice that 3/4 of the CRDB issues are still open
I just closed another one, but the linked issues are really about possible product improvements/issues; they don't really have any bearing on what we can or should document right now. (A more relevant issue for docs is https://github.com/cockroachdb/cockroach/issues/39200. We want to combine the recovery and rebalance settings, so whenever we give advice about one we should do the same for the other.)
For the specific docs suggestions:
It's still sometimes useful to increase these parameters, so we should document when and why. And whenever we change one, we should change the other (our TPC-C docs currently recommend only changing the "rebalance" setting, while our known issues page suggests only changing the "recovery" one).
OK thanks Ben. Keeping this one on the TODO list then
We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB docs!
Andrew Woods (awoods187) commented:
From a recent customer conversation, we identified a need to tune load-based rebalancing. We have a number of tracking issues for this including the main tracking issue:
We have identified various other secondary issues like:
We should document the knobs we do have now and the impact they can have depending upon the setting.
SET CLUSTER SETTING kv.snapshot_rebalance.max_rate = '8MiB';
SET CLUSTER SETTING kv.snapshot_recovery.max_rate = '32MiB';
These can allow users to support more load before encountering any degradation.
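Given the advice elsewhere in this thread that the rebalance and recovery rates should be changed together, here is a minimal sketch of what a documented example might look like. The '32MiB' value is an assumption chosen only for illustration, not a recommendation from this thread; appropriate values depend on the cluster's hardware and workload.

```sql
-- Illustrative value only (an assumption, not a recommended setting).
-- Change both snapshot rate limits in the same session so they stay consistent.
SET CLUSTER SETTING kv.snapshot_rebalance.max_rate = '32MiB';
SET CLUSTER SETTING kv.snapshot_recovery.max_rate = '32MiB';

-- Confirm the values the cluster is now using.
SHOW CLUSTER SETTING kv.snapshot_rebalance.max_rate;
SHOW CLUSTER SETTING kv.snapshot_recovery.max_rate;

-- Return both settings to the defaults of the running version.
RESET CLUSTER SETTING kv.snapshot_rebalance.max_rate;
RESET CLUSTER SETTING kv.snapshot_recovery.max_rate;
```

Pairing each SET with a SHOW makes it easy to verify that one setting was not changed without the other, and the RESET statements give a way back to the defaults once the extra snapshot bandwidth is no longer needed.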
Jira Issue: DOC-245