cockroachdb / docs

CockroachDB user documentation
https://cockroachlabs.com/docs
Creative Commons Attribution 4.0 International

Clarify hardware recs for throughput #4711

Closed rkruze closed 5 years ago

rkruze commented 5 years ago

Re: Cluster Topology Patterns

Issue Description

The following statement can be found on this page: Performance: Adding nodes for more processing power and/or storage typically increases throughput. For example, with five nodes and a replication factor of 3, each range has 3 replicas, with each replica on a different node. In this case, there will only be 1-2 replicas on each node, leaving additional storage and bandwidth available.

Suggested Resolution

The last statement is not clear on what it is trying to convey. Maybe something like:

Performance: CockroachDB is a horizontally scalable database in which adding nodes increases the overall throughput of the system. For example, when an additional node is added to the cluster, CockroachDB automatically rebalances ranges from the pre-existing nodes onto the new node to take advantage of its additional throughput and capacity.
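To make the rebalancing argument concrete, here is a minimal sketch of the arithmetic. It assumes replicas end up spread perfectly evenly across nodes, which is an idealization, and the range count of 1000 is a hypothetical figure chosen only for illustration. The function name `replicas_per_node` is likewise made up for this sketch.

```python
def replicas_per_node(num_ranges: int, replication_factor: int, num_nodes: int) -> float:
    """Average replicas held by each node, assuming a perfectly even spread."""
    if num_nodes < replication_factor:
        # Each replica of a range must live on a different node.
        raise ValueError("need at least as many nodes as the replication factor")
    return num_ranges * replication_factor / num_nodes


# Hypothetical cluster: 1000 ranges, replication factor 3.
for nodes in (3, 5, 6):
    print(nodes, replicas_per_node(1000, 3, nodes))
```

Under these assumptions, each added node lowers the average number of replicas per node, which is the headroom the suggested wording is describing.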

jseldess commented 5 years ago

@rkruze, the reference docs have changed significantly in the meantime. I'm wondering, however, if we should update the second bullet here: https://www.cockroachlabs.com/docs/v19.1/recommended-production-settings.html#cpu-and-memory

To add more processing power (up to 16 vCPUs), adding more vCPUs is better than adding more RAM. Otherwise, add more nodes rather than using higher vCPUs per node; higher vCPUs will have NUMA (non-uniform memory access) implications. Our internal testing results indicate this is the sweet spot for OLTP workloads. It is a best practice to use uniform nodes so SQL performance is consistent.

For increased throughput, when is increasing vCPUs better than adding nodes?

rkruze commented 5 years ago

That paragraph is trying to convey a fair bit of information. Maybe we can restate that as the following:

To optimize for throughput, we recommend using larger nodes, up to 16 vCPUs, with at least 64 GB of RAM. Going beyond 16 vCPUs per node might have NUMA implications. We have found that 16 vCPUs is the sweet spot for OLTP workloads. If more throughput is needed, the recommendation is to add more nodes to the cluster. It is also a best practice to use uniform nodes, so SQL performance is consistent.
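The restated guidance could be sketched as a small sizing helper. This is only an illustration of the rule "scale nodes up to 16 vCPUs, then scale out with uniform nodes". The function `plan_cluster`, its `total_vcpus_needed` input, and the three-node minimum are assumptions made for this sketch and are not taken from the CockroachDB docs.

```python
import math

MAX_VCPUS_PER_NODE = 16  # the "sweet spot" named in this thread


def plan_cluster(total_vcpus_needed: int, min_nodes: int = 3) -> tuple:
    """Return (nodes, vcpus_per_node) for a uniform cluster.

    min_nodes=3 is an assumption for this sketch (one node per replica
    at a replication factor of 3).
    """
    # Scale out only once the per-node cap would otherwise be exceeded.
    nodes = max(min_nodes, math.ceil(total_vcpus_needed / MAX_VCPUS_PER_NODE))
    # Every node gets the same size, never more than the cap.
    vcpus_per_node = min(MAX_VCPUS_PER_NODE, math.ceil(total_vcpus_needed / nodes))
    return nodes, vcpus_per_node


print(plan_cluster(24))   # small workload: stays at the minimum node count
print(plan_cluster(100))  # larger workload: adds nodes instead of exceeding the cap
```

The helper keeps all nodes the same size, matching the uniform-nodes recommendation, and it never sizes a node above 16 vCPUs.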