Closed benofben closed 9 years ago
We might want to use rpc_interface and listen_interface instead of the *_address versions.
I'd also want to hear the justification for turning off hinted handoff.
I would set num_tokens to 32 as that's a more commonly seen setting, but yes, good point about vnodes.
It sounds like phi_convict_threshold: 12 is meant to account for potentially laggy networks at cloud providers.
vnodes are off by default. If you turn them on, it seems the default number of vnodes is 256. A couple of questions: (1) If they are off by default, why do we want them on by default here? If we do, does it make sense to turn vnodes on by default in the core product? (2) Chuck mentioned that each vnode used to add significant overhead, but that has lessened in newer releases. We speculated that this might be due to moving vnodes from a heavier to a lighter-weight thread model or something similar. Is there any reason not to use the 256 default with the product as it is today?
In main I've made the snitch change. The format for the path is different than the simple snitch it replaced, so fingers crossed that works.
There does not seem to be an rpc_address element in the config. I'm unclear on the broadcast_rpc_address config. rpc_interface is not present in the config either.
None of these values are present in the config: phi_convict_threshold: 12; num_tokens: 256 (default value); initial_token (to remove).
Going to table the hinted handoff question for now. Will revisit the config once we get multi DC working with the revised snitch.
Here's an example config generated by our script. OpsCenter then set the seeds to 10.(1,2,3).1.5 (the first node in each dc).
Curious to understand what exactly needs to be changed to: (a) set phi_convict_threshold (b) turn on vnodes (c) make any other changes required for multidc config
{
"accepted_fingerprints": {
"10.1.1.10": "2048 a9:0a:7a:d5:6e:2e:63:c2:bb:f6:9f:d1:13:3c:39:6f (RSA)",
"10.1.1.11": "2048 2b:70:6a:bc:e5:51:16:b4:09:59:30:b8:12:68:76:e4 (RSA)",
"10.1.1.12": "2048 fb:72:8d:4c:95:62:cf:fa:78:55:1c:40:a2:78:cb:b0 (RSA)",
"10.1.1.13": "2048 b8:18:d7:9a:df:6f:d8:03:cc:e5:56:4c:2f:f4:d7:e9 (RSA)",
"10.1.1.14": "2048 60:fc:ea:81:95:6f:16:0d:0b:d3:c6:06:4f:af:15:75 (RSA)",
"10.1.1.5": "2048 ed:d9:28:2c:49:01:25:2d:5e:b7:9f:35:60:4a:f8:8b (RSA)",
"10.1.1.6": "2048 92:90:ec:c4:4a:38:ba:58:71:fb:98:ac:ea:30:43:c1 (RSA)",
"10.1.1.7": "2048 d2:af:5b:a3:9f:c5:e6:44:38:e1:1e:65:9f:a8:38:54 (RSA)",
"10.1.1.8": "2048 1e:5d:a2:33:d6:84:e9:2e:db:8f:bd:31:86:c9:7d:00 (RSA)",
"10.1.1.9": "2048 b8:a4:b1:69:0f:f7:ea:74:08:36:31:3f:09:1c:3f:67 (RSA)",
"10.2.1.10": "2048 98:1f:21:ba:4d:9f:70:fc:a4:b2:d6:64:05:9b:41:76 (RSA)",
"10.2.1.11": "2048 ff:f0:58:94:de:d3:e2:44:82:84:49:2e:c6:10:aa:52 (RSA)",
"10.2.1.12": "2048 95:68:df:fc:20:7d:c8:ee:a2:ff:94:0c:9c:98:c9:17 (RSA)",
"10.2.1.13": "2048 54:2b:34:83:d7:22:1e:0b:bd:4e:71:c6:d5:17:2f:ff (RSA)",
"10.2.1.14": "2048 7f:23:31:50:09:13:ac:8a:ae:9f:0a:fb:34:a7:9a:c1 (RSA)",
"10.2.1.5": "2048 d0:a4:6c:06:65:54:1f:dc:e3:a6:61:87:20:fa:b3:22 (RSA)",
"10.2.1.6": "2048 9c:5a:98:43:68:5c:72:74:b1:e5:a0:31:05:dd:02:a5 (RSA)",
"10.2.1.7": "2048 c9:fb:e8:6f:3e:fb:02:c2:35:91:5c:d7:15:ca:ac:1a (RSA)",
"10.2.1.8": "2048 e4:83:d6:1a:d1:6a:b8:55:f2:1e:bc:84:6a:92:fe:94 (RSA)",
"10.2.1.9": "2048 c8:01:69:b3:41:ce:f6:fc:69:c0:50:1e:91:27:40:a4 (RSA)",
"10.3.1.10": "2048 5b:85:00:b5:6f:ee:a0:cd:cb:00:d0:ce:71:19:84:0f (RSA)",
"10.3.1.11": "2048 8b:0c:b3:c6:1d:49:fc:48:93:d3:86:19:81:dc:38:83 (RSA)",
"10.3.1.12": "2048 5e:7f:6d:22:fd:2e:f2:91:88:77:4f:0a:96:d3:22:af (RSA)",
"10.3.1.13": "2048 3b:55:62:b5:c3:2d:5a:ff:8e:75:62:f9:95:c7:f8:00 (RSA)",
"10.3.1.14": "2048 db:3a:48:ff:dd:e8:ce:2b:99:77:24:d3:79:a5:0f:30 (RSA)",
"10.3.1.5": "2048 c9:1a:71:f2:2c:0b:33:07:83:00:99:fa:6f:bf:0c:22 (RSA)",
"10.3.1.6": "2048 02:5a:38:f3:54:dd:ba:70:7f:3f:46:26:f4:c3:29:d5 (RSA)",
"10.3.1.7": "2048 c2:9e:a4:59:5a:00:db:dd:e5:17:9a:52:78:0f:c9:4e (RSA)",
"10.3.1.8": "2048 46:fc:8b:9a:3f:d2:2f:3b:b0:8d:64:f0:f3:00:4f:d6 (RSA)",
"10.3.1.9": "2048 24:3f:4a:f7:b2:1d:5a:6c:9a:e2:bf:f0:e0:44:47:04 (RSA)"
},
"cassandra_config": {
"authenticator": "AllowAllAuthenticator",
"authorizer": "AllowAllAuthorizer",
"auto_bootstrap": false,
"auto_snapshot": true,
"batch_size_warn_threshold_in_kb": 64,
"batchlog_replay_throttle_in_kb": 1024,
"cas_contention_timeout_in_ms": 1000,
"client_encryption_options": {
"algorithm": "SunX509",
"cipher_suites": [
"TLS_RSA_WITH_AES_128_CBC_SHA",
"TLS_RSA_WITH_AES_256_CBC_SHA",
"TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
"TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
"TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
"TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA"
],
"enabled": false,
"keystore": "resources/dse/conf/.keystore",
"keystore_password": "cassandra",
"protocol": "TLS",
"require_client_auth": false,
"store_type": "JKS",
"truststore": "resources/dse/conf/.truststore",
"truststore_password": "cassandra"
},
"cluster_name": "Test Cluster",
"column_index_size_in_kb": 64,
"commit_failure_policy": "stop",
"commitlog_directory": "/mnt/commitlog",
"commitlog_segment_size_in_mb": 32,
"commitlog_sync": "periodic",
"commitlog_sync_period_in_ms": 10000,
"commitlog_total_space_in_mb": 8192,
"compaction_throughput_mb_per_sec": 16,
"concurrent_counter_writes": 32,
"concurrent_reads": 32,
"concurrent_writes": 32,
"counter_cache_save_period": 7200,
"counter_write_request_timeout_in_ms": 5000,
"cross_node_timeout": false,
"data_file_directories": [
"/mnt/data"
],
"disk_failure_policy": "stop",
"dynamic_snitch_badness_threshold": 0.1,
"dynamic_snitch_reset_interval_in_ms": 600000,
"dynamic_snitch_update_interval_in_ms": 100,
"endpoint_snitch": "org.apache.cassandra.locator.GossipingPropertyFileSnitch",
"hinted_handoff_enabled": "true",
"hinted_handoff_throttle_in_kb": 1024,
"incremental_backups": false,
"index_summary_resize_interval_in_minutes": 60,
"inter_dc_tcp_nodelay": false,
"internode_authenticator": "org.apache.cassandra.auth.AllowAllInternodeAuthenticator",
"internode_compression": "dc",
"key_cache_save_period": 14400,
"max_hint_window_in_ms": 10800000,
"max_hints_delivery_threads": 2,
"memory_allocator": "NativeAllocator",
"memtable_allocation_type": "heap_buffers",
"memtable_heap_space_in_mb": 2048,
"memtable_offheap_space_in_mb": 2048,
"native_transport_max_frame_size_in_mb": 256,
"native_transport_max_threads": 128,
"native_transport_port": 9042,
"partitioner": "org.apache.cassandra.dht.Murmur3Partitioner",
"permissions_validity_in_ms": 2000,
"range_request_timeout_in_ms": 10000,
"read_request_timeout_in_ms": 5000,
"request_scheduler": "org.apache.cassandra.scheduler.NoScheduler",
"request_timeout_in_ms": 10000,
"row_cache_save_period": 0,
"row_cache_size_in_mb": 0,
"rpc_keepalive": true,
"rpc_port": 9160,
"rpc_server_type": "sync",
"saved_caches_directory": "/mnt/saved_caches",
"server_encryption_options": {
"algorithm": "SunX509",
"cipher_suites": [
"TLS_RSA_WITH_AES_128_CBC_SHA",
"TLS_RSA_WITH_AES_256_CBC_SHA",
"TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
"TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
"TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
"TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA"
],
"internode_encryption": "none",
"keystore": "conf/.keystore",
"keystore_password": "cassandra",
"protocol": "TLS",
"require_client_auth": false,
"store_type": "JKS",
"truststore": "conf/.truststore",
"truststore_password": "cassandra"
},
"snapshot_before_compaction": false,
"ssl_storage_port": 7001,
"sstable_preemptive_open_interval_in_mb": 50,
"start_native_transport": true,
"start_rpc": true,
"storage_port": 7000,
"stream_throughput_outbound_megabits_per_sec": 200,
"thrift_framed_transport_size_in_mb": 15,
"tombstone_failure_threshold": 100000,
"tombstone_warn_threshold": 1000,
"trickle_fsync": false,
"trickle_fsync_interval_in_kb": 10240,
"truncate_request_timeout_in_ms": 60000,
"write_request_timeout_in_ms": 2000
},
"install_params": {
"package": "dse",
"password": "asd!",
"private_key": "",
"repo-password": "asd",
"repo-user": "ben.lackey_datastax.com",
"username": "datastax",
"version": "4.7.1"
},
"is_retry": false,
"local_datacenters": [
{
"dc": "west_us",
"location": "West US",
"node_information": [
{
"node_type": "cassandra",
"private_ip": "10.1.1.5",
"public_ip": "10.1.1.5",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.1.1.6",
"public_ip": "10.1.1.6",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.1.1.7",
"public_ip": "10.1.1.7",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.1.1.8",
"public_ip": "10.1.1.8",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.1.1.9",
"public_ip": "10.1.1.9",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.1.1.10",
"public_ip": "10.1.1.10",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.1.1.11",
"public_ip": "10.1.1.11",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.1.1.12",
"public_ip": "10.1.1.12",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.1.1.13",
"public_ip": "10.1.1.13",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.1.1.14",
"public_ip": "10.1.1.14",
"rack": "rack1"
}
]
},
{
"dc": "north_europe",
"location": "North Europe",
"node_information": [
{
"node_type": "cassandra",
"private_ip": "10.2.1.5",
"public_ip": "10.2.1.5",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.2.1.6",
"public_ip": "10.2.1.6",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.2.1.7",
"public_ip": "10.2.1.7",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.2.1.8",
"public_ip": "10.2.1.8",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.2.1.9",
"public_ip": "10.2.1.9",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.2.1.10",
"public_ip": "10.2.1.10",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.2.1.11",
"public_ip": "10.2.1.11",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.2.1.12",
"public_ip": "10.2.1.12",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.2.1.13",
"public_ip": "10.2.1.13",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.2.1.14",
"public_ip": "10.2.1.14",
"rack": "rack1"
}
]
},
{
"dc": "east_asia",
"location": "East Asia",
"node_information": [
{
"node_type": "cassandra",
"private_ip": "10.3.1.5",
"public_ip": "10.3.1.5",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.3.1.6",
"public_ip": "10.3.1.6",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.3.1.7",
"public_ip": "10.3.1.7",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.3.1.8",
"public_ip": "10.3.1.8",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.3.1.9",
"public_ip": "10.3.1.9",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.3.1.10",
"public_ip": "10.3.1.10",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.3.1.11",
"public_ip": "10.3.1.11",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.3.1.12",
"public_ip": "10.3.1.12",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.3.1.13",
"public_ip": "10.3.1.13",
"rack": "rack1"
},
{
"node_type": "cassandra",
"private_ip": "10.3.1.14",
"public_ip": "10.3.1.14",
"rack": "rack1"
}
]
}
]
}
Added phi_convict_threshold: 12 (I don't think that resolves any of the immediate issues we've been seeing).
Also added num_tokens: 256. According to the OpsC guys that will be sufficient to change the entire config to vnodes.
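For anyone following along, here is a rough sketch of what those two additions amount to, assuming the generated JSON has been loaded into a Python dict shaped like the example above. The helper name `enable_vnodes` is made up for illustration and is not part of our script or of OpsCenter; whether the provisioning endpoint needs anything beyond these keys is something I have not verified.

```python
import copy


def enable_vnodes(config, num_tokens=256, phi_convict_threshold=12):
    """Return a copy of the provisioning config with vnode settings applied.

    Hypothetical helper: assumes `config` has a "cassandra_config" object
    like the generated example in this thread.
    """
    patched = copy.deepcopy(config)  # leave the caller's dict untouched
    cassandra = patched.setdefault("cassandra_config", {})
    cassandra["num_tokens"] = num_tokens
    cassandra["phi_convict_threshold"] = phi_convict_threshold
    # A fixed single token does not belong in a vnode config, so drop it
    # if the generator ever emits one.
    cassandra.pop("initial_token", None)
    return patched
```

Calling `enable_vnodes(config, num_tokens=32)` would give the lower token count without touching anything else in the config.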
Any other ideas on config changes?
We should set num_tokens to 32. Higher than that introduces more performance penalty into search.
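For a sense of scale, each node owns num_tokens tokens, so the number of token ranges in the cluster grows linearly with that setting. The sketch below only does that arithmetic for the 30 nodes in the example config (3 datacenters of 10 nodes); it does not measure search cost, and the function name is just for illustration.

```python
def total_token_ranges(node_count, num_tokens):
    """Number of token ranges in a cluster where every node has num_tokens vnodes."""
    return node_count * num_tokens


NODES = 30  # 3 datacenters x 10 nodes, as in the example config in this thread

for num_tokens in (32, 256):
    print(num_tokens, total_token_ranges(NODES, num_tokens))
# prints:
# 32 960
# 256 7680
```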
Ok. Changed to 32. Why is the default 256 if that is not suggested? Should we get someone to change that?
We've made a variety of changes to the config. While this is going to be an ongoing process, I'm closing this issue for now.
Our doc suggests 256 as the default value for num tokens. Switched it back to that. http://docs.datastax.com/en/cassandra/2.2/cassandra/configuration/configCassandra_yaml.html
After much internal discussion, we're changing the doc and default to 64. This reflects performance questions for Solr nodes.
Chuck has suggested the following changes to the default config in the opscenter.sh json. I'll work on this.
[3:33 PM] Chuck Droukas: Here you go:
endpoint_snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
rpc_address: 0.0.0.0
hinted_handoff_enabled: 'false'
[3:34 PM] Chuck Droukas: For multi-DC:
broadcast_rpc_address: 10.0.0.X
broadcast_rpc_address: 10.1.0.X
num_tokens: 30
phi_convict_threshold: 12
Remove:
initial_token: 4611686018427387901
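One way to read Chuck's list is as a set of per-node overrides on top of the generated cassandra_config. The sketch below is only that reading, with some assumptions: the function names are hypothetical, and using each node's own private IP for broadcast_rpc_address is my interpretation of the 10.0.0.X / 10.1.0.X placeholders. As far as I know, Cassandra requires broadcast_rpc_address to be set to a real address once rpc_address is 0.0.0.0, which is why the two travel together here.

```python
SNITCH = "org.apache.cassandra.locator.GossipingPropertyFileSnitch"


def multi_dc_overrides(private_ip):
    """Settings from the chat above, for a single node (hypothetical helper)."""
    return {
        "endpoint_snitch": SNITCH,
        "rpc_address": "0.0.0.0",
        # Assumption: "10.x.0.X" in the chat means the node's own address.
        "broadcast_rpc_address": private_ip,
        # String value, matching how the generated config quotes this flag.
        "hinted_handoff_enabled": "false",
        "num_tokens": 30,
        "phi_convict_threshold": 12,
    }


def apply_overrides(cassandra_config, private_ip):
    """Return a new dict with the overrides applied and initial_token removed."""
    merged = dict(cassandra_config)
    merged.update(multi_dc_overrides(private_ip))
    merged.pop("initial_token", None)
    return merged
```

For example, `apply_overrides(config["cassandra_config"], "10.1.1.5")` would produce the settings for the first node in west_us. The values themselves (30 tokens, hinted handoff off) are just what was suggested in the chat and are still up for discussion.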