scylladb / scylladb

NoSQL data store using the seastar framework, compatible with Apache Cassandra
http://scylladb.com
GNU Affero General Public License v3.0

Creating a schema with auto-expanded RF on a multi-DC cluster with tablets enabled and one DC consisting of a single zero-token node fails and terminates c-s #20684

Open aleksbykov opened 2 days ago

aleksbykov commented 2 days ago

Packages

Scylla version: 6.2.0~dev-20240916.870d1c16f70c with build-id ba54fa6888566c0694ad3f85d3076f346281c16c

Kernel Version: 6.8.0-1016-aws

Issue description

I have configured a cluster: DC1 (eu-westscylla_node_west): 3 token nodes; DC2 (eu-west-2scylla_node_west): 4 nodes (3 token and 1 zero-token); DC3 (eu-northscylla_node_north): 1 zero-token node.

The cluster was configured with tablets enabled. The following c-s command was started as the workload:

cassandra-stress write cl=LOCAL_QUORUM n=20971520 -schema 'replication(strategy=NetworkTopologyStrategy,replication_factor=3) compaction(strategy=SizeTieredCompactionStrategy)' -port jmx=6868 -mode cql3 native -rate threads=80 -pop seq=1..20971520 -col 'n=FIXED(10) size=FIXED(512)' -log interval=5

But it terminated with an error. Note that nodetool status shows only the token-owning nodes (a known issue):

< t:2024-09-17 16:10:26,966 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.3.157>: Datacenter: eu-west-2scylla_node_west
< t:2024-09-17 16:10:26,966 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.3.157>: =====================================
< t:2024-09-17 16:10:26,966 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.3.157>: Status=Up/Down
< t:2024-09-17 16:10:26,966 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.3.157>: |/ State=Normal/Leaving/Joining/Moving
< t:2024-09-17 16:10:26,966 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.3.157>: -- Address    Load    Tokens Owns Host ID                              Rack
< t:2024-09-17 16:10:26,966 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.3.157>: UN 10.3.0.71  1.67 MB 256    ?    0e1985f8-3e26-430c-a9bb-805f511e844d 2a  
< t:2024-09-17 16:10:26,966 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.3.157>: UN 10.3.1.121 1.69 MB 256    ?    727cd8d2-f3d4-46b7-9d59-f8fcf2800928 2a  
< t:2024-09-17 16:10:26,966 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.3.157>: UN 10.3.2.112 1.62 MB 256    ?    6bb5d7f3-3ad1-4c8a-8067-29dffe026e9e 2a  
< t:2024-09-17 16:10:26,966 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.3.157>: Datacenter: eu-westscylla_node_west
< t:2024-09-17 16:10:26,966 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.3.157>: ===================================
< t:2024-09-17 16:10:26,966 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.3.157>: Status=Up/Down
< t:2024-09-17 16:10:26,966 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.3.157>: |/ State=Normal/Leaving/Joining/Moving
< t:2024-09-17 16:10:26,966 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.3.157>: -- Address    Load    Tokens Owns Host ID                              Rack
< t:2024-09-17 16:10:26,966 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.3.157>: UN 10.4.0.208 1.60 MB 256    ?    6d67e87a-6491-4334-b0db-599874cdc14c 1a  
< t:2024-09-17 16:10:26,966 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.3.157>: UN 10.4.3.157 1.60 MB 256    ?    1a339dc4-8e10-4444-a83b-f7374954e452 1a  
< t:2024-09-17 16:10:26,966 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.3.157>: UN 10.4.3.98  1.51 MB 256    ?    a77bca02-2792-430d-a201-bba7c7916dd0 1a  
< t:2024-09-17 16:10:26,966 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.3.157>: 
< t:2024-09-17 16:10:26,966 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.3.157>: Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless
< t:2024-09-17 16:10:26,974 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: WARN  [main] 2024-09-17 16:10:26,974 ControlConnection.java:1004 - Found invalid row in system.peers: [peer=/10.0.2.104, missing native_transport_address, missing native_transport_port, missing native_transport_port_ssl, tokens=null]. This is likely a gossip or snitch issue, this host will be ignored.
< t:2024-09-17 16:10:26,978 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: WARN  [main] 2024-09-17 16:10:26,977 ControlConnection.java:1004 - Found invalid row in system.peers: [peer=/10.3.1.205, missing native_transport_address, missing native_transport_port, missing native_transport_port_ssl, tokens=null]. This is likely a gossip or snitch issue, this host will be ignored.
< t:2024-09-17 16:10:27,006 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.3.1.185>: WARN  [cluster1-nio-worker-7] 2024-09-17 16:10:26,999 RequestHandler.java:303 - Query '[0 bound values] CREATE KEYSPACE IF NOT EXISTS "keyspace1" WITH replication = {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'replication_factor' : '3'} AND durable_writes = true;' generated server side warning(s): Tables in this keyspace will be replicated using Tablets and will not support CDC, LWT and counters features. To use CDC, LWT or counters, drop this keyspace and re-create it without tablets by adding AND TABLETS = {'enabled': false} to the CREATE KEYSPACE statement.
< t:2024-09-17 16:10:27,094 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: WARN  [main] 2024-09-17 16:10:27,093 ReplicationStategy.java:204 - Error while computing token map for keyspace keyspace1 with datacenter eu-northscylla_node_north: could not achieve replication factor 3 (found 0 replicas only), check your keyspace replication settings.
< t:2024-09-17 16:10:27,162 f:db_log_reader.py l:125  c:sdcm.db_log_reader   p:DEBUG > 2024-09-17T16:10:27.099+00:00 multi-dc-rackaware-feature--db-node-99408bc4-2     !INFO | systemd-logind[575]: New session 21 of user scyllaadm.
< t:2024-09-17 16:10:27,162 f:db_log_reader.py l:125  c:sdcm.db_log_reader   p:DEBUG > 2024-09-17T16:10:27.099+00:00 multi-dc-rackaware-feature--db-node-99408bc4-2     !INFO | systemd[1]: Started session-21.scope - Session 21 of User scyllaadm.
< t:2024-09-17 16:10:27,162 f:db_log_reader.py l:125  c:sdcm.db_log_reader   p:DEBUG > 2024-09-17T16:10:27.099+00:00 multi-dc-rackaware-feature--db-node-99408bc4-2     !INFO | scylla[6486]:  [shard 0: gms] schema_tables - Creating keyspace keyspace1
< t:2024-09-17 16:10:27,162 f:db_log_reader.py l:125  c:sdcm.db_log_reader   p:DEBUG > 2024-09-17T16:10:27.099+00:00 multi-dc-rackaware-feature--db-node-99408bc4-2     !INFO | scylla[6486]:  [shard 0: gms] migration_manager - Gossiping my schema version 5a2b954c-750f-11ef-8ad4-848a90059e08
< t:2024-09-17 16:10:27,162 f:db_log_reader.py l:125  c:sdcm.db_log_reader   p:DEBUG > 2024-09-17T16:10:27.099+00:00 multi-dc-rackaware-feature--db-node-99408bc4-2     !INFO | scylla[6486]:  [shard 0: gms] schema_tables - Schema version changed to 5a2b954c-750f-11ef-8ad4-848a90059e08
< t:2024-09-17 16:10:27,202 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: INFO  [main] 2024-09-17 16:10:27,201 RackAwareRoundRobinPolicy.java:112 - Using provided data-center name 'eu-westscylla_node_west' for RackAwareRoundRobinPolicy
< t:2024-09-17 16:10:27,202 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: INFO  [main] 2024-09-17 16:10:27,201 RackAwareRoundRobinPolicy.java:114 - Using provided rack name '1a' for RackAwareRoundRobinPolicy
< t:2024-09-17 16:10:27,204 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: INFO  [main] 2024-09-17 16:10:27,203 Cluster.java:1810 - New Cassandra host /10.3.1.121:9042 added
< t:2024-09-17 16:10:27,204 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: INFO  [main] 2024-09-17 16:10:27,203 Cluster.java:1810 - New Cassandra host /10.4.3.98:9042 added
< t:2024-09-17 16:10:27,204 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: INFO  [main] 2024-09-17 16:10:27,203 Cluster.java:1810 - New Cassandra host /10.4.3.157:9042 added
< t:2024-09-17 16:10:27,204 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: INFO  [main] 2024-09-17 16:10:27,203 Cluster.java:1810 - New Cassandra host /10.3.2.112:9042 added
< t:2024-09-17 16:10:27,204 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: INFO  [main] 2024-09-17 16:10:27,204 Cluster.java:1810 - New Cassandra host /10.4.0.208:9042 added
< t:2024-09-17 16:10:27,204 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: INFO  [main] 2024-09-17 16:10:27,204 Cluster.java:1810 - New Cassandra host /10.3.0.71:9042 added
< t:2024-09-17 16:10:27,205 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: Connected to cluster: multi-dc-rackaware-feature--db-cluster-99408bc4, max pending requests per connection null, max connections per host 8
< t:2024-09-17 16:10:27,206 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: Datatacenter: eu-westscylla_node_west; Host: /10.4.3.98; Rack: 1a
< t:2024-09-17 16:10:27,206 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: Datatacenter: eu-west-2scylla_node_west; Host: /10.3.2.112; Rack: 2a
< t:2024-09-17 16:10:27,206 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: Datatacenter: eu-westscylla_node_west; Host: /10.4.0.208; Rack: 1a
< t:2024-09-17 16:10:27,207 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: Datatacenter: eu-west-2scylla_node_west; Host: /10.3.1.121; Rack: 2a
< t:2024-09-17 16:10:27,207 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: Datatacenter: eu-west-2scylla_node_west; Host: /10.3.0.71; Rack: 2a
< t:2024-09-17 16:10:27,208 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: Datatacenter: eu-westscylla_node_west; Host: /10.4.3.157; Rack: 1a
< t:2024-09-17 16:10:27,214 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: WARN  [cluster1-worker-0] 2024-09-17 16:10:27,213 ReplicationStategy.java:204 - Error while computing token map for keyspace keyspace1 with datacenter eu-northscylla_node_north: could not achieve replication factor 3 (found 0 replicas only), check your keyspace replication settings.
< t:2024-09-17 16:10:27,226 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: INFO  [main] 2024-09-17 16:10:27,223 HostConnectionPool.java:200 - Using advanced port-based shard awareness with /10.4.3.98:9042
< t:2024-09-17 16:10:27,275 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: INFO  [main] 2024-09-17 16:10:27,274 HostConnectionPool.java:200 - Using advanced port-based shard awareness with /10.4.3.157:9042
< t:2024-09-17 16:10:27,294 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: INFO  [main] 2024-09-17 16:10:27,293 HostConnectionPool.java:200 - Using advanced port-based shard awareness with /10.4.0.208:9042

< t:2024-09-17 16:10:27,214 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.0.249>: WARN  [cluster1-worker-0] 2024-09-17 16:10:27,213 ReplicationStategy.java:204 - Error while computing token map for keyspace keyspace1 with datacenter eu-northscylla_node_north: could not achieve replication factor 3 (found 0 replicas only), check your keyspace replication settings.
The query was: `CREATE KEYSPACE IF NOT EXISTS "keyspace1" WITH replication = {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'replication_factor' : '3'} AND durable_writes = true;` It generated the server-side warning: Tables in this keyspace will be replicated using Tablets and will not support CDC, LWT and counters features. To use CDC, LWT or counters, drop this keyspace and re-create it without tablets by adding AND TABLETS = {'enabled': false} to the CREATE KEYSPACE statement.
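
For completeness, the statement the warning suggests (keeping CDC/LWT/counters by disabling tablets for this keyspace) would look roughly like this:

-- Re-creating the keyspace without tablets, as the server-side warning suggests:
CREATE KEYSPACE keyspace1
    WITH replication = {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy',
                        'replication_factor': '3'}
    AND durable_writes = true
    AND tablets = {'enabled': false};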

The c-s log contains the following errors:

Connected to cluster: multi-dc-rackaware-feature--db-cluster-99408bc4, max pending requests per connection null, max connections per host 8
Datatacenter: eu-westscylla_node_west; Host: /10.4.3.98; Rack: 1a
Datatacenter: eu-west-2scylla_node_west; Host: /10.3.2.112; Rack: 2a
Datatacenter: eu-westscylla_node_west; Host: /10.4.0.208; Rack: 1a
Datatacenter: eu-west-2scylla_node_west; Host: /10.3.1.121; Rack: 2a
Datatacenter: eu-west-2scylla_node_west; Host: /10.3.0.71; Rack: 2a
Datatacenter: eu-westscylla_node_west; Host: /10.4.3.157; Rack: 1a
WARN  [cluster1-worker-0] 2024-09-17 16:10:27,213 ReplicationStategy.java:204 - Error while computing token map for keyspace keyspace1 with datacenter eu-northscylla_node_north: could not achieve replication factor 3 (found 0 replicas only), check your keyspace replication settings.
INFO  [main] 2024-09-17 16:10:27,223 HostConnectionPool.java:200 - Using advanced port-based shard awareness with /10.4.3.98:9042
INFO  [main] 2024-09-17 16:10:27,274 HostConnectionPool.java:200 - Using advanced port-based shard awareness with /10.4.3.157:9042
INFO  [main] 2024-09-17 16:10:27,293 HostConnectionPool.java:200 - Using advanced port-based shard awareness with /10.4.0.208:9042
WARN  [cluster1-worker-2] 2024-09-17 16:10:28,143 ReplicationStategy.java:204 - Error while computing token map for keyspace keyspace1 with datacenter eu-northscylla_node_north: could not achieve replication factor 3 (found 0 replicas only), check your keyspace replication settings.
java.lang.RuntimeException: Encountered exception creating schema
        at org.apache.cassandra.stress.settings.SettingsSchema.createKeySpacesNative(SettingsSchema.java:105)
        at org.apache.cassandra.stress.settings.SettingsSchema.createKeySpaces(SettingsSchema.java:74)
        at org.apache.cassandra.stress.settings.StressSettings.maybeCreateKeyspaces(StressSettings.java:230)
        at org.apache.cassandra.stress.StressAction.run(StressAction.java:58)
        at org.apache.cassandra.stress.Stress.run(Stress.java:143)
        at org.apache.cassandra.stress.Stress.main(Stress.java:62)
Caused by: com.datastax.driver.core.exceptions.InvalidConfigurationInQueryException: Datacenter eu-northscylla_node_north doesn't have enough token-owning nodes for replication_factor=3
        at com.datastax.driver.core.exceptions.InvalidConfigurationInQueryException.copy(InvalidConfigurationInQueryException.java:38)
        at com.datastax.driver.core.exceptions.InvalidConfigurationInQueryException.copy(InvalidConfigurationInQueryException.java:27)
        at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:35)
        at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:310)
        at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:58)
        at org.apache.cassandra.stress.util.JavaDriverClient.execute(JavaDriverClient.java:215)
        at org.apache.cassandra.stress.settings.SettingsSchema.createKeySpacesNative(SettingsSchema.java:94)
        ... 5 more

It seems that replication-factor auto-expansion in a multi-DC cluster with zero-token nodes does not work with tablets.
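
For reference, auto-expansion takes the single replication_factor option and applies it to every DC, including the zero-token one, which then cannot provide 3 token-owning replicas. A rough CQL-level sketch of the two forms (DC names taken from this run; compare the DESCRIBE output from the tablets-disabled run further below):

-- What cassandra-stress sends:
CREATE KEYSPACE IF NOT EXISTS keyspace1
    WITH replication = {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy',
                        'replication_factor': '3'}
    AND durable_writes = true;

-- What that option effectively expands to on the server:
-- {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy',
--  'eu-northscylla_node_north': '3',
--  'eu-west-2scylla_node_west': '3',
--  'eu-westscylla_node_west': '3'}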

When run via https://jenkins.scylladb.com/job/scylla-staging/job/abykov/job/longevity-multi-dc-rack-aware-zero-token-dc/11/ with exactly the same configuration but tablets disabled, c-s reported the same warning but continued its work and started sending traffic:

Datatacenter: eu-west-2scylla_node_west; Host: /10.3.1.171; Rack: 2a
Datatacenter: eu-westscylla_node_west; Host: /10.4.3.121; Rack: 1a
INFO  [main] 2024-09-18 12:55:58,068 HostConnectionPool.java:200 - Using advanced port-based shard awareness with /10.4.1.4:9042
INFO  [main] 2024-09-18 12:55:58,099 HostConnectionPool.java:200 - Using advanced port-based shard awareness with /10.4.3.214:9042
INFO  [main] 2024-09-18 12:55:58,115 HostConnectionPool.java:200 - Using advanced port-based shard awareness with /10.4.3.121:9042
WARN  [cluster1-worker-1] 2024-09-18 12:55:59,182 ReplicationStategy.java:204 - Error while computing token map for keyspace keyspace1 with datacenter eu-northscylla_node_north: could not achieve replication factor 3 (found 0 replicas only), check your keyspace replication settings.
WARN  [cluster1-worker-3] 2024-09-18 12:56:02,625 ReplicationStategy.java:204 - Error while computing token map for keyspace keyspace1 with datacenter eu-northscylla_node_north: could not achieve replication factor 3 (found 0 replicas only), check your keyspace replication settings.
Created keyspaces. Sleeping 3s for propagation.
Sleeping 2s...
Running WRITE with 80 threads for 20971520 iteration
type       total ops,    op/s,    pk/s,   row/s,    mean,     med,     .95,     .99,    .999,     max,   time,   stderr, errors,  gc: #,  max ms,  sum ms,  sdv ms,      mb
Failed to connect over JMX; not collecting these stats
total,         88601,   17720,   17720,   17720,     3.5,     2.5,     9.5,    19.1,    45.1,    79.2,    5.0,  0.00000,      0,      0,       0,       0,       0,       0
total,        256394,   33559,   33559,   33559,     

The keyspace was created with the following RF and DCs:

cqlsh> desc keyspace1;

CREATE KEYSPACE keyspace1 WITH replication = {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'eu-northscylla_node_north': '3', 'eu-west-2scylla_node_west': '3', 'eu-westscylla_node_west': '3'} AND durable_writes = true;

CREATE TABLE keyspace1.standard1 (
    key blob,
    "C0" blob,
    "C1" blob,
    "C2" blob,
    "C3" blob,
    "C4" blob,
    "C5" blob,
    "C6" blob,
    "C7" blob,
    "C8" blob,
    "C9" blob,
    PRIMARY KEY (key)
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
    AND comment = ''
    AND compaction = {'class': 'SizeTieredCompactionStrategy'}
    AND compression = {}
    AND crc_check_chance = 1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND speculative_retry = '99.0PERCENTILE'
    AND tombstone_gc = {'mode': 'timeout', 'propagation_delay_in_seconds': '3600'};

Impact

Describe the impact this issue causes to the user.

How frequently does it reproduce?

Describe the frequency with how this issue can be reproduced.

Installation details

Cluster size: 6 nodes (i4i.2xlarge)

Scylla Nodes used in this run:

OS / Image: ami-02a7f524e91e35664 ami-0094c7e2362289d20 ami-0e00401f614e6eb3a (aws: undefined_region)

Test: longevity-multi-dc-rack-aware-zero-token-dc
Test id: 99408bc4-ff3a-48e0-9b37-65c3cf001435
Test name: scylla-staging/abykov/longevity-multi-dc-rack-aware-zero-token-dc
Test method: longevity_test.LongevityTest.test_custom_time
Test config file(s):

Logs and commands

- Restore Monitor Stack command: `$ hydra investigate show-monitor 99408bc4-ff3a-48e0-9b37-65c3cf001435`
- Restore monitor on AWS instance using Jenkins job: https://jenkins.scylladb.com/view/QA/job/QA-tools/job/hydra-show-monitor/parambuild/?test_id=99408bc4-ff3a-48e0-9b37-65c3cf001435
- Show all stored logs command: `$ hydra investigate show-logs 99408bc4-ff3a-48e0-9b37-65c3cf001435`

Logs:

- db-cluster-99408bc4.tar.gz: https://cloudius-jenkins-test.s3.amazonaws.com/99408bc4-ff3a-48e0-9b37-65c3cf001435/20240917_161358/db-cluster-99408bc4.tar.gz
- sct-runner-events-99408bc4.tar.gz: https://cloudius-jenkins-test.s3.amazonaws.com/99408bc4-ff3a-48e0-9b37-65c3cf001435/20240917_161358/sct-runner-events-99408bc4.tar.gz
- sct-99408bc4.log.tar.gz: https://cloudius-jenkins-test.s3.amazonaws.com/99408bc4-ff3a-48e0-9b37-65c3cf001435/20240917_161358/sct-99408bc4.log.tar.gz
- loader-set-99408bc4.tar.gz: https://cloudius-jenkins-test.s3.amazonaws.com/99408bc4-ff3a-48e0-9b37-65c3cf001435/20240917_161358/loader-set-99408bc4.tar.gz
- monitor-set-99408bc4.tar.gz: https://cloudius-jenkins-test.s3.amazonaws.com/99408bc4-ff3a-48e0-9b37-65c3cf001435/20240917_161358/monitor-set-99408bc4.tar.gz

Jenkins job URL: https://jenkins.scylladb.com/job/scylla-staging/job/abykov/job/longevity-multi-dc-rack-aware-zero-token-dc/10/
Argus: https://argus.scylladb.com/test/bbd702fb-2f87-4b0b-a068-c2c83d74cb77/runs?additionalRuns[]=99408bc4-ff3a-48e0-9b37-65c3cf001435
kbr-scylla commented 2 days ago

CREATE KEYSPACE IF NOT EXISTS "keyspace1" WITH replication = {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'replication_factor' : '3'}

Isn't replication factor auto-expansion supposed to be rejected in tablets mode? cc @ptrsmrn @bhalevy

bhalevy commented 2 days ago

CREATE KEYSPACE IF NOT EXISTS "keyspace1" WITH replication = {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'replication_factor' : '3'}

Isn't replication factor auto-expansion supposed to be rejected in tablets mode? cc @ptrsmrn @bhalevy

Only in ALTER KEYSPACE. It's fine in CREATE
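
A minimal CQL sketch of that distinction, using a hypothetical keyspace ks:

-- Auto-expansion of replication_factor is accepted at CREATE time:
CREATE KEYSPACE ks
    WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': '3'};

-- ...but, per the comment above, it is expected to be rejected for a tablets
-- keyspace when used in ALTER:
ALTER KEYSPACE ks
    WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': '4'};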

kbr-scylla commented 4 hours ago

We should reject the attempts to assign more RF than there are nodes in a given DC, as described in https://github.com/scylladb/scylladb/issues/20356. (Whether through auto-expansion or not.)

Then the initial CREATE here would fail, and the test would have to be adjusted; BTW, you can do that right away @aleksbykov so it doesn't block further testing of zero-token nodes.
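
One possible adjustment (a sketch only, assuming the test pre-creates the keyspace before starting c-s; since c-s issues CREATE KEYSPACE IF NOT EXISTS, the pre-created definition would take effect):

-- Explicit per-DC RF, omitting the zero-token DC eu-northscylla_node_north:
CREATE KEYSPACE IF NOT EXISTS keyspace1
    WITH replication = {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy',
                        'eu-westscylla_node_west': '3',
                        'eu-west-2scylla_node_west': '3'}
    AND durable_writes = true;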

Still, it's interesting why the behavior differs: with tablets c-s fails outright, while with vnodes it only warns and continues.

I suspect that this is somehow related to the tablets-specific routing logic in the driver. The vnodes logic somehow handles the situation (perhaps ignoring the DC and routing to other DCs?) while the tablets logic doesn't.

cc @fruch @dkropachev @sylwiaszunejko

But I'm not sure.

I also suspect that the error doesn't require zero-token nodes:

Error while computing token map for keyspace keyspace1 with datacenter eu-northscylla_node_north: could not achieve replication factor 3 (found 0 replicas only), check your keyspace replication settings

it just requires the RF to be greater than the number of nodes in this DC.
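
If so, a hypothetical minimal reproduction would need no zero-token nodes at all, only a tablets-enabled cluster where some DC has fewer than 3 token-owning nodes:

-- Hypothetical: with tablets enabled and a DC holding fewer than 3 token-owning
-- nodes, this is expected to fail the same way.
CREATE KEYSPACE repro_ks
    WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': '3'};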

Anyway, I think the investigation should start from the drivers angle, and if it turns out that there's a difference between the vnodes and tablets logic, a decision should be made on whether to handle it differently in the drivers, or perhaps rely on https://github.com/scylladb/scylladb/issues/20356 for a complete fix.

Assigning to the drivers team for now.