k8ssandra / k8ssandra-operator

The Kubernetes operator for K8ssandra
https://k8ssandra.io/
Apache License 2.0

Upgrading Cassandra Version 3 to 4 not possible #1110

Closed (ClusterJan closed this issue 10 months ago)

ClusterJan commented 11 months ago

What happened?

When upgrading Cassandra from 3.11.16 to 4.1.2, the upgrade fails because of a cluster name mismatch.

Did you expect to see something different?

The cluster name should not change, and the upgrade should work flawlessly.

How to reproduce it (as minimally and precisely as possible):

Environment

apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: cass-jaschwege-ri-k8ssandra
  namespace: jaschwege
spec:
  auth: false
  cassandra:
    clusterName: jaschwege-cluster
    datacenters:
    - config:
        cassandraYaml:
          concurrent_compactors: 6
          num_tokens: 256
        jvmOptions:
          gc: G1GC
          heap_initial_size: 1G
          heap_max_size: 1G
      metadata:
        name: kubernetes-1
      perNodeConfigInitContainerImage: mikefarah/yq:4
      size: 3
      stopped: false
      storageConfig:
        cassandraDataVolumeClaimSpec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 5Gi
          storageClassName: ebs-k8ssandra
    networking:
      hostNetwork: false
    perNodeConfigInitContainerImage: mikefarah/yq:4
    racks:
    - name: eu-central-1a
      nodeAffinityLabels:
        topology.kubernetes.io/zone: eu-central-1a
    resources:
      limits:
        cpu: 1
        memory: 2Gi
      requests:
        cpu: 1
        memory: 2Gi
    serverType: cassandra
    serverVersion: 3.11.16
    superuserSecretRef:
      name: jaschwege-cluster-superuser
    telemetry:
      mcac:
        enabled: false
      prometheus:
        enabled: false
      vector:
        components:
          sinks:
          - config: endpoint = "http://prometheus-0.prometheus.system-monitoring.svc.cluster.local:9090/api/v1/write"
            inputs:
            - enrich_host_metrics
            - cassandra_metrics
            name: prometheus_remote_write
            type: prometheus_remote_write
          sources:
          - config: |-
              filesystem.devices.excludes = ["binfmt_misc"]
              filesystem.filesystems.excludes = ["binfmt_misc"]
              filesystem.mountpoints.excludes = ["*/proc/sys/fs/binfmt_misc"]
              scrape_interval_secs = 30
            name: host_metrics
            type: host_metrics
          transforms:
          - config: |
              source = '''
              .tags.cluster = get_env_var!("CLUSTER_NAME")
              .tags.datacenter = get_env_var!("DATACENTER_NAME")
              .tags.namespace = get_env_var!("NAMESPACE")
              .tags.rack = get_env_var!("RACK_NAME")
              '''
            inputs:
            - host_metrics
            name: enrich_host_metrics
            type: remap
        enabled: true
        resources:
          limits:
            cpu: 500m
            memory: 500Mi
          requests:
            cpu: 500m
            memory: 500Mi
  secretsProvider: internal
system.log
INFO  [main] 2023-11-03 14:16:46,579 SystemDistributedReplicationInterceptor.java:115 - Using override for distributed system keyspaces: {kubernetes-1=3, class=NetworkTopologyStrategy}
INFO  [main] 2023-11-03 14:16:47,983 YamlConfigurationLoader.java:104 - Configuration location: file:/etc/cassandra/cassandra.yaml
INFO  [main] 2023-11-03 14:16:48,666 Config.java:1171 - Node configuration:[allocate_tokens_for_keyspace=null; allocate_tokens_for_local_replication_factor=3; allow_extra_insecure_udfs=false; allow_filtering_enabled=true; allow_insecure_u
dfs=false; audit_logging_options=AuditLogOptions{enabled=false, logger='BinAuditLogger', included_keyspaces='', excluded_keyspaces='system,system_schema,system_virtual_schema', included_categories='', excluded_categories='', included_user
s='', excluded_users='', audit_logs_dir='/opt/cassandra/logs/audit', archive_command='', roll_cycle='HOURLY', block=true, max_queue_weight=268435456, max_log_size=17179869184, max_archive_retries=10}; auth_cache_warming_enabled=false; aut
h_read_consistency_level=LOCAL_QUORUM; auth_write_consistency_level=EACH_QUORUM; authenticator=AllowAllAuthenticator; authorizer=AllowAllAuthorizer; auto_bootstrap=true; auto_hints_cleanup_enabled=false; auto_optimise_full_repair_streams=
false; auto_optimise_inc_repair_streams=false; auto_optimise_preview_repair_streams=false; auto_snapshot=true; auto_snapshot_ttl=null; autocompaction_on_startup_enabled=true; automatic_sstable_upgrade=false; available_processors=-1; back_
pressure_enabled=false; back_pressure_strategy=null; batch_size_fail_threshold=50KiB; batch_size_warn_threshold=5KiB; batchlog_replay_throttle=1024KiB; block_for_peers_in_remote_dcs=false; block_for_peers_timeout_in_secs=10; broadcast_add
ress=null; broadcast_rpc_address=10.42.207.81; buffer_pool_use_heap_if_exhausted=false; cache_load_timeout=30s; cas_contention_timeout=1000ms; cdc_block_writes=true; cdc_enabled=false; cdc_free_space_check_interval=250ms; cdc_raw_director
y=null; cdc_total_space=0MiB; check_for_duplicate_rows_during_compaction=true; check_for_duplicate_rows_during_reads=true; client_encryption_options=<REDACTED>; client_error_reporting_exclusions=SubnetGroups{subnets=[]}; cluster_name=Test
 Cluster; collection_size_fail_threshold=null; collection_size_warn_threshold=null; column_index_cache_size=2KiB; column_index_size=64KiB; columns_per_table_fail_threshold=-1; columns_per_table_warn_threshold=-1; commit_failure_policy=sto
p; commitlog_compression=null; commitlog_directory=null; commitlog_max_compression_buffers_in_pool=3; commitlog_periodic_queue_size=-1; commitlog_segment_size=32MiB; commitlog_sync=periodic; commitlog_sync_batch_window_in_ms=NaN; commitlo
g_sync_group_window=0ms; commitlog_sync_period=10000ms; commitlog_total_space=null; compact_tables_enabled=true; compaction_large_partition_warning_threshold=100MiB; compaction_throughput=64MiB/s; compaction_tombstone_warning_threshold=10
0000; concurrent_compactors=6; concurrent_counter_writes=32; concurrent_materialized_view_builders=1; concurrent_materialized_view_writes=32; concurrent_reads=32; concurrent_replicates=null; concurrent_validations=0; concurrent_writes=32;
 consecutive_message_errors_threshold=1; coordinator_read_size_fail_threshold=null; coordinator_read_size_warn_threshold=null; corrupted_tombstone_strategy=disabled; counter_cache_keys_to_save=2147483647; counter_cache_save_period=7200s;
counter_cache_size=null; counter_write_request_timeout=5000ms; credentials_cache_active_update=false; credentials_cache_max_entries=1000; credentials_update_interval=null; credentials_validity=2000ms; data_disk_usage_max_disk_size=null; d
ata_disk_usage_percentage_fail_threshold=-1; data_disk_usage_percentage_warn_threshold=-1; data_file_directories=[Ljava.lang.String;@33956d1a; default_keyspace_rf=1; denylist_consistency_level=QUORUM; denylist_initial_load_retry=5s; denyl
ist_max_keys_per_table=1000; denylist_max_keys_total=10000; denylist_range_reads_enabled=true; denylist_reads_enabled=true; denylist_refresh=600s; denylist_writes_enabled=true; diagnostic_events_enabled=false; disk_access_mode=auto; disk_
failure_policy=stop; disk_optimization_estimate_percentile=0.95; disk_optimization_page_cross_chance=0.1; disk_optimization_strategy=ssd; drop_compact_storage_enabled=false; drop_truncate_table_enabled=true; dynamic_snitch=true; dynamic_s
nitch_badness_threshold=1.0; dynamic_snitch_reset_interval=600000ms; dynamic_snitch_update_interval=100ms; endpoint_snitch=GossipingPropertyFileSnitch; entire_sstable_inter_dc_stream_throughput_outbound=24MiB/s; entire_sstable_stream_thro
ughput_outbound=24MiB/s; failure_detector=FailureDetector; fields_per_udt_fail_threshold=-1; fields_per_udt_warn_threshold=-1; file_cache_enabled=false; file_cache_round_up=null; file_cache_size=null; flush_compression=fast; force_new_pre
pared_statement_behaviour=false; full_query_logging_options=FullQueryLoggerOptions{log_dir='', archive_command='', roll_cycle='HOURLY', block=true, max_queue_weight=268435456, max_log_size=17179869184}; gc_log_threshold=200ms; gc_warn_thr
eshold=1s; group_by_enabled=true; hint_window_persistent_enabled=true; hinted_handoff_disabled_datacenters=[]; hinted_handoff_enabled=true; hinted_handoff_throttle=1024KiB; hints_compression=null; hints_directory=null; hints_flush_period=
10000ms; ideal_consistency_level=null; in_select_cartesian_product_fail_threshold=-1; in_select_cartesian_product_warn_threshold=-1; incremental_backups=false; index_summary_capacity=null; index_summary_resize_interval=60m; initial_range_
tombstone_list_allocation_size=1; initial_token=null; inter_dc_stream_throughput_outbound=24MiB/s; inter_dc_tcp_nodelay=false; internode_application_receive_queue_capacity=4MiB; internode_application_receive_queue_reserve_endpoint_capacit
y=128MiB; internode_application_receive_queue_reserve_global_capacity=512MiB; internode_application_send_queue_capacity=4MiB; internode_application_send_queue_reserve_endpoint_capacity=128MiB; internode_application_send_queue_reserve_glob
al_capacity=512MiB; internode_authenticator=null; internode_compression=dc; internode_error_reporting_exclusions=SubnetGroups{subnets=[]}; internode_max_message_size=null; internode_socket_receive_buffer_size=0B; internode_socket_send_buf
fer_size=0B; internode_streaming_tcp_user_timeout=300s; internode_tcp_connect_timeout=2s; internode_tcp_user_timeout=30s; internode_timeout=true; items_per_collection_fail_threshold=-1; items_per_collection_warn_threshold=-1; key_cache_ke
ys_to_save=2147483647; key_cache_migrate_during_compaction=true; key_cache_save_period=4h; key_cache_size=null; keyspace_count_warn_threshold=40; keyspaces_fail_threshold=-1; keyspaces_warn_threshold=-1; listen_address=10.42.207.81; liste
n_interface=null; listen_interface_prefer_ipv6=false; listen_on_broadcast_address=false; local_read_size_fail_threshold=null; local_read_size_warn_threshold=null; local_system_data_file_directory=null; materialized_views_enabled=false; ma
terialized_views_per_table_fail_threshold=-1; materialized_views_per_table_warn_threshold=-1; max_concurrent_automatic_sstable_upgrades=1; max_hint_window=3h; max_hints_delivery_threads=2; max_hints_file_size=128MiB; max_hints_size_per_ho
st=0B; max_mutation_size=null; max_streaming_retries=3; max_top_size_partition_count=10; max_top_tombstone_partition_count=10; max_value_size=256MiB; memtable=null; memtable_allocation_type=heap_buffers; memtable_cleanup_threshold=null; m
emtable_flush_writers=0; memtable_heap_space=null; memtable_offheap_space=null; min_free_space_per_drive=50MiB; min_tracked_partition_size=1MiB; min_tracked_partition_tombstone_count=5000; minimum_replication_factor_fail_threshold=-1; min
imum_replication_factor_warn_threshold=-1; native_transport_allow_older_protocols=true; native_transport_flush_in_batches_legacy=false; native_transport_idle_timeout=0ms; native_transport_max_concurrent_connections=-1; native_transport_ma
x_concurrent_connections_per_ip=-1; native_transport_max_frame_size=16MiB; native_transport_max_negotiable_protocol_version=null; native_transport_max_request_data_in_flight=null; native_transport_max_request_data_in_flight_per_ip=null; n
ative_transport_max_requests_per_second=1000000; native_transport_max_threads=128; native_transport_port=9042; native_transport_port_ssl=null; native_transport_rate_limiting_enabled=false; native_transport_receive_queue_capacity=1MiB; net
work_authorizer=AllowAllNetworkAuthorizer; networking_cache_size=null; num_tokens=256; otc_backlog_expiration_interval_ms=200; otc_coalescing_enough_coalesced_messages=8; otc_coalescing_strategy=DISABLED; otc_coalescing_window_us=200; pag
e_size_fail_threshold=-1; page_size_warn_threshold=-1; partition_denylist_enabled=false; partition_keys_in_select_fail_threshold=-1; partition_keys_in_select_warn_threshold=-1; partitioner=org.apache.cassandra.dht.Murmur3Partitioner; paxo
s_cache_size=null; paxos_contention_max_wait=null; paxos_contention_min_delta=null; paxos_contention_min_wait=null; paxos_contention_wait_randomizer=null; paxos_on_linearizability_violations=ignore; paxos_purge_grace_period=60s; paxos_rep
air_enabled=true; paxos_repair_parallelism=-1; paxos_state_purging=null; paxos_topology_repair_no_dc_checks=false; paxos_topology_repair_strict_each_quorum=false; paxos_variant=v1; periodic_commitlog_sync_lag_block=null; permissions_cache
_active_update=false; permissions_cache_max_entries=1000; permissions_update_interval=null; permissions_validity=2000ms; phi_convict_threshold=8.0; prepared_statements_cache_size=null; range_request_timeout=10000ms; range_tombstone_list_g
rowth_factor=1.5; read_before_write_list_operations_enabled=true; read_consistency_levels_disallowed=[]; read_consistency_levels_warned=[]; read_request_timeout=5000ms; read_thresholds_enabled=false; reject_repair_compaction_threshold=214
7483647; repair_command_pool_full_strategy=queue; repair_command_pool_size=0; repair_request_timeout=120000ms; repair_session_max_tree_depth=null; repair_session_space=null; repair_state_expires=3d; repair_state_size=100000; repaired_data
_tracking_for_partition_reads_enabled=false; repaired_data_tracking_for_range_reads_enabled=false; report_unconfirmed_repaired_data_mismatches=false; request_timeout=10000ms; role_manager=CassandraRoleManager; roles_cache_active_update=fa
lse; roles_cache_max_entries=1000; roles_update_interval=null; roles_validity=2000ms; row_cache_class_name=org.apache.cassandra.cache.OHCProvider; row_cache_keys_to_save=2147483647; row_cache_save_period=0s; row_cache_size=0MiB; row_index
_read_size_fail_threshold=null; row_index_read_size_warn_threshold=null; rpc_address=0.0.0.0; rpc_interface=null; rpc_interface_prefer_ipv6=false; rpc_keepalive=true; sasi_indexes_enabled=false; saved_caches_directory=null; scripted_user_
defined_functions_enabled=false; secondary_indexes_enabled=true; secondary_indexes_per_table_fail_threshold=-1; secondary_indexes_per_table_warn_threshold=-1; seed_provider=org.apache.cassandra.locator.K8SeedProvider{seeds=jaschwege-clust
er-seed-service,jaschwege-cluster-kubernetes-1-additional-seed-service}; server_encryption_options=<REDACTED>; skip_paxos_repair_on_topology_change=false; skip_paxos_repair_on_topology_change_keyspaces=[]; slow_query_log_timeout=500ms; sn
apshot_before_compaction=false; snapshot_links_per_second=0; snapshot_on_duplicate_row_detection=false; snapshot_on_repaired_data_mismatch=false; ssl_storage_port=7001; sstable_preemptive_open_interval=50MiB; start_native_transport=true;
startup_checks={}; storage_port=7000; stream_entire_sstables=true; stream_throughput_outbound=24MiB/s; streaming_connections_per_host=1; streaming_keep_alive_period=300s; streaming_slow_events_log_timeout=10s; streaming_state_expires=3d;
streaming_state_size=40MiB; streaming_stats_enabled=true; table_count_warn_threshold=150; table_properties_disallowed=[]; table_properties_ignored=[]; table_properties_warned=[]; tables_fail_threshold=-1; tables_warn_threshold=-1; tombsto
ne_failure_threshold=100000; tombstone_warn_threshold=1000; top_partitions_enabled=true; trace_type_query_ttl=1d; trace_type_repair_ttl=7d; transient_replication_enabled=false; transparent_data_encryption_options=org.apache.cassandra.conf
ig.TransparentDataEncryptionOptions@2e1ddc90; traverse_auth_from_root=false; trickle_fsync=false; trickle_fsync_interval=10240KiB; truncate_request_timeout=60000ms; uncompressed_tables_enabled=true; unlogged_batch_across_partitions_warn_t
hreshold=10; use_deterministic_table_id=false; use_offheap_merkle_trees=true; use_statements_enabled=true; user_defined_functions_enabled=false; user_defined_functions_fail_timeout=1500ms; user_defined_functions_threads_enabled=true; user
_defined_functions_warn_timeout=500ms; user_function_timeout_policy=die; user_timestamps_enabled=true; uuid_sstable_identifiers_enabled=false; validation_preview_purge_head_start=3600s; windows_timer_interval=0; write_consistency_levels_d
isallowed=[]; write_consistency_levels_warned=[]; write_request_timeout=2000ms]
INFO  [main] 2023-11-03 14:16:48,668 DatabaseDescriptor.java:460 - DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
INFO  [main] 2023-11-03 14:16:48,669 DatabaseDescriptor.java:514 - Global memtable on-heap threshold is enabled at 240MiB
INFO  [main] 2023-11-03 14:16:48,669 DatabaseDescriptor.java:518 - Global memtable off-heap threshold is enabled at 240MiB

...

INFO  [main] 2023-11-03 14:16:50,089 CassandraDaemon.java:638 - Classpath: /opt/cassandra/conf:/opt/cassandra/lib/HdrHistogram-2.1.9.jar:/opt/cassandra/lib/ST4-4.0.8.jar:/opt/cassandra/lib/airline-0.8.jar:/opt/cassandra/lib/antlr-runtime-
3.5.2.jar:/opt/cassandra/lib/apache-cassandra-4.1.2.jar:/opt/cassandra/lib/asm-9.1.jar:/opt/cassandra/lib/caffeine-2.9.2.jar:/opt/cassandra/lib/cassandra-driver-core-3.11.0-shaded.jar:/opt/cassandra/lib/checker-qual-3.10.0.jar:/opt/cassan
dra/lib/chronicle-bytes-2.20.111.jar:/opt/cassandra/lib/chronicle-core-2.20.126.jar:/opt/cassandra/lib/chronicle-queue-5.20.123.jar:/opt/cassandra/lib/chronicle-threads-2.20.111.jar:/opt/cassandra/lib/chronicle-wire-2.20.117.jar:/opt/cass
andra/lib/commons-cli-1.1.jar:/opt/cassandra/lib/commons-codec-1.9.jar:/opt/cassandra/lib/commons-lang3-3.11.jar:/opt/cassandra/lib/commons-math3-3.2.jar:/opt/cassandra/lib/concurrent-trees-2.4.0.jar:/opt/cassandra/lib/ecj-4.6.1.jar:/opt/
cassandra/lib/error_prone_annotations-2.5.1.jar:/opt/cassandra/lib/guava-27.0-jre.jar:/opt/cassandra/lib/high-scale-lib-1.0.6.jar:/opt/cassandra/lib/hppc-0.8.1.jar:/opt/cassandra/lib/ipaddress-5.3.3.jar:/opt/cassandra/lib/j2objc-annotatio
ns-1.3.jar:/opt/cassandra/lib/jackson-annotations-2.13.2.jar:/opt/cassandra/lib/jackson-core-2.13.2.jar:/opt/cassandra/lib/jackson-databind-2.13.2.2.jar:/opt/cassandra/lib/jackson-datatype-jsr310-2.13.2.jar:/opt/cassandra/lib/jamm-0.3.2.j
ar:/opt/cassandra/lib/java-cup-runtime-11b-20160615.jar:/opt/cassandra/lib/javax.inject-1.jar:/opt/cassandra/lib/jbcrypt-0.4.jar:/opt/cassandra/lib/jcl-over-slf4j-1.7.25.jar:/opt/cassandra/lib/jcommander-1.30.jar:/opt/cassandra/lib/jctool
s-core-3.1.0.jar:/opt/cassandra/lib/jflex-1.8.2.jar:/opt/cassandra/lib/jna-5.9.0.jar:/opt/cassandra/lib/json-simple-1.1.jar:/opt/cassandra/lib/jsr305-2.0.2.jar:/opt/cassandra/lib/jvm-attach-api-1.5.jar:/opt/cassandra/lib/log4j-over-slf4j-
1.7.25.jar:/opt/cassandra/lib/logback-classic-1.2.9.jar:/opt/cassandra/lib/logback-core-1.2.9.jar:/opt/cassandra/lib/lz4-java-1.8.0.jar:/opt/cassandra/lib/metrics-core-3.1.5.jar:/opt/cassandra/lib/metrics-jvm-3.1.5.jar:/opt/cassandra/lib/
metrics-logback-3.1.5.jar:/opt/cassandra/lib/mxdump-0.14.jar:/opt/cassandra/lib/netty-all-4.1.58.Final.jar:/opt/cassandra/lib/netty-tcnative-boringssl-static-2.0.36.Final.jar:/opt/cassandra/lib/ohc-core-0.5.1.jar:/opt/cassandra/lib/ohc-co
re-j8-0.5.1.jar:/opt/cassandra/lib/psjava-0.1.19.jar:/opt/cassandra/lib/reporter-config-base-3.0.3.jar:/opt/cassandra/lib/reporter-config3-3.0.3.jar:/opt/cassandra/lib/sigar-1.6.4.jar:/opt/cassandra/lib/sjk-cli-0.14.jar:/opt/cassandra/lib
/sjk-core-0.14.jar:/opt/cassandra/lib/sjk-json-0.14.jar:/opt/cassandra/lib/sjk-stacktrace-0.14.jar:/opt/cassandra/lib/slf4j-api-1.7.25.jar:/opt/cassandra/lib/snakeyaml-1.26.jar:/opt/cassandra/lib/snappy-java-1.1.8.4.jar:/opt/cassandra/lib
/snowball-stemmer-1.3.0.581.1.jar:/opt/cassandra/lib/stream-2.5.2.jar:/opt/cassandra/lib/zstd-jni-1.5.5-1.jar:/opt/cassandra/lib/jsr223/*/*.jar:
INFO  [main] 2023-11-03 14:16:50,089 CassandraDaemon.java:640 - JVM Arguments: [-Xms1000000000, -Xmx1000000000, -ea, -da:net.openhft..., -XX:+UseThreadPriorities, -XX:+HeapDumpOnOutOfMemoryError, -Xss256k, -XX:+AlwaysPreTouch, -XX:-UseBia
sedLocking, -XX:+UseTLAB, -XX:+ResizeTLAB, -XX:+UseNUMA, -XX:+PerfDisableSharedMem, -Djava.net.preferIPv4Stack=true, -Djdk.attach.allowAttachSelf=true, --add-exports=java.base/jdk.internal.misc=ALL-UNNAMED, --add-exports=java.base/jdk.int
ernal.ref=ALL-UNNAMED, --add-exports=java.base/sun.nio.ch=ALL-UNNAMED, --add-exports=java.management.rmi/com.sun.jmx.remote.internal.rmi=ALL-UNNAMED, --add-exports=java.rmi/sun.rmi.registry=ALL-UNNAMED, --add-exports=java.rmi/sun.rmi.serv
er=ALL-UNNAMED, --add-exports=java.sql/java.sql=ALL-UNNAMED, --add-opens=java.base/java.lang.module=ALL-UNNAMED, --add-opens=java.base/jdk.internal.loader=ALL-UNNAMED, --add-opens=java.base/jdk.internal.ref=ALL-UNNAMED, --add-opens=java.b
ase/jdk.internal.reflect=ALL-UNNAMED, --add-opens=java.base/jdk.internal.math=ALL-UNNAMED, --add-opens=java.base/jdk.internal.module=ALL-UNNAMED, --add-opens=java.base/jdk.internal.util.jar=ALL-UNNAMED, --add-opens=jdk.management/com.sun.
management.internal=ALL-UNNAMED, -Dio.netty.tryReflectionSetAccessible=true, -XX:+UseG1GC, -XX:+ParallelRefProcEnabled, -XX:MaxTenuringThreshold=1, -XX:G1HeapRegionSize=16m, -XX:G1RSetUpdatingPauseTimePercent=5, -XX:MaxGCPauseMillis=300,
-XX:InitiatingHeapOccupancyPercent=70, -Xlog:gc=info,heap*=trace,age*=debug,safepoint=info,promotion*=trace:file=/opt/cassandra/logs/gc.log:time,uptime,pid,tid,level:filecount=10,filesize=10485760, -XX:CompileCommandFile=/opt/cassandra/co
nf/hotspot_compiler, -javaagent:/opt/cassandra/lib/jamm-0.3.2.jar, -Dcassandra.jmx.local.port=7199, -Dcom.sun.management.jmxremote.authenticate=false, -Dcom.sun.management.jmxremote.password.file=/etc/cassandra/jmxremote.password, -Djava.
library.path=/opt/cassandra/lib/sigar-bin, -Dcassandra.allow_alter_rf_during_range_movement=true, -Dcassandra.system_distributed_replication=kubernetes-1:3, -Dcom.sun.management.jmxremote.authenticate=false, -javaagent:/opt/management-api
/datastax-mgmtapi-agent.jar, -Dcassandra.libjemalloc=/usr/local/lib/libjemalloc.so, -XX:OnOutOfMemoryError=kill -9 %p, -Dlogback.configurationFile=logback.xml, -Dcassandra.logdir=/opt/cassandra/logs, -Dcassandra.storagedir=/opt/cassandra/
data, -Dcassandra.server_process, -Dcassandra.skip_default_role_setup=true, -Ddb.unix_socket_file=/tmp/cassandra.sock]
WARN  [main] 2023-11-03 14:16:50,166 NativeLibrary.java:199 - Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out, especially with mmapped I/O enabled. Increase RLIMIT_MEMLOCK.

...

INFO  [main] 2023-11-03 14:16:53,988 ColumnFamilyStore.java:484 - Initializing system.transferred_ranges_v2
INFO  [main] 2023-11-03 14:16:53,993 ColumnFamilyStore.java:484 - Initializing system.transferred_ranges
INFO  [main] 2023-11-03 14:16:53,997 ColumnFamilyStore.java:484 - Initializing system.view_builds_in_progress
INFO  [main] 2023-11-03 14:16:54,001 ColumnFamilyStore.java:484 - Initializing system.built_views
INFO  [main] 2023-11-03 14:16:54,067 ColumnFamilyStore.java:484 - Initializing system.prepared_statements
INFO  [main] 2023-11-03 14:16:54,072 ColumnFamilyStore.java:484 - Initializing system.repairs
INFO  [main] 2023-11-03 14:16:54,077 ColumnFamilyStore.java:484 - Initializing system.top_partitions
INFO  [main] 2023-11-03 14:16:54,186 QueryProcessor.java:129 - Initialized prepared statement caches with 10 MiB
2023-11-03T14:17:03.472034Z  WARN source{component_kind="source" component_id=cassandra_metrics_raw component_type=prometheus_scrape component_name=cassandra_metrics_raw}:http: vector::internal_events::http_client: HTTP error. error=error
 trying to connect: tcp connect error: Cannot assign requested address (os error 99) error_type="request_failed" stage="processing" internal_log_rate_limit=true
2023-11-03T14:17:03.472076Z ERROR source{component_kind="source" component_id=cassandra_metrics_raw component_type=prometheus_scrape component_name=cassandra_metrics_raw}: vector::internal_events::http_client_source: HTTP request processi
ng error. url=http://localhost:9000/metrics error=CallRequest { source: hyper::Error(Connect, Custom { kind: Other, error: ConnectError("tcp connect error", Os { code: 99, kind: AddrNotAvailable, message: "Cannot assign requested address"
 }) }) } error_type="request_failed" stage="receiving" internal_log_rate_limit=true
ERROR [main] 2023-11-03 14:16:54,566 CassandraDaemon.java:897 - Fatal exception during initialization
org.apache.cassandra.exceptions.ConfigurationException: Saved cluster name jaschwege-cluster != configured name Test Cluster
    at org.apache.cassandra.db.SystemKeyspace.checkHealth(SystemKeyspace.java:1100)
    at org.apache.cassandra.service.StartupChecks$13.execute(StartupChecks.java:630)
    at org.apache.cassandra.service.StartupChecks.verify(StartupChecks.java:174)
    at org.apache.cassandra.service.CassandraDaemon.runStartupChecks(CassandraDaemon.java:502)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:256)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:751)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:875)
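
The two names involved in the mismatch can be pulled out of a log like the one above with a few lines of Python. This is only a minimal sketch, assuming the exception message keeps the format shown in the trace; the function name `find_cluster_name_mismatch` is hypothetical and not part of any K8ssandra tooling.

```python
import re
from typing import Optional, Tuple

# Matches the ConfigurationException message format seen in the trace above
# (assumption: other Cassandra versions may word this message differently).
MISMATCH_RE = re.compile(r"Saved cluster name (.+?) != configured name (.+?)\s*$")


def find_cluster_name_mismatch(log_text: str) -> Optional[Tuple[str, str]]:
    """Return (saved_name, configured_name) from the first mismatch line, or None."""
    for line in log_text.splitlines():
        match = MISMATCH_RE.search(line)
        if match:
            return match.group(1), match.group(2)
    return None


if __name__ == "__main__":
    sample = (
        "ERROR [main] 2023-11-03 14:16:54,566 CassandraDaemon.java:897 - "
        "Fatal exception during initialization\n"
        "org.apache.cassandra.exceptions.ConfigurationException: "
        "Saved cluster name jaschwege-cluster != configured name Test Cluster\n"
    )
    print(find_cluster_name_mismatch(sample))
```

Here the saved name matches `clusterName` from the manifest, while `Test Cluster` is the default `cluster_name` in a stock `cassandra.yaml`. That suggests the configured cluster name did not reach the generated configuration.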

Anything else we need to know?:

A downgrade to 3.11.16 after manually modifying the StatefulSet is also not possible. No logs are printed.

ClusterJan commented 11 months ago

As an additional note: after Alexander told me that the way the configuration is created changed with Cassandra 4.1+, I tried upgrading my cluster to 4.0.10 first (cass-management-api is not available for 4.0.11) before upgrading to 4.1.2.

The upgrade to 4.0.10 worked, but the subsequent upgrade to 4.1.2 fails with the same error as described above.

dnugmanov commented 11 months ago

This relates to https://github.com/k8ssandra/k8ssandra-client/issues/17 and can be fixed by https://github.com/k8ssandra/k8ssandra-client/pull/18

adejanovski commented 10 months ago

This is fixed in k8ssandra-operator v1.10.3, which was released a few minutes ago.