k8ssandra / k8ssandra-operator

The Kubernetes operator for K8ssandra
https://k8ssandra.io/
Apache License 2.0
NoSpamLogger.java - Maximum memory usage reached cannot allocate chunk #1211

Open KEDLogic opened 9 months ago

KEDLogic commented 9 months ago

We have a K8ssandra cluster in production with 3 racks of 2 nodes each, running on AWS EKS with m5.2xlarge instances.

K8ssandra is given 30 GB.

The following message is spamming the logs:

NoSpamLogger.java - Maximum memory usage reached cannot allocate chunk

We increased file_cache_size_in_mb from 1024 to 2048.

The volume of these log messages dropped, but they still appear.
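
For reference, the change described above touches only one key of the cassandraYaml block listed under Environment below. This is a minimal sketch of that fragment: the values are copied from the issue, and the enclosing keys of the cluster manifest are not shown in the issue, so they are omitted here rather than guessed.

```yaml
# Fragment of the cassandraYaml block posted under Environment.
# Only file_cache_size_in_mb was changed; everything else is as posted.
cassandraYaml:
  file_cache_size_in_mb: 2048         # raised from 1024; the message still appears
  memtable_offheap_space_in_mb: 2048  # memtable off-heap limit, unchanged
```

Both values are in megabytes, so the two settings together allow up to 4096 MB outside the JVM heap, out of the 30 GB mentioned above.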

Environment

AWS EKS v1.25

      cassandraYaml:
        ssl_storage_port: 7001
        storage_port: 7000
        batchlog_replay_throttle_in_kb: 1024
        commit_failure_policy: stop
        unlogged_batch_across_partitions_warn_threshold: 10
        commitlog_segment_size_in_mb: 32
        start_rpc: true
        credentials_validity_in_ms: 2000
        client_encryption_options:
          enabled: false
        concurrent_materialized_view_writes: 32
        inter_dc_tcp_nodelay: false
        column_index_cache_size_in_kb: 2
        streaming_socket_timeout_in_ms: 72000000
        rpc_server_type: sync
        row_cache_save_period: 0
        disk_failure_policy: stop
        native_transport_port: 9042
        enable_user_defined_functions_threads: true
        server_encryption_options:
          internode_encryption: none
        dynamic_snitch_reset_interval_in_ms: 600000
        compaction_throughput_mb_per_sec: 64
        role_manager: org.apache.cassandra.auth.CassandraRoleManager
        column_index_size_in_kb: 64
        batch_size_warn_threshold_in_kb: 64
        windows_timer_interval: 1
        compaction_large_partition_warning_threshold_mb: 100
        rpc_keepalive: true
        commitlog_total_space_in_mb: 25600
        memtable_heap_space_in_mb: 2048
        batch_size_fail_threshold_in_kb: 640
        snapshot_before_compaction: false
        tracetype_query_ttl: 86400
        concurrent_reads: 32
        key_cache_save_period: 14400
        row_cache_size_in_mb: 0
        tracetype_repair_ttl: 604800
        enable_materialized_views: true
        tombstone_warn_threshold: 1000
        rpc_address: 0.0.0.0
        concurrent_writes: 32
        commitlog_sync: periodic
        counter_cache_save_period: 7200
        file_cache_size_in_mb: 2048
        back_pressure_enabled: false
        enable_sasi_indexes: true
        slow_query_log_timeout_in_ms: 500
        trickle_fsync: true
        write_request_timeout_in_ms: 2000
        incremental_backups: true
        truncate_request_timeout_in_ms: 60000
        enable_scripted_user_defined_functions: false
        read_request_timeout_in_ms: 5000
        request_timeout_in_ms: 10000
        start_native_transport: true
        memtable_allocation_type: offheap_objects
        transparent_data_encryption_options:
          enabled: false
          chunk_length_kb: 64
          cipher: AES/CBC/PKCS5Padding
          key_alias: testing:1
        memtable_offheap_space_in_mb: 2048
        internode_compression: dc
        max_hints_delivery_threads: 2
        cross_node_timeout: false
        partitioner: org.apache.cassandra.dht.Murmur3Partitioner
        tombstone_failure_threshold: 100000
        hinted_handoff_enabled: true
        hints_flush_period_in_ms: 10000
        enable_user_defined_functions: false
        hinted_handoff_throttle_in_kb: 1024
        max_hint_window_in_ms: 10800000
        auto_snapshot: true
        index_summary_resize_interval_in_minutes: 60
        range_request_timeout_in_ms: 10000
        stream_throughput_outbound_megabits_per_sec: 200
        sstable_preemptive_open_interval_in_mb: 50
        dynamic_snitch_update_interval_in_ms: 100
        trickle_fsync_interval_in_kb: 10240
        commitlog_sync_period_in_ms: 10000
        cdc_enabled: true
        max_hints_file_size_in_mb: 128
        counter_write_request_timeout_in_ms: 5000
        concurrent_counter_writes: 32
        dynamic_snitch_badness_threshold: 0.1
        permissions_validity_in_ms: 2000
        roles_validity_in_ms: 2000
        rpc_port: 9160
        cas_contention_timeout_in_ms: 1000
        thrift_framed_transport_size_in_mb: 15
        gc_warn_threshold_in_ms: 1000
        request_scheduler: org.apache.cassandra.scheduler.NoScheduler

Issue is synchronized with this Jira Story by Unito. Issue Number: K8OP-44

adejanovski commented 9 months ago

Hi, this is a common warning message in Cassandra and is not a K8ssandra-related issue. I'd recommend looking for previous answers to the same question on the Cassandra mailing list or in Jira for leads on how to configure it.