canonical / cassandra-k8s-operator

Apache License 2.0

OutOfMemoryError crash-loop is misleadingly displayed as "Waiting for Database" #55

Open sed-i opened 2 years ago

Describe the bug
If you deploy the charm with an insufficient heap size, the charm is in a crash loop, but `juju status` shows "Waiting for Database".

To Reproduce
Deploy cassandra with `--config heap_size=10m`.

Expected behavior
Status goes into Blocked with "insufficient memory".
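One way the expected behavior could be derived is by scanning recent workload log lines for the `java.lang.OutOfMemoryError` marker that the JVM prints before it is killed. The following is only a minimal sketch: `classify_status` is a hypothetical helper, not part of the charm, and it returns plain tuples rather than real Juju status objects.

```python
from typing import Iterable, Tuple

# Marker the JVM prints when the heap is exhausted (it appears in the logs of this report).
OOM_MARKER = "java.lang.OutOfMemoryError"


def classify_status(log_lines: Iterable[str]) -> Tuple[str, str]:
    """Hypothetical helper: map workload log lines to a (status, message) pair.

    Returns ("blocked", "insufficient memory") as soon as an OOM marker is
    seen; otherwise falls back to the status reported today.
    """
    for line in log_lines:
        if OOM_MARKER in line:
            return ("blocked", "insufficient memory")
    return ("waiting", "Waiting for Database")


if __name__ == "__main__":
    sample = [
        "[cassandra] INFO [main] ColumnFamilyStore.java:432 - Initializing system.IndexInfo",
        "[cassandra] java.lang.OutOfMemoryError: Java heap space",
    ]
    print(classify_status(sample))
```

How the charm would obtain those log lines, and how often it would re-check them, is left open here.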

Logs
```
2022-03-24T13:27:15.620Z [cassandra] INFO [main] 2022-03-24 13:27:15,620 DatabaseDescriptor.java:381 - DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
2022-03-24T13:27:15.621Z [cassandra] INFO [main] 2022-03-24 13:27:15,621 DatabaseDescriptor.java:439 - Global memtable on-heap threshold is enabled at 2MB
2022-03-24T13:27:15.622Z [cassandra] INFO [main] 2022-03-24 13:27:15,621 DatabaseDescriptor.java:443 - Global memtable off-heap threshold is enabled at 2MB
2022-03-24T13:27:15.951Z [cassandra] WARN [main] 2022-03-24 13:27:15,950 DatabaseDescriptor.java:579 - Only 33.123GiB free across all data volumes. Consider adding more capacity to your cluster or removing obsolete snapshots
2022-03-24T13:27:16.010Z [cassandra] INFO [main] 2022-03-24 13:27:16,009 RateBasedBackPressure.java:123 - Initialized back-pressure with high ratio: 0.9, factor: 5, flow: FAST, window size: 2000.
2022-03-24T13:27:16.010Z [cassandra] INFO [main] 2022-03-24 13:27:16,010 DatabaseDescriptor.java:781 - Back-pressure is disabled with strategy null.
2022-03-24T13:27:16.353Z [cassandra] INFO [main] 2022-03-24 13:27:16,352 GossipingPropertyFileSnitch.java:64 - Loaded cassandra-topology.properties for compatibility
2022-03-24T13:27:16.455Z [cassandra] INFO [main] 2022-03-24 13:27:16,455 JMXServerUtils.java:253 - Configured JMX server at: service:jmx:rmi://127.0.0.1/jndi/rmi://127.0.0.1:7199/jmxrmi
2022-03-24T13:27:16.463Z [cassandra] INFO [main] 2022-03-24 13:27:16,462 CassandraDaemon.java:490 - Hostname: cassandra-0.cassandra-endpoints.test-rerelate-alertmanager-dispatch-d6xh.svc.cluster.local
2022-03-24T13:27:16.464Z [cassandra] INFO [main] 2022-03-24 13:27:16,463 CassandraDaemon.java:497 - JVM vendor/version: OpenJDK 64-Bit Server VM/1.8.0_322
2022-03-24T13:27:16.465Z [cassandra] INFO [main] 2022-03-24 13:27:16,464 CassandraDaemon.java:498 - Heap size: 9.063MiB/9.063MiB
2022-03-24T13:27:16.466Z [cassandra] INFO [main] 2022-03-24 13:27:16,465 CassandraDaemon.java:503 - Code Cache Non-heap memory: init = 2555904(2496K) used = 4285248(4184K) committed = 4325376(4224K) max = 251658240(245760K)
2022-03-24T13:27:16.466Z [cassandra] INFO [main] 2022-03-24 13:27:16,466 CassandraDaemon.java:503 - Metaspace Non-heap memory: init = 0(0K) used = 21122192(20627K) committed = 21626880(21120K) max = -1(-1K)
2022-03-24T13:27:16.467Z [cassandra] INFO [main] 2022-03-24 13:27:16,467 CassandraDaemon.java:503 - Compressed Class Space Non-heap memory: init = 0(0K) used = 2569904(2509K) committed = 2752512(2688K) max = 1073741824(1048576K)
2022-03-24T13:27:16.467Z [cassandra] INFO [main] 2022-03-24 13:27:16,467 CassandraDaemon.java:503 - Par Eden Space Heap memory: init = 8454144(8256K) used = 8454144(8256K) committed = 8454144(8256K) max = 8454144(8256K)
2022-03-24T13:27:16.468Z [cassandra] INFO [main] 2022-03-24 13:27:16,468 CassandraDaemon.java:503 - Par Survivor Space Heap memory: init = 983040(960K) used = 983040(960K) committed = 983040(960K) max = 983040(960K)
2022-03-24T13:27:16.468Z [cassandra] INFO [main] 2022-03-24 13:27:16,468 CassandraDaemon.java:503 - CMS Old Gen Heap memory: init = 65536(64K) used = 65496(63K) committed = 65536(64K) max = 65536(64K)
2022-03-24T13:27:16.469Z [cassandra] INFO [main] 2022-03-24 13:27:16,468 CassandraDaemon.java:505 - Classpath: /etc/cassandra:/opt/cassandra/build/classes/main:/opt/cassandra/build/classes/thrift:/opt/cassandra/lib/HdrHistogram-2.1.9.jar:/opt/cassandra/lib/ST4-4.0.8.jar:/opt/cassandra/lib/airline-0.6.jar:/opt/cassandra/lib/antlr-runtime-3.5.2.jar:/opt/cassandra/lib/apache-cassandra-3.11.12.jar:/opt/cassandra/lib/apache-cassandra-thrift-3.11.12.jar:/opt/cassandra/lib/asm-5.0.4.jar:/opt/cassandra/lib/caffeine-2.2.6.jar:/opt/cassandra/lib/cassandra-driver-core-3.0.1-shaded.jar:/opt/cassandra/lib/commons-cli-1.1.jar:/opt/cassandra/lib/commons-codec-1.9.jar:/opt/cassandra/lib/commons-lang3-3.1.jar:/opt/cassandra/lib/commons-math3-3.2.jar:/opt/cassandra/lib/compress-lzf-0.8.4.jar:/opt/cassandra/lib/concurrent-trees-2.4.0.jar:/opt/cassandra/lib/concurrentlinkedhashmap-lru-1.4.jar:/opt/cassandra/lib/disruptor-3.0.1.jar:/opt/cassandra/lib/ecj-4.4.2.jar:/opt/cassandra/lib/guava-18.0.jar:/opt/cassandra/lib/high-scale-lib-1.0.6.jar:/opt/cassandra/lib/hppc-0.5.4.jar:/opt/cassandra/lib/jackson-annotations-2.12.5.jar:/opt/cassandra/lib/jackson-core-2.12.5.jar:/opt/cassandra/lib/jackson-databind-2.12.5.jar:/opt/cassandra/lib/jamm-0.3.0.jar:/opt/cassandra/lib/javax.inject-1.jar:/opt/cassandra/lib/jbcrypt-0.4.jar:/opt/cassandra/lib/jcl-over-slf4j-1.7.7.jar:/opt/cassandra/lib/jctools-core-1.2.1.jar:/opt/cassandra/lib/jflex-1.6.0.jar:/opt/cassandra/lib/jna-4.2.2.jar:/opt/cassandra/lib/joda-time-2.4.jar:/opt/cassandra/lib/json-simple-1.1.jar:/opt/cassandra/lib/libthrift-0.9.2.jar:/opt/cassandra/lib/log4j-over-slf4j-1.7.7.jar:/opt/cassandra/lib/logback-classic-1.2.9.jar:/opt/cassandra/lib/logback-core-1.2.9.jar:/opt/cassandra/lib/lz4-1.3.0.jar:/opt/cassandra/lib/metrics-core-3.1.5.jar:/opt/cassandra/lib/metrics-jvm-3.1.5.jar:/opt/cassandra/lib/metrics-logback-3.1.5.jar:/opt/cassandra/lib/netty-all-4.0.44.Final.jar:/opt/cassandra/lib/ohc-core-0.4.4.jar:/opt/cassandra/lib/ohc-core-j8-0.4.4.jar:/opt/cassandra/lib/reporter-config-base-3.0.3.jar:/opt/cassandra/lib/reporter-config3-3.0.3.jar:/opt/cassandra/lib/sigar-1.6.4.jar:/opt/cassandra/lib/slf4j-api-1.7.7.jar:/opt/cassandra/lib/snakeyaml-1.26.jar:/opt/cassandra/lib/snappy-java-1.1.1.7.jar:/opt/cassandra/lib/snowball-stemmer-1.3.0.581.1.jar:/opt/cassandra/lib/stream-2.5.2.jar:/opt/cassandra/lib/thrift-server-0.3.7.jar:/opt/cassandra/lib/jsr223/*/*.jar::/opt/cassandra/lib/jamm-0.3.0.jar
2022-03-24T13:27:16.469Z [cassandra] INFO [main] 2022-03-24 13:27:16,469 CassandraDaemon.java:507 - JVM Arguments: [-Xms10m, -Xmx10m, -Xloggc:/opt/cassandra/logs/gc.log, -ea, -XX:+UseThreadPriorities, -XX:ThreadPriorityPolicy=42, -XX:+HeapDumpOnOutOfMemoryError, -Xss256k, -XX:StringTableSize=1000003, -XX:+AlwaysPreTouch, -XX:-UseBiasedLocking, -XX:+UseTLAB, -XX:+ResizeTLAB, -XX:+UseNUMA, -XX:+PerfDisableSharedMem, -Djava.net.preferIPv4Stack=true, -XX:+UseParNewGC, -XX:+UseConcMarkSweepGC, -XX:+CMSParallelRemarkEnabled, -XX:SurvivorRatio=8, -XX:MaxTenuringThreshold=1, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:CMSWaitDuration=10000, -XX:+CMSParallelInitialMarkEnabled, -XX:+CMSEdenChunksRecordAlways, -XX:+CMSClassUnloadingEnabled, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintHeapAtGC, -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, -XX:+PrintPromotionFailure, -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=10, -XX:GCLogFileSize=10M, -Xmn400M, -XX:+UseCondCardMark, -XX:CompileCommandFile=/etc/cassandra/hotspot_compiler, -javaagent:/opt/cassandra/lib/jamm-0.3.0.jar, -Dcassandra.jmx.local.port=7199, -Dcom.sun.management.jmxremote.authenticate=false, -Dcom.sun.management.jmxremote.password.file=/etc/cassandra/jmxremote.password, -Djava.library.path=/opt/cassandra/lib/sigar-bin, -Dcassandra.libjemalloc=/usr/local/lib/libjemalloc.so, -XX:OnOutOfMemoryError=kill -9 %p, -Dlogback.configurationFile=logback.xml, -Dcassandra.logdir=/opt/cassandra/logs, -Dcassandra.storagedir=/opt/cassandra/data, -Dcassandra-foreground=yes]
2022-03-24T13:27:16.573Z [cassandra] WARN [main] 2022-03-24 13:27:16,573 NativeLibrary.java:189 - Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out, especially with mmapped I/O enabled. Increase RLIMIT_MEMLOCK or run Cassandra as root.
2022-03-24T13:27:16.573Z [cassandra] INFO [main] 2022-03-24 13:27:16,573 StartupChecks.java:140 - jemalloc seems to be preloaded from /usr/local/lib/libjemalloc.so
2022-03-24T13:27:16.574Z [cassandra] WARN [main] 2022-03-24 13:27:16,574 StartupChecks.java:169 - JMX is not enabled to receive remote connections. Please see cassandra-env.sh for more info.
2022-03-24T13:27:16.576Z [cassandra] INFO [main] 2022-03-24 13:27:16,576 SigarLibrary.java:44 - Initializing SIGAR library
2022-03-24T13:27:16.585Z [cassandra] INFO [main] 2022-03-24 13:27:16,585 SigarLibrary.java:53 - Could not initialize SIGAR library No such file or directory
2022-03-24T13:27:16.585Z [cassandra] INFO [main] 2022-03-24 13:27:16,585 SigarLibrary.java:185 - Sigar could not be initialized, test for checking degraded mode omitted.
2022-03-24T13:27:16.586Z [cassandra] WARN [main] 2022-03-24 13:27:16,586 StartupChecks.java:311 - Maximum number of memory map areas per process (vm.max_map_count) 65530 is too low, recommended value: 1048575, you can change it with sysctl.
2022-03-24T13:27:16.757Z [cassandra] INFO [main] 2022-03-24 13:27:16,757 QueryProcessor.java:121 - Initialized prepared statement caches with 10 MB (native) and 10 MB (Thrift)
2022-03-24T13:27:17.875Z [cassandra] INFO [main] 2022-03-24 13:27:17,873 ColumnFamilyStore.java:432 - Initializing system.IndexInfo
2022-03-24T13:27:17.943Z [pebble] 10.128.0.2:52398 GET /v1/health?level=ready 77.408µs 200
2022-03-24T13:27:17.943Z [pebble] 10.128.0.2:52392 GET /v1/health?level=alive 50.8µs 200
2022-03-24T13:27:19.673Z [cassandra] java.lang.OutOfMemoryError: Java heap space
2022-03-24T13:27:19.673Z [cassandra] Dumping heap to java_pid2001.hprof ...
2022-03-24T13:27:19.673Z [cassandra] Unable to create java_pid2001.hprof: Permission denied
2022-03-24T13:27:19.741Z [cassandra] #
2022-03-24T13:27:19.741Z [cassandra] # java.lang.OutOfMemoryError: Java heap space
2022-03-24T13:27:19.741Z [cassandra] # -XX:OnOutOfMemoryError="kill -9 %p"
2022-03-24T13:27:19.741Z [cassandra] # Executing /bin/sh -c "kill -9 2001"...
```
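The `JVM Arguments` line in the log above shows `-Xms10m` and `-Xmx10m` alongside `-Xmn400M`, so the requested young generation is larger than the entire heap. A check along the following lines might catch such a configuration before the workload starts. This is a sketch under assumptions: `parse_heap_flags` and `young_gen_exceeds_heap` are hypothetical names, and whether the charm should perform this check at all is not established in this report.

```python
import re
from typing import Dict, List

# Multipliers for the size suffixes accepted by -Xms/-Xmx/-Xmn (no suffix means bytes).
_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}
_SIZE_FLAG = re.compile(r"^-(Xms|Xmx|Xmn)(\d+)([kKmMgG]?)$")


def parse_heap_flags(jvm_args: List[str]) -> Dict[str, int]:
    """Hypothetical helper: extract -Xms/-Xmx/-Xmn sizes in bytes from JVM args."""
    sizes: Dict[str, int] = {}
    for arg in jvm_args:
        match = _SIZE_FLAG.match(arg)
        if match:
            name, number, unit = match.groups()
            sizes[name] = int(number) * _UNITS[unit.lower()]
    return sizes


def young_gen_exceeds_heap(jvm_args: List[str]) -> bool:
    """True when both -Xmn and -Xmx are present and -Xmn is the larger one."""
    sizes = parse_heap_flags(jvm_args)
    return "Xmx" in sizes and "Xmn" in sizes and sizes["Xmn"] > sizes["Xmx"]


if __name__ == "__main__":
    # Subset of the arguments taken from the log above.
    args = ["-Xms10m", "-Xmx10m", "-Xss256k", "-Xmn400M"]
    print(parse_heap_flags(args))
    print(young_gen_exceeds_heap(args))
```

Flags other than the three heap sizes, such as `-Xss256k`, are ignored by the pattern.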