kertzi opened this issue 1 year ago
Stargate is crash-looping, most probably because it doesn't support Cassandra 4.1 yet: https://github.com/stargate/stargate/issues/2311
FTR, resource requirements for the Stargate pods can (and should) be set explicitly under .spec.stargate.resources: https://docs.k8ssandra.io/reference/crd/k8ssandra-operator-crds-latest/#stargatespecresources
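As a minimal sketch of what that could look like in a K8ssandraCluster manifest (field names follow the CRD linked above; the sizes are illustrative, not recommendations):

spec:
  stargate:
    size: 1
    resources:            # explicit container resources for the Stargate pods
      requests:
        cpu: 200m
        memory: 512Mi
      limits:
        cpu: "1"
        memory: 1Gi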
Thanks for the fast response! I see, I will test with Cassandra 4.0.
Yes, the crash loop stopped when I downgraded the Cassandra version to 4.0.8. It's still weird that the Java heap sizes are set wrong compared to the definitions in the manifest:
Containers:
integration-cassandra-dc1-default-stargate-deployment:
Container ID: docker://22fa87a8f60b18b775a7c0367dfcaafa230ec1b79eaa8ccf71d12c5cd855eec6
Image: docker.io/stargateio/stargate-4_0:v1.0.67
Image ID: docker-pullable://stargateio/stargate-4_0@sha256:eec28880951dcc45e6925e4ff3d6d0538b124f0607cf77d660bea64ce230b1e3
Ports: 8080/TCP, 8081/TCP, 8082/TCP, 8084/TCP, 8085/TCP, 8090/TCP, 9042/TCP, 8609/TCP, 7000/TCP, 7001/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
State: Running
Started: Tue, 04 Apr 2023 16:24:05 +0300
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Tue, 04 Apr 2023 16:18:26 +0300
Finished: Tue, 04 Apr 2023 16:18:54 +0300
Ready: True
Restart Count: 72
Limits:
cpu: 1
memory: 1Gi
Requests:
cpu: 200m
memory: 512Mi
Liveness: http-get http://:health/checker/liveness delay=30s timeout=10s period=10s #success=1 #failure=5
Readiness: http-get http://:health/checker/readiness delay=30s timeout=10s period=10s #success=1 #failure=5
Environment:
LISTEN: (v1:status.podIP)
JAVA_OPTS: -XX:+CrashOnOutOfMemoryError -Xms268435456 -Xmx268435456 -Dstargate.unsafe.cassandra_config_path=/config/cassandra.yaml -Dstargate.cql.config_path=/config/stargate-cql.yaml
CLUSTER_NAME: integration-cassandra
CLUSTER_VERSION: 4.0
SEED: integration-cassandra-seed-service.default.svc
DATACENTER_NAME: dc1
RACK_NAME: default
DISABLE_BUNDLES_WATCH: true
ENABLE_AUTH: true
Or am I missing something?
We don't change the memory requirements for the container based on the heap size. I guess that would be a nice improvement, since we can predict the maximum memory the container should be allowed to use based on the heap size. But just to be clear: the heap size setting is used to configure the Cassandra/Stargate JVM process, while resource requirements are set at the container definition level. You can override each of these settings so that they match each other (plan for a bit more than twice the heap size in the memory limit).
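For instance, assuming the heapSize field from the Stargate spec in the CRD linked above, a sketch of setting both together so they stay consistent (illustrative values only):

spec:
  stargate:
    size: 1
    heapSize: 1Gi          # drives the JVM -Xms/-Xmx for the Stargate container
    resources:
      requests:
        cpu: 200m
        memory: 2560Mi
      limits:
        cpu: "1"
        memory: 2560Mi     # a bit more than twice the heap, per the guidance above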
What happened? I'm trying to deploy K8ssandra to EKS. Stargate ends up in a CrashLoopBackOff loop with error 137 (OOM kill).
I think the reason for this is that the container limits are not matched to the Java heap size settings. Currently the container memory is limited to 1Gi while the Java heap is set to 2GB.
Did you expect to see something different? Stargate not crashing
How to reproduce it (as minimally and precisely as possible): Here is my K8ssandraCluster YAML; it is almost a direct copy from the examples.
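(The YAML is not reproduced inline here; a representative minimal K8ssandraCluster of that shape, with values assumed from the public examples rather than the actual file, looks roughly like the following. The cluster and datacenter names match the pod names shown in this issue; everything else is illustrative.)

apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: integration-cassandra
spec:
  cassandra:
    serverVersion: "4.1.0"          # 4.1.x triggered the crash loop; 4.0.8 resolved it
    datacenters:
      - metadata:
          name: dc1
        size: 3
        storageConfig:
          cassandraDataVolumeClaimSpec:
            storageClassName: gp2   # assumed EKS default storage class
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 5Gi
        stargate:
          size: 1
          heapSize: 256M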
Here is part of the Pod config from the cluster after the above was deployed.
Environment: AWS EKS with Kubernetes version 1.23
Server Version: version.Info{Major:"1", Minor:"23+", GitVersion:"v1.23.16-eks-48e63af", GitCommit:"e6332a8a3feb9e0fe3db851878f88cb73d49dd7a", GitTreeState:"clean", BuildDate:"2023-01-24T19:18:15Z", GoVersion:"go1.19.5", Compiler:"gc", Platform:"linux/amd64"}