Open yitzhtal opened 8 months ago
Try the latest version 1.0.2
I upgraded to 1.0.2 and used a node selector to schedule onto more stable nodes (not spot instances). It works now. I'll see whether it stays stable and update here.
Close the issue if it's sorted.
I still can't seem to make Jaeger stable. I got these errors:
ERROR [main] 2024-04-11 08:29:47,486 CassandraDaemon.java:774 - Exception encountered during startup
java.lang.RuntimeException: A node required to move the data consistently is down (/10.50.13.161). If you wish to move the data from a potentially inconsistent replica, restart the node with -Dcassandra.consistent.rangemovement=false
at org.apache.cassandra.dht.RangeStreamer.getAllRangesWithStrictSourcesFor(RangeStreamer.java:294) ~[apache-cassandra-3.11.6.jar:3.11.6]
at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:177) ~[apache-cassandra-3.11.6.jar:3.11.6]
at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:87) ~[apache-cassandra-3.11.6.jar:3.11.6]
at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1530) ~[apache-cassandra-3.11.6.jar:3.11.6]
at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1024) ~[apache-cassandra-3.11.6.jar:3.11.6]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:718) ~[apache-cassandra-3.11.6.jar:3.11.6]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:652) ~[apache-cassandra-3.11.6.jar:3.11.6]
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:397) [apache-cassandra-3.11.6.jar:3.11.6]
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:630) [apache-cassandra-3.11.6.jar:3.11.6]
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:757) [apache-cassandra-3.11.6.jar:3.11.6]
INFO [StorageServiceShutdownHook] 2024-04-11 08:29:47,488 HintsService.java:209 - Paused hints dispatch
WARN [StorageServiceShutdownHook] 2024-04-11 08:29:47,488 Gossiper.java:1655 - No local state, state is in silent shutdown, or node hasn't joined, not announcing shutdown
INFO [StorageServiceShutdownHook] 2024-04-11 08:29:47,488 MessagingService.java:985 - Waiting for messaging service to quiesce
INFO [ACCEPT-/10.50.10.10] 2024-04-11 08:29:47,489 MessagingService.java:1346 - MessagingService has terminated the accept() thread
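For anyone triaging the same failure, it may help to pull the address of the node Cassandra reports as down out of the log, so it can be compared against the current pod IPs (for example, from `kubectl get pods -o wide`). Below is a minimal sketch in Python. The helper name `find_down_nodes` is hypothetical, and the pattern assumes the message wording shown in the log above.

```python
import re
from typing import List

# Assumption: the error text matches the Cassandra 3.11.6 message quoted above.
DOWN_NODE = re.compile(
    r"A node required to move the data consistently is down \(/([0-9a-fA-F.:]+)\)"
)

def find_down_nodes(log_text: str) -> List[str]:
    """Return the unique addresses reported as down, in order of first appearance."""
    seen: List[str] = []
    for match in DOWN_NODE.finditer(log_text):
        addr = match.group(1)
        if addr not in seen:
            seen.append(addr)
    return seen

sample = (
    "java.lang.RuntimeException: A node required to move the data consistently "
    "is down (/10.50.13.161). If you wish to move the data from a potentially "
    "inconsistent replica, restart the node with "
    "-Dcassandra.consistent.rangemovement=false"
)
print(find_down_nodes(sample))  # ['10.50.13.161']
```

If the reported address does not belong to any running pod, that is at least a hint about which ring member the bootstrapping node is waiting on.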
Looks similar. I ran into this with the 3.0.10 chart: one of the pods keeps crashing.
jaeger-cassandra-0 1/1 Running 0 13d 10.0.3.24 c21 <none> <none>
jaeger-cassandra-1 0/1 CrashLoopBackOff 6 (2m7s ago) 12m 10.0.10.216 c34 <none> <none>
jaeger-cassandra-2 1/1 Running 0 46d 10.0.0.47 p11 <none> <none>
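To spot the failing replica in a listing like the one above without reading it by eye, a small filter can help. This is a rough sketch in Python, assuming the first three whitespace-separated columns are name, ready count, and status, as in the lines above. The function name `unhealthy_pods` is hypothetical.

```python
from typing import List, Tuple

def unhealthy_pods(listing: str) -> List[Tuple[str, str]]:
    """Return (name, status) for pods that are not Running or not fully ready."""
    result: List[Tuple[str, str]] = []
    for line in listing.splitlines():
        fields = line.split()
        # Skip blank lines and the header row, if one is present.
        if len(fields) < 3 or fields[0] == "NAME":
            continue
        name, ready, status = fields[0], fields[1], fields[2]
        ready_count, _, total = ready.partition("/")
        # Note: this also flags pods in other non-Running states, e.g. Completed.
        if status != "Running" or ready_count != total:
            result.append((name, status))
    return result

listing = """\
jaeger-cassandra-0 1/1 Running 0 13d 10.0.3.24 c21 <none> <none>
jaeger-cassandra-1 0/1 CrashLoopBackOff 6 (2m7s ago) 12m 10.0.10.216 c34 <none> <none>
jaeger-cassandra-2 1/1 Running 0 46d 10.0.0.47 p11 <none> <none>
"""
print(unhealthy_pods(listing))  # [('jaeger-cassandra-1', 'CrashLoopBackOff')]
```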
What happened?
The Cassandra StatefulSet is not stable: its pods keep crashing.
Steps to reproduce
Expected behavior
Jaeger available with all pods running stably.
Relevant log output
Screenshot
Additional context
Running Jaeger on a dedicated namespace on EKS.
Jaeger backend version
1.53.0
SDK
OpenTelemetry SDK.
Pipeline
No response
Storage backend
Cassandra
Operating system
Linux
Deployment model
Kubernetes
Deployment configs