Open maaaaaaav opened 4 years ago
Hello @maaaaaaav, sometimes while pods are in a crash loop, the logs you see can be from just before the crash. The logs that you've added don't look like a failure. Can you please check the logs a couple more times and see if there is anything new?
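In case it helps, the logs from before the crash can be pulled explicitly; a minimal sketch (the pod name is a placeholder for your actual brig pod):

```sh
# Logs from the previous (crashed) container instance, which often contain the
# real failure that the freshly restarted container does not show yet:
kubectl logs <brig-pod-name> --previous

# Alternatively, follow the live logs while the pod restarts:
kubectl logs <brig-pod-name> --follow
```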
I had the same issue, especially when trying to configure my own SMTP server following https://github.com/wireapp/wire-server-deploy/issues/266. Below are the warning and failure messages from kubectl describe pod.
Normal Scheduled 4m22s default-scheduler Successfully assigned production/brig-69969b5bdc-ndn8b to kubenode02
Warning Unhealthy 3m29s (x5 over 4m9s) kubelet, kubenode02 Readiness probe failed: Get http://10.233.65.172:8080/i/status: dial tcp 10.233.65.172:8080: connect: connection refused
Normal Pulling 3m23s (x3 over 4m20s) kubelet, kubenode02 Pulling image "quay.io/wire/brig:latest"
Warning Unhealthy 3m23s (x6 over 4m13s) kubelet, kubenode02 Liveness probe failed: Get http://10.233.65.172:8080/i/status: dial tcp 10.233.65.172:8080: connect: connection refused
Normal Killing 3m23s (x2 over 3m53s) kubelet, kubenode02 Container brig failed liveness probe, will be restarted
Normal Pulled 3m22s (x3 over 4m16s) kubelet, kubenode02 Successfully pulled image "quay.io/wire/brig:latest"
Normal Created 3m22s (x3 over 4m16s) kubelet, kubenode02 Created container brig
Normal Started 3m22s (x3 over 4m15s) kubelet, kubenode02 Started container brig
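For context, a sketch of the command those events came from (the pod name and namespace match the Scheduled event above):

```sh
kubectl describe pod brig-69969b5bdc-ndn8b -n production
```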
@ramesh8830 Do you also see nothing interesting in kubectl logs for the brig pods?
Hi @akshaymankar
There is nothing in the kubectl logs for the brig pods. The brig pods never become Ready and eventually fall into CrashLoopBackOff. I am also getting the same log that @maaaaaaav reported in his post, showing all the Cassandra nodes.
wireadmin@wire-controller:~/wire-server-deploy/ansible$ kubectl logs brig-8674744bc7-ccbtf
{"logger":"cassandra.brig","msgs":["I","Known hosts: [datacenter1:rack1:172.16.32.31:9042,datacenter1:rack1:172.16.32.32:9042,datacenter1:rack1:172.16.32.33:9042]"]}
{"logger":"cassandra.brig","msgs":["I","New control connection: datacenter1:rack1:172.16.32.33:9042#<socket: 11>"]}
Warning Unhealthy 3m23s (x6 over 4m13s) kubelet, kubenode02 Liveness probe failed: Get http://10.233.65.172:8080/i/status: dial tcp 10.233.65.172:8080: connect: connection refused
This indicates that brig is taking some time to come up and K8s is not patient enough for that. Usually brig prints a line like this when it starts listening on the port:
I, Listening on 0.0.0.0:8080
I would make sure that the pod is getting enough CPU/RAM. And if that is the case, I would bump up the logging in brig to Debug or even Trace and see if I find anything in the logs. Hope this helps!
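A rough sketch of what bumping the log level could look like; the config.logLevel key, the release name brig, and the chart location are assumptions, so check your chart's values.yaml for the exact names:

```sh
# Assumption: the brig chart exposes the log level as config.logLevel and the
# release is called "brig"; adjust both to match your deployment.
helm upgrade brig <chart-repo-or-path>/brig --reuse-values --set config.logLevel=Debug

# It can also help to inspect how much patience the liveness probe actually
# grants before the container is restarted:
kubectl get deployment brig -n production \
  -o jsonpath='{.spec.template.spec.containers[0].livenessProbe}'
```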
It has enough CPU/RAM. It happens only when we use a username and password for the SMTP configuration other than the demo credentials. If I use the demo credentials for SMTP, then the brig pods run successfully.
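If brig blocks while connecting to the SMTP server during startup, it may never start listening on port 8080 and therefore fails the liveness probe, which would match the events above. As a quick, hedged sanity check, basic reachability of the SMTP host from inside the cluster can be tested like this (host and port are placeholders for your SMTP settings):

```sh
# One-off pod to check whether the SMTP host/port is reachable from the cluster;
# replace smtp.example.com and 587 with the values configured for brig.
kubectl run smtp-test --rm -it --restart=Never --image=busybox -- \
  telnet smtp.example.com 587
```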
Hi there,
thanks again for all the help and assistance.
Currently trying to deploy wire-server using helm; everything is working fine except that the brig pods keep ending up in CrashLoopBackOff.
When I pull the logs, this is all I get:
Those are the correct IPs for my three Cassandra nodes and they seem to be up fine. I'm using cassandra-external to point brig at them.
Any guidance as to what I should upload to help diagnose this would be much appreciated too.
Thanks!