Closed palladius closed 1 year ago
From Leonid:
The investigation is still in progress. So far I can confirm that the cause of the problem is the failing init container in the cartservice pod. The workaround is to delete the cartservice deployment, ensure that the redis-cart deployment and service are in a ready state, delete the initContainers section from cartservice.yaml (in the kubernetes-manifests/ folder), and re-deploy the cartservice.
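The workaround steps above could be scripted roughly as follows. This is only a sketch: it assumes the default namespace, a checkout of the repo as the working directory, and a hypothetical `run` helper with a `DRY_RUN` switch (neither comes from the original report) that prints the commands by default instead of executing them.

```shell
#!/bin/sh
# Sketch of the workaround; `run`, `steps` and DRY_RUN are illustrative
# helpers, not part of the project. DRY_RUN=1 (default) only prints.
set -e
DRY_RUN="${DRY_RUN:-1}"
steps=0

run() {
  steps=$((steps + 1))
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# 1. Delete the failing cartservice deployment.
run kubectl delete deployment cartservice

# 2. Check that the redis-cart deployment and service are ready.
run kubectl rollout status deployment/redis-cart
run kubectl get service redis-cart

# 3. Manual step before this command: remove the initContainers section
#    from kubernetes-manifests/cartservice.yaml. Then re-deploy.
run kubectl apply -f kubernetes-manifests/cartservice.yaml
```

Running it with `DRY_RUN=0` would execute the commands against whatever cluster the current kubectl context points to.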
The part that needs to be removed is:
      initContainers:
      - command:
        - bin/sh
        - -c
        - until nslookup redis-cart; do echo waiting for redis; sleep 2; done;
        image: busybox
        imagePullPolicy: Always
        name: init-redis-ready
Alex and I noticed that this command exits with the correct status in the main container but not in the init container:
Server: 10.28.0.10
Address: 10.28.0.10:53
Non-authoritative answer:
Name: redis-cart.default.svc.cluster.local
Address: 10.28.2.181
** server can't find redis-cart.svc.cluster.local: NXDOMAIN
** server can't find redis-cart.cluster.local: NXDOMAIN
** server can't find redis-cart.cluster.local: NXDOMAIN
** server can't find redis-cart.svc.cluster.local: NXDOMAIN
** server can't find redis-cart.google.internal: NXDOMAIN
** server can't find redis-cart.google.internal: NXDOMAIN
** server can't find redis-cart.c.cloud-ops-sandbox-2646743255.internal: NXDOMAIN
** server can't find redis-cart.c.cloud-ops-sandbox-2646743255.internal: NXDOMAIN
/app # echo $?
0
It would incorrectly return 1 in the init container (where the shell environment was slightly different, maybe a different version of busybox?). Leonid suggests it might be a bug in busybox, and I agree.
I can confirm this change works:
      initContainers:
      - name: init-redis-ready-riccardo
        # There is a bug in busybox that prevents us from returning 0 when redis is available and multiple addresses are in /etc/resolv.conf :/
        image: busybox
        command: ['bin/sh', '-c', 'until nslookup redis-cart|grep Address: ; do echo Waiting for redis BUG in busybox; sleep 2; done;']
        #command: ['bin/sh', '-c', 'echo OK Ric04 just ok']
      containers:
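To illustrate why piping through grep changes the outcome, here is a small sketch that runs without a cluster. `fake_nslookup` is a hypothetical stand-in for the misbehaving busybox nslookup: it prints a successful answer but exits with status 1, which is the behaviour we saw in the init container.

```shell
#!/bin/sh
# Hypothetical stand-in: resolves the name but still exits non-zero.
fake_nslookup() {
  echo "Name: redis-cart.default.svc.cluster.local"
  echo "Address: 10.28.2.181"
  echo "** server can't find redis-cart.svc.cluster.local: NXDOMAIN"
  return 1
}

# Exit-status check, as in the original init container: looks like a failure.
if fake_nslookup redis-cart >/dev/null; then
  plain_result=ready
else
  plain_result=not-ready
fi

# Output check: the pipeline's status is grep's, so a printed address is enough.
if fake_nslookup redis-cart | grep -q 'Address:'; then
  grep_result=ready
else
  grep_result=not-ready
fi

echo "plain=$plain_result grep=$grep_result"
```

One caveat, judging from the output pasted above: the DNS server's own line also starts with `Address:`, so this pattern may match as soon as the resolver answers at all. A stricter pattern may be worth considering if that matters.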
I'll now also try the 1.28 version, as described here: https://www.linkedin.com/pulse/busybox-nslookup-bug-gary-tay/
YES! The following, pinned to busybox:1.28, also works:
      - name: init-redis-ready-riccardo128
        # There is a bug in busybox that prevents us from returning 0 when redis is available and multiple addresses are in /etc/resolv.conf
        image: busybox:1.28
        #command: ['bin/sh', '-c', 'until nslookup redis-cart|grep Address: ; do echo Waiting for redis BUG in busybox; sleep 2; done;']
        command: ['bin/sh', '-c', 'until nslookup redis-cart ; do echo Waiting for redis BUG in busybox; sleep 2; done;']
Currently the latest version of https://cloud-ops-sandbox.dev/ is broken.
A fresh install fails on the cartservice.
I've done a long investigation with @alml in [1], shared with leoy@.
A quick/cheap fix would be good enough.
[1] https://docs.google.com/document/d/1RTEKaDlP9PwoNfKvpAFxjbQYCZ0o5kA9Pj_V26kqu3Y/edit#