Open hai-nguyen-van opened 3 years ago
The above warning message could be amended in a way to insist on the fact that...
The warning message comes from devspace. Not from our code so not really anything we can do about that.
Also, you should not need to delete the docker image. It should rebuild just fine. You are using the force-rebuild flag -b.
You can skip building zerotier by commenting it out of the images list in devspace.yaml.
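For reference, a sketch of what that could look like, assuming a devspace.yaml with a typical images section (the image names and dockerfile paths here are assumptions, not the repo's actual layout):

```yaml
images:
  tezos-k8s-utils:
    image: tezos-k8s-utils
  # Commented out to skip building zerotier:
  # zerotier:
  #   image: zerotier
```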
What are the events listed when you describe the pods?
Mhhhhh that's really weird. I really had the impression that force rebuild using option -b did not operate as expected. Here's what describe displayed:
Name:           tezos-baking-node-0
Namespace:      tezos-testnet
Priority:       0
Node:           minikube/192.168.49.2
Start Time:     Fri, 23 Jul 2021 15:11:41 +0200
Labels:         app=tezos-baking-node
                appType=tezos-node
                controller-revision-hash=tezos-baking-node-7486d4d774
                statefulset.kubernetes.io/pod-name=tezos-baking-node-0
Annotations:    <none>
Status:         Pending
IP:             172.17.0.4
IPs:
  IP:           172.17.0.4
Controlled By:  StatefulSet/tezos-baking-node
Init Containers:
  wait-for-bootstrap:
    Container ID:  docker://8828facc0f6cfa4a549cb0bf498fdd30341af45f7b25132e963b29d0654792db
    Image:         tezos-k8s-utils:dev
    Image ID:      docker://sha256:2226b4a2a75bd3cae9928a157472570ceb526c642b68d80a6d3ade68c574e16b
    Port:          <none>
    Host Port:     <none>
    Args:
      wait-for-bootstrap
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Fri, 23 Jul 2021 15:14:46 +0200
      Finished:     Fri, 23 Jul 2021 15:14:46 +0200
    Ready:          False
    Restart Count:  5
    Environment Variables from:
      tezos-config  ConfigMap  Optional: false
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-nf8zz (ro)
      /var/tezos from var-volume (rw)
  config-generator:
    Container ID:
    Image:         tezos-k8s-utils:dev
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Args:
      config-generator
      --generate-config-json
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment Variables from:
      tezos-secret  Secret     Optional: false
      tezos-config  ConfigMap  Optional: false
    Environment:
      MY_POD_IP:      (v1:status.podIP)
      MY_POD_NAME:    tezos-baking-node-0 (v1:metadata.name)
      MY_POD_TYPE:    node
      MY_NODE_CLASS:  tezos-baking-node
    Mounts:
      /etc/tezos from config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-nf8zz (ro)
      /var/tezos from var-volume (rw)
Containers:
  tezos-node:
    Container ID:
    Image:         tezo:latest
    Image ID:
    Ports:         8732/TCP, 9732/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      /bin/sh
    Args:
      -c
      set -x
      set
      #
      # Not every error is fatal on start. In particular, with zerotier,
      # the listen-addr may not yet be bound causing tezos-node to fail.
      # So, we try a few times with increasing delays:
      for d in 1 1 5 10 20 60 120; do
          /usr/local/bin/tezos-node run \
              --bootstrap-threshold 0 \
              --config-file /etc/tezos/config.json
          sleep $d
      done
      #
      # Keep the container alive for troubleshooting on failures:
      sleep 3600
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /etc/tezos from config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-nf8zz (ro)
      /var/tezos from var-volume (rw)
  baker-alpha:
    Container ID:
    Image:         tezo:latest
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
    Args:
      -c
      set -ex
      TEZ_VAR=/var/tezos
      TEZ_BIN=/usr/local/bin
      CLIENT_DIR="$TEZ_VAR/client"
      NODE_DIR="$TEZ_VAR/node"
      NODE_DATA_DIR="$TEZ_VAR/node/data"
      proto_command="alpha"
      if [ "${DAEMON}" == "baker" ]; then
          extra_args="with local node $NODE_DATA_DIR"
      fi
      my_baker_account="$(cat /etc/tezos/baker-account )"
      CLIENT="$TEZ_BIN/tezos-client -d $CLIENT_DIR"
      CMD="$TEZ_BIN/tezos-$DAEMON-$proto_command -d $CLIENT_DIR"
      while ! $CLIENT rpc get chains/main/blocks/head; do
          sleep 5
      done
      exec $CMD run ${extra_args} ${my_baker_account}
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment Variables from:
      tezos-config  ConfigMap  Optional: false
    Environment:
      MY_POD_IP:      (v1:status.podIP)
      MY_POD_NAME:    tezos-baking-node-0 (v1:metadata.name)
      MY_POD_TYPE:    node
      MY_NODE_CLASS:  tezos-baking-node
      DAEMON:         baker
    Mounts:
      /etc/tezos from config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-nf8zz (ro)
      /var/tezos from var-volume (rw)
Conditions:
  Type              Status
  Initialized       False
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  var-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  var-volume-tezos-baking-node-0
    ReadOnly:   false
  dev-net-tun:
    Type:          HostPath (bare host directory volume)
    Path:          /dev/net/tun
    HostPathType:
  config-volume:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  default-token-nf8zz:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-nf8zz
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age                   From               Message
  ----     ------            ----                  ----               -------
  Warning  FailedScheduling  3m36s                 default-scheduler  0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
  Warning  FailedScheduling  3m36s                 default-scheduler  0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
  Normal   Scheduled         3m33s                 default-scheduler  Successfully assigned tezos-testnet/tezos-baking-node-0 to minikube
  Normal   Pulled            117s (x5 over 3m32s)  kubelet            Container image "tezos-k8s-utils:dev" already present on machine
  Normal   Created           117s (x5 over 3m32s)  kubelet            Created container wait-for-bootstrap
  Normal   Started           117s (x5 over 3m32s)  kubelet            Started container wait-for-bootstrap
  Warning  BackOff           106s (x10 over 3m30s) kubelet            Back-off restarting failed container
One way to confirm that -b works is to watch the output of the docker build as it is happening. Is docker reporting that it is using cached layers, or is it building the layers?

It looks like there is an error in the wait-for-bootstrap init container. Do you have the logs?
> To confirm that -b works is to see the output of the docker build as it is happening. Is docker reporting that it is using cached layers? Or is it building the layers?

Ok. I will try once more to reproduce this.

> It looks like there is an error in the wait-for-bootstrap init container.

Here are the logs but I have the impression it is not relevant:
$ kubectl -n tezos-testnet logs tezos-baking-node-0 wait-for-bootstrap
+ CMD=wait-for-bootstrap
+ shift
+ exec /wait-for-bootstrap.sh
jq: error (at <stdin>:33): Cannot index array with string "is_bootstrap_node"
No bootstrap nodes were provided
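For what it's worth, that jq error is meaningful on its own: it is what happens when a filter like .is_bootstrap_node is applied to a JSON array instead of an object. A rough Python analogue of the failure mode (the input shape here is an assumption for illustration, not the actual /etc/tezos config):

```python
import json

# A list of node instances, none of which sets is_bootstrap_node --
# roughly the shape the wait-for-bootstrap script is consuming.
instances = json.loads('[{"bake_using_account": "baker0"}]')

# Indexing the array itself with a string key fails, like jq's
# 'Cannot index array with string "is_bootstrap_node"':
try:
    instances["is_bootstrap_node"]
except TypeError:
    print("cannot index an array with a string key")

# Looking inside each element works, but finds no bootstrap node:
bootstrap = [i for i in instances if i.get("is_bootstrap_node")]
print(len(bootstrap))  # 0 -> "No bootstrap nodes were provided"
```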
> Here are the logs but I have the impression it is not relevant:

This is an error. It means that none of the nodes specified in values.yaml are set as a bootstrap node, so nodes don't know which node(s) to connect to after a chain has been activated. You can mark an instance as a bootstrap node with the is_bootstrap_node property:
nodes:
  tezos-baking-node:
    storage_size: 15Gi
    runs:
      - baker
      - endorser
    instances:
      - bake_using_account: baker0
        is_bootstrap_node: true
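To catch this before deploying, one could sanity-check the parsed values with a small hypothetical helper (the dict below mirrors the snippet above; the helper name and check are made up for illustration, not part of tezos-k8s):

```python
# Mirrors the values.yaml snippet above as a plain dict, to avoid
# pulling in a YAML parser for the sketch.
values = {
    "nodes": {
        "tezos-baking-node": {
            "storage_size": "15Gi",
            "runs": ["baker", "endorser"],
            "instances": [
                {"bake_using_account": "baker0", "is_bootstrap_node": True},
            ],
        },
    },
}

def has_bootstrap_node(values):
    """True if any instance of any node class sets is_bootstrap_node."""
    return any(
        inst.get("is_bootstrap_node", False)
        for node in values["nodes"].values()
        for inst in node.get("instances", [])
    )

print(has_bootstrap_node(values))  # True; False would reproduce the jq failure above
```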
@hai-nguyen-van Is this resolved? Any other issues?
After discussions with @elric1, I found a strange behavior when re-building the Docker image tezos-k8s-utils using devspace build -b -t dev --skip-push. I faced this warning and ignored it at first:

When trying to deploy the testnet, I would keep having Init:CrashLoopBackOff without further explanations:

The above warning message could be amended in a way to insist on the fact that devspace ...