Pods stuck in Init state, created container wait-nrf

ritokispingvin commented 2 years ago

Hi, after installing the free-5gc project with helm, some of my pods get stuck in the Init state and don't come up even after waiting several minutes. Sometimes only 2 hang, but sometimes 4-5 pods hang. I saw that an update was made yesterday that is probably about this issue: "Fix initContainer curl command waiting for NRF ready / Add --insecure…", but I'm still experiencing the issue. Can you please help me overcome it?

Thank you and best regards, ritokispingvin

raoufkh commented 2 years ago

Hello!

Can you provide the following information, please?

Architecture of your K8s cluster (how many nodes for the control plane and how many workers)
The result of kubectl -n <your-namespace-there> get po -o wide
Logs from the init containers of the Pods stuck in the Init state (e.g. kubectl -n <your-namespace-there> logs <pod-name> -c wait-nrf)
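
To see at a glance which node each stuck Pod landed on, the output of the second command can be filtered. This is only a sketch: the helper name stuck_init_pods is made up for illustration, and it assumes the default column order of `kubectl get po -o wide` (NAME READY STATUS RESTARTS AGE IP NODE ...) with a plain numeric RESTARTS value.

```shell
# Hypothetical helper: print "<pod> <node>" for every Pod whose STATUS
# column starts with "Init:". Reads `kubectl get po -o wide` output on stdin.
# Assumes the default column order, so STATUS is field 3 and NODE is field 7.
stuck_init_pods() {
  awk 'NR > 1 && $3 ~ /^Init:/ { print $1, $7 }'
}

# Usage (needs cluster access; the namespace is a placeholder):
#   kubectl -n <your-namespace-there> get po -o wide | stuck_init_pods
```

If all listed Pods share one node, that points at the node (or its networking) rather than at the charts.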

Regards, Abderaouf

ritokispingvin commented 2 years ago

Hi,

please find the requested logs below. 192.168.56.103 is my master node; .102 and .101 are my worker nodes. It looks like the problem is with the pods running on .102.

ubuntu@ubuntu:~$ kubectl -n pvolume logs v3.0.6-free5gc-pcf-pcf-646cc9b75f-gmpsd -c wait-nrf

curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies waiting for dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1 waiting for dependencies
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1 waiting for dependencies
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1 waiting for dependencies
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1 waiting for dependencies
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1 waiting for dependencies
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies waiting for dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1 waiting for dependencies
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1 waiting for dependencies
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1 waiting for dependencies
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1 waiting for dependencies
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1 waiting for dependencies
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1 waiting for dependencies
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ] waiting for dependencies
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ] waiting for dependencies
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1 waiting for dependencies
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1 waiting for dependencies
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1 waiting for dependencies
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1 waiting for dependencies
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1 waiting for dependencies
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1 waiting for dependencies
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies waiting for dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1 waiting for dependencies
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies waiting for dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1 waiting for dependencies
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies waiting for dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1 waiting for dependencies
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000 waiting for dependencies
'[' 000 -ne 200 ]
echo waiting 'for' dependencies
sleep 1
curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' http://nrf-nnrf:8000
ubuntu@ubuntu:~$
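
For readers following the trace: it is consistent with a simple retry loop around curl. The sketch below is reconstructed from the trace only and is not the chart's actual script; the function name wait_for_nrf and the optional MAX_TRIES argument are assumptions added for illustration. The status 000 means curl received no HTTP response at all (name resolution failed, the connection was refused, or it timed out), so the NRF was never reached from this Pod.

```shell
# Sketch of a wait loop matching the trace; NOT the chart's real script.
# Usage: wait_for_nrf URL [MAX_TRIES]
# MAX_TRIES is a hypothetical addition (0 or unset = retry forever).
wait_for_nrf() {
  url=$1
  max_tries=${2:-0}
  tries=0
  while :; do
    # -w '%{http_code}' prints only the HTTP status; curl prints 000
    # when no HTTP response was received.
    code=$(curl --connect-timeout 1 -s -o /dev/null -w '%{http_code}' "$url")
    if [ "${code:-000}" -eq 200 ]; then
      return 0
    fi
    tries=$((tries + 1))
    if [ "$max_tries" -gt 0 ] && [ "$tries" -ge "$max_tries" ]; then
      echo "giving up after $tries tries (last status: ${code:-000})" >&2
      return 1
    fi
    echo "waiting for dependencies"
    sleep 1
  done
}

# Example: wait_for_nrf http://nrf-nnrf:8000 30
```

A bounded retry count makes the init container fail visibly instead of looping forever, which is a design choice of this sketch rather than something the trace shows.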

Best Regards, ritokispingvin

raoufkh commented 2 years ago

Pods other than UPF, mongo and webui wait for the NRF to be ready during the init phase. It seems that only Pods scheduled on a different node from the one running the NRF get stuck in the Init state. Are you sure that communication between Pods on different worker nodes is possible in your cluster?

If it is a test cluster, can you try draining and then removing the .102 node from the cluster to check?

Another option is to set the nodeSelector field to the .101 node's labels on all deployments.

NOTE: all our Helm charts allow customizing this field (e.g. free5gc-amf.amf.nodeSelector if you want to do it for AMF from the main Helm chart), but this will take a bit more time than the first approach.
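
As a rough illustration of the nodeSelector approach, the value can be passed with helm's --set flag. The sketch below assumes the node carries the standard kubernetes.io/hostname label; the helper name node_selector_set_arg, the node name worker-101, and the release and chart placeholders are hypothetical. Dots inside the label key have to be escaped with a backslash so helm does not treat them as path separators.

```shell
# Hypothetical helper: build a --set argument that pins a subchart to one node.
# $1 = values prefix (e.g. free5gc-amf.amf), $2 = node hostname label value.
# The printf format emits a literal backslash before ".io" for helm.
node_selector_set_arg() {
  printf '%s.nodeSelector.kubernetes\\.io/hostname=%s\n' "$1" "$2"
}

# Usage (release, chart and namespace are placeholders):
#   helm upgrade <release> <chart> -n <your-namespace-there> --reuse-values \
#     --set "$(node_selector_set_arg free5gc-amf.amf worker-101)"
```

The same argument would need to be repeated with the corresponding prefix for each network function that should be pinned.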

Regards, Abderaouf

ritokispingvin commented 2 years ago

I'm able to SSH from every node into all the other nodes without a password, so connectivity should be OK. After customizing the nodeSelector field to use .101, all pods are running now. The ticket can be closed; many thanks for the tips!

Best Regards, ritokispingvin

raoufkh commented 2 years ago

Great! However, scheduling all pods on the same node is only a troubleshooting workaround. The ideal solution is to fix the connectivity problem between Pods on different nodes, which can be related to DNS problems or to the CNI used.
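
One way to narrow this down is to look at curl's exit code instead of the printed 000 status, because the exit code separates name-resolution failures from connection failures. The sketch below is a minimal aid, not a full diagnosis: the function name explain_curl_exit is made up, and the suggested causes are only likely candidates. Exit codes 6, 7 and 28 are curl's documented codes for "could not resolve host", "failed to connect" and "operation timed out".

```shell
# Hypothetical helper: map a curl exit code to a likely area to investigate.
explain_curl_exit() {
  case "$1" in
    0)  echo "HTTP response received: network path and DNS are working" ;;
    6)  echo "could not resolve host: check cluster DNS from this node" ;;
    7)  echo "failed to connect: check Service endpoints and pod-to-pod routing (CNI)" ;;
    28) echo "timed out: resolver or peer did not answer in time (CNI, firewall, DNS)" ;;
    *)  echo "curl exit code $1: see the EXIT CODES section of the curl man page" ;;
  esac
}

# Usage from a shell inside an affected Pod:
#   curl --connect-timeout 1 -s -o /dev/null http://nrf-nnrf:8000; explain_curl_exit $?
```

Note that working SSH between nodes only shows that the node network is fine; Pod-to-Pod traffic goes through the CNI overlay and cluster DNS, which can fail independently.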

Orange-OpenSource / towards5gs-helm

Pods stuck in Init state, created container wait-nrf #13