Nordix / Meridio

Facilitator of attraction and distribution of external traffic within Kubernetes via secondary networks
https://meridio.nordix.org
Apache License 2.0

Install Meridio on kind cluster: proxy-load-balancer-a1-xxx stuck failed Readiness probe #429

Open dezenxi opened 1 year ago

dezenxi commented 1 year ago

Hi, I'm following the instructions at https://meridio.nordix.org/docs/demo/multus-kind-ovs/#installation to install Meridio on a kind cluster.

However, some pods are stuck in a not-ready state, as shown below:

NAME                                            READY   STATUS                  RESTARTS      AGE
ipam-trench-a-0                                 1/1     Running                 0             6m41s
meridio-operator-548db54687-njlcw               1/1     Running                 0             7m35s
nsp-trench-a-0                                  1/1     Running                 0             6m41s
proxy-load-balancer-a1-9mfq6                    0/1     Running                 0             6m40s
proxy-load-balancer-a1-p5mrg                    0/1     Running                 0             6m39s
stateless-lb-frontend-attr-1-7b6d9d56f7-6gtd4   0/2     Init:CrashLoopBackOff   6 (35s ago)   6m41s
stateless-lb-frontend-attr-1-7b6d9d56f7-9x4sv   0/2     Init:CrashLoopBackOff   6 (42s ago)   6m41s

k describe pod proxy-load-balancer-a1-9mfq6 -n red

    Warning  Unhealthy  17s (x29 over 4m17s)  kubelet  Readiness probe failed: service unhealthy (responded with "NOT_SERVING")

k describe pod stateless-lb-frontend-attr-1-7b6d9d56f7-6gtd4 -n red

    Args:
      -c
      sysctl -w net.ipv4.conf.all.forwarding=1 ; sysctl -w net.ipv4.fib_multipath_hash_policy=1 ; sysctl -w net.ipv4.conf.all.rp_filter=0 ; sysctl -w net.ipv4.conf.default.rp_filter=0 ; sysctl -w net.ipv6.conf.all.forwarding=1 ; sysctl -w net.ipv6.fib_multipath_hash_policy=1
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Tue, 13 Jun 2023 23:35:43 +1000
      Finished:     Tue, 13 Jun 2023 23:35:43 +1000
    Ready:          False

    Warning  BackOff  11m (x4 over 12m)  kubelet  Back-off restarting failed container

I strictly followed the steps in https://meridio.nordix.org/docs/demo/multus-kind-ovs/#installation but I still don't know why the proxy and frontend pods are not up and running. I need your help to install Meridio on a kind cluster. Are there any prerequisites for installing Meridio on a kind cluster?

Regards, Duong

LionelJouin commented 1 year ago

I haven't tried these instructions recently. You can try this to set up your KinD cluster (Spire, NSM, GW/TG):

make -s -C ./docs/demo/scripts/kind

And then you can install Meridio like this:

helm install meridio-crds https://artifactory.nordix.org/artifactory/cloud-native/meridio/Meridio-CRDs-v1.0.6.tgz --create-namespace
helm install meridio https://artifactory.nordix.org/artifactory/cloud-native/meridio/Meridio-v1.0.6.tgz --create-namespace

Multus can be installed in the same way:

kubectl apply -f https://raw.githubusercontent.com/Nordix/xcluster/master/ovl/multus/multus-install.yaml
kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/whereabouts/master/doc/crds/daemonset-install.yaml
kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/whereabouts/master/doc/crds/whereabouts.cni.cncf.io_ippools.yaml
kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/whereabouts/master/doc/crds/whereabouts.cni.cncf.io_overlappingrangeipreservations.yaml

dezenxi commented 1 year ago

Hi @LionelJouin ,

Thanks for the response. I followed your instructions, but the installation still fails at the step kubectl apply -f docs/demo/multus-meridio.yaml -n red

The proxy-load-balancer pods are still not ready:

Readiness probe failed: service unhealthy (responded with "NOT_SERVING")

The stateless-lb-frontend-attr-1 pods are stuck in the Init:CrashLoopBackOff state, as before.

Regards, Duong

LionelJouin commented 1 year ago

In that case, the first things to check are whether the interface exists in the attractor instance and what the bird status is.

It seems similar to this tutorial case: https://meridio.nordix.org/training/troubleshooting-ctf/scenario-2

The logs of the frontend container could also be useful.
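
For illustration, the three checks above (interface, bird status, frontend logs) could be wrapped in small shell helpers like the ones below. This is only a sketch and not taken from the Meridio documentation: the helper names, the `KUBECTL` and `BIRD_SOCKET` variables, the default container name `frontend`, and the bird control socket path are all assumptions, so adjust them to what `kubectl describe pod` shows for your deployment.

```shell
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the
# commands without touching a cluster.
KUBECTL="${KUBECTL:-kubectl}"

# Usage for all helpers: <helper> <namespace> <pod> [container]
# The container name defaults to "frontend", which is an assumption.

# List interfaces and addresses inside the frontend container, to see
# whether the external interface configured in the Attractor exists.
frontend_interfaces() {
  $KUBECTL exec -n "$1" "$2" -c "${3:-frontend}" -- ip addr show
}

# Show the state of all BIRD protocols (BGP, BFD, static, ...).
# The socket path is an assumption; override it with BIRD_SOCKET.
frontend_bird_status() {
  $KUBECTL exec -n "$1" "$2" -c "${3:-frontend}" -- \
    birdc -s "${BIRD_SOCKET:-/var/run/bird/bird.ctl}" show protocols all
}

# Print the logs of the frontend container.
frontend_logs() {
  $KUBECTL logs -n "$1" "$2" -c "${3:-frontend}"
}
```

For example, `frontend_interfaces red stateless-lb-frontend-attr-1-7b6d9d56f7-6gtd4` would list the interfaces in one of the failing pods from this issue, provided its frontend container has started.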

dezenxi commented 1 year ago

Hi @LionelJouin ,

Do I need to create the NetworkAttachmentDefinition meridio-nad, as in https://meridio.nordix.org/docs/demo/multus-kind-ovs/#installation?

Regards, Duong

LionelJouin commented 1 year ago

If you use a network-attachment type of interface in your attractor, then yes, you will need to create a NAD (NetworkAttachmentDefinition).

dezenxi commented 1 year ago

Hi @LionelJouin , I updated the Attractor in multus-meridio.yaml to use the nsm-vlan interface type:

    name: ext-vlan0
    ipv4-prefix: 169.254.100.0/24
    ipv6-prefix: 100:100::/64
    type: nsm-vlan
    nsm-vlan:
      vlan-id: 100
      base-interface: eth0

Now all pods are up and running:

NAME                                            READY   STATUS    RESTARTS   AGE
ipam-trench-a-0                                 1/1     Running   0          36m
meridio-operator-598994c599-6mdjp               1/1     Running   0          46m
nse-vlan-attr-1-b4659bfb4-k672g                 1/1     Running   0          36m
nsp-trench-a-0                                  1/1     Running   0          36m
proxy-load-balancer-a1-lb5tx                    1/1     Running   0          36m
proxy-load-balancer-a1-n57vk                    1/1     Running   0          36m
stateless-lb-frontend-attr-1-5d4d576cc6-pmt78   3/3     Running   0          36m
stateless-lb-frontend-attr-1-5d4d576cc6-qqb97   3/3     Running   0          36m

I'll try the network-attachment type later. Thank you very much.

Regards, Duong

dezenxi commented 1 year ago

Hi @LionelJouin , do you think the frontend should check for the external interface before starting bird (BGP/BFD)? With a static config, I can see that the frontend reaches the ready state even though the interface is missing. An explicit check that prints a message when the interface is missing would save a lot of troubleshooting time, for example "Missing interface XYZ, check attractor config or CNI...."
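
To make the suggestion concrete, here is a minimal shell sketch of such a pre-start check. It is not how the Meridio frontend is implemented; the function names and the `SYS_CLASS_NET` override are hypothetical and only illustrate the idea of failing early with an actionable message.

```shell
# Return 0 if the named network interface exists in the current
# network namespace. SYS_CLASS_NET is a hypothetical override of the
# sysfs directory, mainly useful for testing.
interface_exists() {
  [ -e "${SYS_CLASS_NET:-/sys/class/net}/$1" ]
}

# Fail early with an explicit message instead of starting the routing
# daemon against a missing interface.
require_interface() {
  if ! interface_exists "$1"; then
    echo "Missing interface $1: check attractor config or CNI" >&2
    return 1
  fi
}
```

A startup script could then run `require_interface ext-vlan0 || exit 1` before launching bird, so the missing interface shows up directly in the container logs.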

Regards, Duong
