cilium / cilium-cli

CLI to install, manage & troubleshoot Kubernetes clusters running Cilium
https://cilium.io
Apache License 2.0

Enabling `enable-local-redirect-policy` indefinitely waits on CRD registration #946

Closed jlaffaye closed 3 weeks ago

jlaffaye commented 2 years ago

Bug report

General Information

How to reproduce the issue

  1. Run `cilium config set enable-local-redirect-policy true`.
  2. All agents are restarted but fail to start; they are stuck at "waiting for all CRDs" indefinitely.

Restarting cilium-operator fixes the issue by creating the ciliumlocalredirectpolicies.cilium.io CRD.

Not sure if it's a CLI issue (the CLI should restart cilium-operator) or an operator issue (the operator should pick up the ConfigMap change without a restart).
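
A minimal sketch of the workaround described above, for anyone who wants to script it. It assumes Cilium is installed in the kube-system namespace and that the operator runs as a Deployment named cilium-operator; both are assumptions, so adjust them for your install. The helper name `ensure_lrp_crd` is hypothetical.

```shell
#!/bin/sh
# Assumption: Cilium lives in kube-system unless NAMESPACE says otherwise.
NAMESPACE="${NAMESPACE:-kube-system}"
CRD="ciliumlocalredirectpolicies.cilium.io"

# Hypothetical helper: restart cilium-operator only when the LRP CRD is
# absent, then wait for the operator rollout to finish.
ensure_lrp_crd() {
    if kubectl get crd "$CRD" >/dev/null 2>&1; then
        echo "CRD $CRD is already registered"
        return 0
    fi
    echo "CRD $CRD is missing; restarting cilium-operator"
    # Assumption: the operator Deployment is named cilium-operator.
    kubectl -n "$NAMESPACE" rollout restart deployment/cilium-operator &&
        kubectl -n "$NAMESPACE" rollout status deployment/cilium-operator --timeout=120s
}

# Usage (after `cilium config set enable-local-redirect-policy true`):
#   ensure_lrp_crd
```

Checking for the CRD first avoids an unnecessary operator restart when the CRD is already registered.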

Jiang1155 commented 1 year ago

I saw the same issue on 1.11 and 1.13.1. I can provide more info if needed.

brb commented 1 year ago

@aditighag MBOI

aditighag commented 1 year ago

Hi @jlaffaye Sorry, looks like the issue fell through the cracks.

> restarting cilium-operator fixes the issue by creating the ciliumlocalredirectpolicies.cilium.io CRD. Not sure if it's a CLI issue that should restart cilium-operator or an operator issue that should pick up the configmap change without a restart.

That's a fair point! cilium config set internally restarts cilium agent pods by default, so maybe it also makes sense to restart cilium-operator. Cilium operator is tasked with registering all CRDs, and so you may see this issue for other features as well. I'll bring this up in the community meeting to get more insights. You are welcome to join the discussion.
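
Since the operator registers the CRDs, one way to see whether other features are affected is to compare the CRDs a cluster has registered against the ones you expect. A rough sketch follows. The helper name `missing_cilium_crds` is hypothetical, and the list of expected CRD names is something you supply yourself; nothing here derives it from the Cilium configuration.

```shell
#!/bin/sh
# Hypothetical helper: print every expected CRD name (given as arguments)
# that is not registered. Returns 0 if none are missing, 1 if some are,
# and 2 if the CRD list could not be fetched.
missing_cilium_crds() {
    # `kubectl get crd -o name` prints one
    # "customresourcedefinition.apiextensions.k8s.io/<name>" per line.
    registered="$(kubectl get crd -o name)" || return 2
    status=0
    for crd in "$@"; do
        if ! printf '%s\n' "$registered" |
            grep -qxF "customresourcedefinition.apiextensions.k8s.io/$crd"; then
            echo "$crd"
            status=1
        fi
    done
    return "$status"
}

# Usage:
#   missing_cilium_crds ciliumlocalredirectpolicies.cilium.io
```

If the function prints a name, the agents may be waiting on that CRD, and restarting cilium-operator is the workaround reported in this thread.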

atsai1220 commented 7 months ago

In case anyone runs into this from the Internet, restarting Cilium operator got the ball rolling.

We got stuck configuring CiliumLocalRedirectPolicy. In our environment with Cilium 1.14.4, we had to specify .spec.redirectFrontend.serviceMatcher.toPorts; otherwise the LRP list does not show the endpoints and traffic is not routed to the node-local DNS cache.

Here are the steps to verify:

  1. Create this CiliumLocalRedirectPolicy without .spec.redirectFrontend.serviceMatcher.toPorts. This is as described in the documentation and the provided example.

    apiVersion: cilium.io/v2
    kind: CiliumLocalRedirectPolicy
    metadata:
      name: node-local-dns
      namespace: kube-system
    spec:
      redirectBackend:
        localEndpointSelector:
          matchLabels:
            k8s-app: node-local-dns
        toPorts:
        - name: dns
          port: "53"
          protocol: UDP
        - name: dns-tcp
          port: "53"
          protocol: TCP
      redirectFrontend:
        serviceMatcher:
          namespace: kube-system
          serviceName: rke2-coredns-rke2-coredns
  2. View your lrp list. You will not see your pods on the right side of the arrow and traffic is not directed to node-local dns cache.

    ❯ kubectl -n kube-system exec ds/cilium -- cilium lrp list
    LRP namespace   LRP name         FrontendType                Matching Service
    kube-system     node-local-dns   clusterIP + all svc ports   kube-system/rke2-coredns-rke2-coredns
                    |                10.43.0.10:53/UDP ->
                    |                10.43.0.10:53/TCP ->
  3. Delete your existing lrp. At the time of writing, there is a limitation that prevents modifications to an existing lrp.

    kubectl delete ciliumlocalredirectpolicies -n kube-system node-local-dns
  4. Create a new CiliumLocalRedirectPolicy with .spec.redirectFrontend.serviceMatcher.toPorts.

    apiVersion: cilium.io/v2
    kind: CiliumLocalRedirectPolicy
    metadata:
      name: node-local-dns
      namespace: kube-system
    spec:
      redirectBackend:
        localEndpointSelector:
          matchLabels:
            k8s-app: node-local-dns
        toPorts:
        - name: dns
          port: "53"
          protocol: UDP
        - name: dns-tcp
          port: "53"
          protocol: TCP
      redirectFrontend:
        serviceMatcher:
          namespace: kube-system
          serviceName: rke2-coredns-rke2-coredns
          toPorts:
            - name: dns-tcp
              port: "53"
              protocol: TCP
            - name: dns
              port: "53"
              protocol: UDP
  5. Verify your lrp list has backend pods and traffic routed to your node-local dns cache. The output will show your node-local-dns pod IPs.

    ❯ kubectl -n kube-system exec ds/cilium -- cilium lrp list
    LRP namespace   LRP name         FrontendType              Matching Service
    kube-system     node-local-dns   clusterIP + named ports   kube-system/rke2-coredns-rke2-coredns
                    |                10.43.0.10:53/TCP -> 10.42.3.193:53(kube-system/node-local-dns-sbtdr),
                    |                10.43.0.10:53/UDP -> 10.42.3.193:53(kube-system/node-local-dns-sbtdr),

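
The check in the verification steps above (is there anything to the right of the arrow?) can be scripted. This is only a sketch that assumes the `cilium lrp list` output keeps the layout shown in this comment, with one `| <frontend> -> <backends>` line per frontend; the helper name `lrp_frontends_without_backends` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `cilium lrp list` output on stdin and print
# each frontend whose line ends right after "->", i.e. has no backends.
# Assumption: frontend lines look like "| 10.43.0.10:53/UDP -> ...".
lrp_frontends_without_backends() {
    awk '/->[[:space:]]*$/ { print $2 }'
}

# Usage:
#   kubectl -n kube-system exec ds/cilium -- cilium lrp list |
#       lrp_frontends_without_backends
```

No output means every frontend line lists at least one backend; any printed address is a frontend that is not being redirected to a node-local pod.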
github-actions[bot] commented 1 month ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

github-actions[bot] commented 3 weeks ago

This issue has not seen any activity since it was marked stale. Closing.