cilium / cilium-service-mesh-beta

Instructions and issue tracking for Service Mesh capabilities of Cilium
Apache License 2.0

Install attaches wrong hubble relay image version #17

Closed youngjeong46 closed 2 years ago

youngjeong46 commented 2 years ago

What happened?

I followed the installation instructions for the Cilium Service Mesh beta, and the install does not set the right image version for hubble-relay.

Specifically, when I run these commands:

cilium install --version -service-mesh:v1.11.0-beta.1 --config enable-envoy-config=true --kube-proxy-replacement=probe
cilium hubble enable

The output shows that the relay image is incorrectly tagged as quay.io/cilium/hubble-relay:-service-mesh:v1.11.0-beta.1:

✅ Cilium was successfully installed! Run 'cilium status' to view installation health
❯ cilium hubble enable
🔑 Found CA in secret cilium-ca
✨ Patching ConfigMap cilium-config to enable Hubble...
♻️  Restarted Cilium pods
⌛ Waiting for Cilium to become ready before deploying other Hubble component(s)...
🔑 Generating certificates for Relay...
✨ Deploying Relay from quay.io/cilium/hubble-relay:-service-mesh:v1.11.0-beta.1...
⌛ Waiting for Hubble to be installed...
    /¯¯\
 /¯¯\__/¯¯\    Cilium:         OK
 \__/¯¯\__/    Operator:       OK
 /¯¯\__/¯¯\    Hubble:         1 errors, 1 warnings
 \__/¯¯\__/    ClusterMesh:    disabled
    \__/

DaemonSet         cilium             Desired: 4, Ready: 4/4, Available: 4/4
Deployment        cilium-operator    Desired: 1, Ready: 1/1, Available: 1/1
Deployment        hubble-relay       Desired: 1, Unavailable: 1/1
Containers:       cilium             Running: 4
                  cilium-operator    Running: 1
                  hubble-relay       Pending: 1
Cluster Pods:     17/17 managed by Cilium
Image versions    hubble-relay       quay.io/cilium/hubble-relay:-service-mesh:v1.11.0-beta.1: 1
                  cilium             quay.io/cilium/cilium-service-mesh:v1.11.0-beta.1: 4
                  cilium-operator    quay.io/cilium/operator-aws-service-mesh:v1.11.0-beta.1: 1
Errors:           hubble-relay       hubble-relay                     1 pods of Deployment hubble-relay are not ready
Warnings:         hubble-relay       hubble-relay-79cb8b485c-hhd4v    pod is pending

Error: Unable to enable Hubble:  timeout while waiting for status to become successful: context deadline exceeded

Manually correcting the image on the hubble-relay Deployment with kubectl edit resolves the issue.
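For anyone who prefers a non-interactive alternative to kubectl edit, here is a minimal sketch of the same manual correction using kubectl set image. The helper name fix_relay_image is hypothetical. The kube-system namespace and the hubble-relay container name are assumptions based on the status output above, so check them against your cluster first. The image reference is the one the working install reports later in this thread.

```shell
# Hypothetical helper: point the hubble-relay Deployment at the intended image.
# Assumptions: Cilium runs in kube-system and the container is named hubble-relay.
fix_relay_image() {
  image="quay.io/cilium/hubble-relay-service-mesh:v1.11.0-beta.1"
  kubectl -n kube-system set image deployment/hubble-relay "hubble-relay=${image}"
}
```

Calling fix_relay_image once should trigger a rollout of the Deployment with the corrected image; cilium status can then be used to confirm that hubble-relay becomes ready.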

Cilium Version

cilium-cli: v0.10.1 compiled with go1.17.6 on darwin/amd64
cilium image (default): v1.11.1
cilium image (stable): v1.11.1
cilium image (running): -service-mesh:v1.11.0-beta.1

Kernel Version

Darwin 3c22fbbff4ba.ant.amazon.com 19.6.0 Darwin Kernel Version 19.6.0: Sun Nov 14 19:58:51 PST 2021; root:xnu-6153.141.50~1/RELEASE_X86_64 x86_64

Kubernetes Version

Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.5", GitCommit:"5c99e2ac2ff9a3c549d9ca665e7bc05a3e18f07e", GitTreeState:"clean", BuildDate:"2021-12-16T08:38:33Z", GoVersion:"go1.16.12", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"21+", GitVersion:"v1.21.5-eks-bc4871b", GitCommit:"5236faf39f1b7a7dabea8df12726f25608131aa9", GitTreeState:"clean", BuildDate:"2021-10-29T23:32:16Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}

Sysdump

🔍 Collecting Kubernetes nodes
🔍 Collect Kubernetes nodes
🔍 Collecting Kubernetes events
🔍 Collecting Kubernetes pods
🔍 Collecting Kubernetes services
🔍 Collecting Cilium network policies
🔍 Collecting Cilium egress NAT policies
🔍 Collecting Cilium endpoints
🔍 Collecting Kubernetes network policies
🔍 Collecting Kubernetes pods summary
🔍 Collecting Cilium cluster-wide network policies
🔍 Collecting Kubernetes namespaces
🔍 Collect Kubernetes version
🔍 Collecting Cilium local redirect policies
🔍 Collecting Cilium identities
🔍 Collecting Cilium nodes
🔍 Collecting Cilium etcd secret
🔍 Collecting the Cilium configuration
🔍 Collecting the Cilium daemonset
🔍 Collecting the Hubble daemonset
🔍 Collecting the Hubble Relay configuration
🔍 Collecting the Hubble Relay deployment
🔍 Collecting the Hubble UI deployment
🔍 Collecting the Cilium operator deployment
🔍 Collecting the 'clustermesh-apiserver' deployment
🔍 Collecting gops stats from Cilium pods
🔍 Collecting gops stats from Hubble pods
🔍 Collecting 'cilium-bugtool' output from Cilium pods
🔍 Collecting gops stats from Hubble Relay pods
🔍 Collecting logs from Cilium pods
🔍 Collecting logs from Cilium operator pods
🔍 Collecting logs from 'clustermesh-apiserver' pods
⚠️ Deployment "clustermesh-apiserver" not found in namespace "kube-system" - this is expected if 'clustermesh-apiserver' isn't enabled
🔍 Collecting logs from Hubble pods
🔍 Collecting logs from Hubble Relay pods
🔍 Collecting logs from Hubble UI pods
🔍 Collecting platform-specific data
🔍 Collecting Hubble flows from Cilium pods
⚠️ The following tasks failed, the sysdump may be incomplete:
⚠️ [10] Collecting Cilium egress NAT policies: failed to collect Cilium egress NAT policies: the server could not find the requested resource (get ciliumegressnatpolicies.cilium.io)
⚠️ [11] Collecting Cilium local redirect policies: failed to collect Cilium local redirect policies: the server could not find the requested resource (get ciliumlocalredirectpolicies.cilium.io)
⚠️ Please note that depending on your Cilium version and installation options, this may be expected
🗳 Compiling sysdump
✅ The sysdump has been saved to /Users/yojeo/src/microservices-demo/cilium-sysdump-20220126-131409.zip

Relevant log output

No response

Anything else?

No response

lizrice commented 2 years ago

Thank you @youngjeong46. It looks like this behavior changed in the Cilium CLI between v0.10.0 and v0.10.1.

twpayne commented 2 years ago

@tklauser did some investigation and narrowed it down to https://github.com/cilium/cilium-cli/pull/660 accidentally dropping the functionality from https://github.com/cilium/cilium-cli/commit/a7670ef7264d9e1a28407c8918d03c53c1f72a56#diff-fb56ec6385ea3f0ef77e77ca81cbb2eb278f6f32ffdd5e0ca0c8ea6824b0b08a.
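To make the difference concrete, here is a minimal sketch of the behavior this thread suggests was lost. It is not the cilium-cli implementation. Judging only from the broken and working image references in this thread, it assumes that a version string starting with "-" is meant to be appended to the image name as a suffix that carries its own tag, while any other version is used as a plain tag. The function name build_image_ref is hypothetical.

```shell
# Hypothetical helper, not the cilium-cli implementation.
# A version starting with "-" is treated as an image-name suffix that already
# includes its own tag; any other version is treated as a plain tag.
build_image_ref() {
  base="$1"
  version="$2"
  case "$version" in
    -*) printf '%s%s\n' "$base" "$version" ;;
    *)  printf '%s:%s\n' "$base" "$version" ;;
  esac
}

build_image_ref quay.io/cilium/hubble-relay -service-mesh:v1.11.0-beta.1
# prints quay.io/cilium/hubble-relay-service-mesh:v1.11.0-beta.1
build_image_ref quay.io/cilium/hubble-relay v1.11.1
# prints quay.io/cilium/hubble-relay:v1.11.1
```

Without the first branch, the suffix is joined with a colon, which yields the invalid reference quay.io/cilium/hubble-relay:-service-mesh:v1.11.0-beta.1 reported above.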

twpayne commented 2 years ago

https://github.com/cilium/cilium-cli/pull/694 should fix this.

twpayne commented 2 years ago

As an immediate workaround, you can pass the --relay-version flag to cilium hubble enable:

$ cilium hubble enable --relay-version -service-mesh:v1.11.0-beta.1
🔑 Found CA in secret cilium-ca
✨ Patching ConfigMap cilium-config to enable Hubble...
♻️  Restarted Cilium pods
⌛ Waiting for Cilium to become ready before deploying other Hubble component(s)...
🔑 Generating certificates for Relay...
✨ Deploying Relay from quay.io/cilium/hubble-relay-service-mesh:v1.11.0-beta.1...
⌛ Waiting for Hubble to be installed...
✅ Hubble was successfully enabled!

tklauser commented 2 years ago

This issue is fixed in cilium-cli v0.10.2, which we've just released: https://github.com/cilium/cilium-cli/releases/tag/v0.10.2