Closed: kwonmha closed this issue 2 years ago.
I'm sorry to hear that you are facing issues with MicroK8s. With regard to knative and istio, I think it has something to do with Kubeflow's istio components: the way the enable command works, it checks for certain pods to be present, and Kubeflow ships its own istio components. This is something that needs to be fixed on our side. For the moment you can try to install knative from upstream. 🙁
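If it helps, here is a minimal sketch of installing Knative Serving from the upstream manifests; the release version is only illustrative, so check the Knative docs for the one matching your cluster:

# install the Knative Serving CRDs first, then the core controllers
microk8s kubectl apply -f https://github.com/knative/serving/releases/download/v0.21.0/serving-crds.yaml
microk8s kubectl apply -f https://github.com/knative/serving/releases/download/v0.21.0/serving-core.yaml
# verify the knative-serving pods come up before adding a networking layer (istio, kourier, ...)
microk8s kubectl get pods -n knative-serving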
Thanks for the reply. I tried to install istio with the yaml below as a prerequisite for knative, as suggested by the knative docs.
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  values:
    global:
      proxy:
        autoInject: disabled
      useMCP: false
      # The third-party-jwt is not enabled on all k8s.
      # See: https://istio.io/docs/ops/best-practices/security/#configure-third-party-service-account-tokens
      jwtPolicy: first-party-jwt
  addonComponents:
    pilot:
      enabled: true
  components:
    ingressGateways:
      - name: istio-ingressgateway
        enabled: true
And it says
(kubeflow) user01@user-X299-WU8:/data/mhkwon/kubeflow$ istioctl install -f istio-install.yaml
This will install the Istio default profile with ["Istio core" "Istiod" "Ingress gateways"] components into the cluster. Proceed? (y/N) y
✔ Istio core installed
✘ Istiod encountered an error: failed to wait for resource: resources not ready after 5m0s: timed out waiting for the condition
Deployment/istio-system/istiod
✘ Ingress gateways encountered an error: failed to wait for resource: resources not ready after 5m0s: timed out waiting for the condition
Deployment/istio-system/istio-ingressgateway
✘ Addons encountered an error: failed to wait for resource: resources not ready after 5m0s: timed out waiting for the condition
Deployment/istio-system/istiod
- Pruning removed resources Error: failed to install manifests: errors occurred during operation
I don't know much about istio and knative. Do you mind sharing the output of microk8s kubectl get pods -A -o wide?
This is the output.
(base) user01@user-X299-WU8:/var/snap/microk8s/current/args$ microk8s kubectl get pods -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default iris-training-4925-2sv2p 0/1 Completed 0 22h 10.1.225.135 user-x299-wu8 <none> <none>
kube-system nvidia-device-plugin-daemonset-fg2rw 0/1 CrashLoopBackOff 153 8d 10.1.225.86 user-x299-wu8 <none> <none>
kubeflow istio-pilot-57b4fc6d5c-8sq6h 0/1 Running 10 8d 10.1.225.123 user-x299-wu8 <none> <none>
kubeflow metadata-api-7b8b588ccf-mqk9q 0/1 Running 31 8d 10.1.225.106 user-x299-wu8 <none> <none>
kubeflow metadata-db-0 0/1 Running 10 8d 10.1.225.191 user-x299-wu8 <none> <none>
kubeflow pipelines-db-0 0/1 Running 10 8d 10.1.225.78 user-x299-wu8 <none> <none>
ingress nginx-ingress-microk8s-controller-7wmq7 0/1 Running 11 11d 10.1.225.85 user-x299-wu8 <none> <none>
kubeflow argo-ui-79997765b6-gq2c7 0/1 Running 10 8d 10.1.225.93 user-x299-wu8 <none> <none>
kube-system coredns-86f78bb79c-pj9tk 0/1 Running 10 11d 10.1.225.115 user-x299-wu8 <none> <none>
kubeflow katib-db-0 0/1 Running 10 8d 10.1.225.124 user-x299-wu8 <none> <none>
admin private-reg 0/1 Terminating 231 47h 10.1.225.82 user-x299-wu8 <none> <none>
kubeflow pipelines-api-operator-0 1/1 Running 11 8d 10.1.225.101 user-x299-wu8 <none> <none>
kubeflow katib-controller-operator-0 1/1 Running 11 8d 10.1.225.69 user-x299-wu8 <none> <none>
kubeflow kubeflow-dashboard-operator-0 1/1 Running 10 8d 10.1.225.97 user-x299-wu8 <none> <none>
kubeflow pipelines-visualization-54757695b5-slrhs 1/1 Running 10 8d 10.1.225.179 user-x299-wu8 <none> <none>
kubeflow argo-controller-operator-0 1/1 Running 11 8d 10.1.225.110 user-x299-wu8 <none> <none>
kubeflow seldon-core-operator-0 1/1 Running 10 8d 10.1.225.99 user-x299-wu8 <none> <none>
kubeflow pipelines-viewer-operator-0 1/1 Running 10 8d 10.1.225.68 user-x299-wu8 <none> <none>
kubeflow metadata-api-operator-0 1/1 Running 11 8d 10.1.225.102 user-x299-wu8 <none> <none>
kubeflow istio-pilot-operator-0 1/1 Running 11 8d 10.1.225.91 user-x299-wu8 <none> <none>
kubeflow oidc-gatekeeper-operator-0 1/1 Running 10 8d 10.1.225.83 user-x299-wu8 <none> <none>
kubeflow metadata-envoy-696c6bbdcf-klzr7 1/1 Running 10 8d 10.1.225.103 user-x299-wu8 <none> <none>
kubeflow minio-operator-0 1/1 Running 11 8d 10.1.225.183 user-x299-wu8 <none> <none>
kubeflow admission-webhook-7998c89c96-4wgb7 1/1 Running 10 8d 10.1.225.126 user-x299-wu8 <none> <none>
kubeflow katib-db-operator-0 1/1 Running 10 8d 10.1.225.162 user-x299-wu8 <none> <none>
kubeflow jupyter-controller-operator-0 1/1 Running 11 8d 10.1.225.130 user-x299-wu8 <none> <none>
kubeflow jupyter-web-operator-0 1/1 Running 11 8d 10.1.225.96 user-x299-wu8 <none> <none>
controller-uk8s controller-0 2/2 Running 22 8d 10.1.225.87 user-x299-wu8 <none> <none>
kube-system hostpath-provisioner-5c65fbdb4f-bd6sp 1/1 Running 12 11d 10.1.225.116 user-x299-wu8 <none> <none>
kubeflow argo-ui-operator-0 1/1 Running 12 8d 10.1.225.161 user-x299-wu8 <none> <none>
kubeflow pytorch-operator-8449b4ff65-rt82k 1/1 Running 11 8d 10.1.225.73 user-x299-wu8 <none> <none>
kubeflow jupyter-controller-7d4c954b9b-pv4km 1/1 Running 10 8d 10.1.225.109 user-x299-wu8 <none> <none>
kubeflow metadata-ui-87c584c46-fdbfd 2/2 Running 20 8d 10.1.225.79 user-x299-wu8 <none> <none>
kubeflow pipelines-api-c89969c98-wxwth 2/2 Running 42 8d 10.1.225.118 user-x299-wu8 <none> <none>
kubeflow istio-ingressgateway-operator-0 1/1 Running 12 8d 10.1.225.76 user-x299-wu8 <none> <none>
kubeflow pipelines-ui-operator-0 1/1 Running 10 8d 10.1.225.92 user-x299-wu8 <none> <none>
kubeflow katib-manager-5fd7bf6c56-47qvf 1/1 Running 12 8d 10.1.225.186 user-x299-wu8 <none> <none>
kubeflow katib-ui-7b9db945df-t8gdx 1/1 Running 10 8d 10.1.225.104 user-x299-wu8 <none> <none>
kubeflow istio-ingressgateway-7799df4bb4-26vc4 1/1 Running 10 8d 10.1.225.112 user-x299-wu8 <none> <none>
kubeflow kubeflow-dashboard-64dd996b5-2cdjb 2/2 Running 23 8d 10.1.225.72 user-x299-wu8 <none> <none>
kubeflow dex-auth-86796c8bf4-62vfq 2/2 Running 42 8d 10.1.225.77 user-x299-wu8 <none> <none>
kubeflow pipelines-persistence-operator-0 1/1 Running 10 8d 10.1.225.98 user-x299-wu8 <none> <none>
kube-system calico-kube-controllers-847c8c99d-bd5g4 1/1 Running 11 12d 10.1.225.80 user-x299-wu8 <none> <none>
kubeflow metadata-db-operator-0 1/1 Running 10 8d 10.1.225.84 user-x299-wu8 <none> <none>
controller-uk8s modeloperator-5d9757f556-vvdn5 1/1 Running 11 8d 10.1.225.65 user-x299-wu8 <none> <none>
kube-system metrics-server-8bbfb4bdb-ds9qh 1/1 Running 10 8d 10.1.225.75 user-x299-wu8 <none> <none>
kubeflow minio-0 1/1 Running 10 8d 10.1.225.185 user-x299-wu8 <none> <none>
kubeflow dex-auth-operator-0 1/1 Running 12 8d 10.1.225.100 user-x299-wu8 <none> <none>
kubeflow pipelines-visualization-operator-0 1/1 Running 10 8d 10.1.225.89 user-x299-wu8 <none> <none>
kubeflow pipelines-viewer-78f7dcb544-cf4bv 1/1 Running 10 8d 10.1.225.152 user-x299-wu8 <none> <none>
kubeflow jupyter-web-6987858ff5-7v9bf 1/1 Running 10 8d 10.1.225.67 user-x299-wu8 <none> <none>
kubeflow admission-webhook-operator-0 1/1 Running 10 8d 10.1.225.81 user-x299-wu8 <none> <none>
kubeflow katib-manager-operator-0 1/1 Running 10 8d 10.1.225.95 user-x299-wu8 <none> <none>
kubeflow kubeflow-profiles-694d9f9495-wql9b 2/2 Running 20 8d 10.1.225.137 user-x299-wu8 <none> <none>
kubeflow metadata-ui-operator-0 1/1 Running 11 8d 10.1.225.117 user-x299-wu8 <none> <none>
kubeflow kubeflow-profiles-operator-0 1/1 Running 10 8d 10.1.225.113 user-x299-wu8 <none> <none>
kubeflow katib-ui-operator-0 1/1 Running 10 8d 10.1.225.180 user-x299-wu8 <none> <none>
kubeflow seldon-core-656c7f7f4f-hkpwj 1/1 Running 12 8d 10.1.225.125 user-x299-wu8 <none> <none>
kubeflow metadata-grpc-65484b66bc-jnks6 1/1 Running 17 8d 10.1.225.74 user-x299-wu8 <none> <none>
kubeflow modeloperator-85dd445b8c-f2trq 1/1 Running 10 8d 10.1.225.182 user-x299-wu8 <none> <none>
kubeflow metadata-envoy-operator-0 1/1 Running 10 8d 10.1.225.174 user-x299-wu8 <none> <none>
kubeflow pipelines-persistence-5b7f97d785-v6w87 1/1 Running 14 8d 10.1.225.71 user-x299-wu8 <none> <none>
kubeflow pipelines-scheduledworkflow-5885f8b7b9-25mxd 1/1 Running 10 8d 10.1.225.105 user-x299-wu8 <none> <none>
kubeflow oidc-gatekeeper-6b8847b678-fzm2f 2/2 Running 20 8d 10.1.225.111 user-x299-wu8 <none> <none>
kubeflow tf-job-operator-operator-0 1/1 Running 10 8d 10.1.225.66 user-x299-wu8 <none> <none>
kubeflow katib-controller-78fbb9dc9d-4zpsg 1/1 Running 10 8d 10.1.225.122 user-x299-wu8 <none> <none>
kubeflow argo-controller-595dfb97b8-bh89q 1/1 Running 10 8d 10.1.225.119 user-x299-wu8 <none> <none>
kubeflow metadata-grpc-operator-0 1/1 Running 11 8d 10.1.225.127 user-x299-wu8 <none> <none>
kubeflow metacontroller-74b758df8-7n6hb 1/1 Running 10 8d 10.1.225.70 user-x299-wu8 <none> <none>
metallb-system controller-559b68bfd8-2z4nb 1/1 Running 11 11d 10.1.225.88 user-x299-wu8 <none> <none>
kubeflow pytorch-operator-operator-0 1/1 Running 11 8d 10.1.225.120 user-x299-wu8 <none> <none>
kubeflow pipelines-ui-5b85f985d-ngdzx 1/1 Running 10 8d 10.1.225.184 user-x299-wu8 <none> <none>
kube-system calico-node-89dbd 1/1 Running 16 12d 10.100.0.61 user-x299-wu8 <none> <none>
metallb-system speaker-gfnlx 1/1 Running 16 11d 10.100.0.61 user-x299-wu8 <none> <none>
kube-system dashboard-metrics-scraper-6c4568dc68-zbdhw 1/1 Running 12 8d 10.1.225.136 user-x299-wu8 <none> <none>
kubeflow metacontroller-operator-0 1/1 Running 11 8d 10.1.225.108 user-x299-wu8 <none> <none>
kubeflow tf-job-operator-79865dfbf4-wd5kn 1/1 Running 12 8d 10.1.225.121 user-x299-wu8 <none> <none>
kubeflow pipelines-db-operator-0 1/1 Running 10 8d 10.1.225.176 user-x299-wu8 <none> <none>
kubeflow pipelines-scheduledworkflow-operator-0 1/1 Running 11 8d 10.1.225.107 user-x299-wu8 <none> <none>
kube-system kubernetes-dashboard-7ffd448895-nhx8q 1/1 Running 12 8d 10.1.225.90 user-x299-wu8 <none> <none>
Many of your pods aren't running, most importantly coredns. We need to see what is wrong with coredns. What do you see when you do microk8s kubectl describe po coredns-86f78bb79c-pj9tk -n kube-system? Also the logs of this pod.
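For the logs, something along these lines should do (pod name taken from the listing above):

# describe shows events and container state; logs show why coredns is failing readiness
microk8s kubectl describe po coredns-86f78bb79c-pj9tk -n kube-system
microk8s kubectl logs coredns-86f78bb79c-pj9tk -n kube-system
# add --previous to see the last run's output if the container has restarted
microk8s kubectl logs coredns-86f78bb79c-pj9tk -n kube-system --previous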
What I see is
Name: coredns-86f78bb79c-pj9tk
Namespace: kube-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Node: user-x299-wu8/10.100.0.61
Start Time: Fri, 12 Mar 2021 15:19:29 +0900
Labels: k8s-app=kube-dns
pod-template-hash=86f78bb79c
Annotations: cni.projectcalico.org/podIP: 10.1.225.115/32
cni.projectcalico.org/podIPs: 10.1.225.115/32
scheduler.alpha.kubernetes.io/critical-pod:
Status: Running
IP: 10.1.225.115
IPs:
IP: 10.1.225.115
Controlled By: ReplicaSet/coredns-86f78bb79c
Containers:
coredns:
Container ID: containerd://914b58a114cd5a746dd04bc2e683b97f5397a7af121590bd968151afd87773a8
Image: coredns/coredns:1.6.6
Image ID: docker.io/coredns/coredns@sha256:41bee6992c2ed0f4628fcef75751048927bcd6b1cee89c79f6acb63ca5474d5a
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
State: Running
Started: Tue, 23 Mar 2021 23:27:59 +0900
Last State: Terminated
Reason: Unknown
Exit Code: 255
Started: Tue, 23 Mar 2021 14:32:29 +0900
Finished: Tue, 23 Mar 2021 23:05:54 +0900
Ready: False
Restart Count: 10
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/etc/coredns from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from coredns-token-2wd6j (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
coredns-token-2wd6j:
Type: Secret (a volume populated by a Secret)
SecretName: coredns-token-2wd6j
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: CriticalAddonsOnly op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events: <none>
This pod doesn't seem ready at all. Can you upload another inspect tarball?
Here it is. inspection-report-20210324_170107.tar.gz
I can't find the reason why coredns doesn't want to start. Have you restarted microk8s?
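If not, a restart sometimes clears transient state; a minimal sketch:

# restart MicroK8s and wait until the node reports ready again
microk8s stop
microk8s start
microk8s status --wait-ready
microk8s kubectl get nodes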
Update: the node is not in Ready state. Gonna check why.
I see these in the logs.
==== START logs for container calico-kube-controllers of pod kube-system/calico-kube-controllers-847c8c99d-bd5g4 ====
Request log error: the server could not find the requested resource (get pods calico-kube-controllers-847c8c99d-bd5g4)
==== END logs for container calico-kube-controllers of pod kube-system/calico-kube-controllers-847c8c99d-bd5g4 ====
==== START logs for container metrics-server of pod kube-system/metrics-server-8bbfb4bdb-ds9qh ====
Request log error: the server could not find the requested resource (get pods metrics-server-8bbfb4bdb-ds9qh)
==== END logs for container metrics-server of pod kube-system/metrics-server-8bbfb4bdb-ds9qh ====
==== START logs for container upgrade-ipam of pod kube-system/calico-node-89dbd ====
Request log error: the server could not find the requested resource (get pods calico-node-89dbd)
==== END logs for container upgrade-ipam of pod kube-system/calico-node-89dbd ====
==== START logs for container install-cni of pod kube-system/calico-node-89dbd ====
Request log error: the server could not find the requested resource (get pods calico-node-89dbd)
==== END logs for container install-cni of pod kube-system/calico-node-89dbd ====
I also noticed that your hostname has capital letters. Is it possible to change the hostname to use all lowercase letters?
You can follow the instructions from the MicroK8s common issues section. https://microk8s.io/docs/troubleshooting#heading--common-issues on how to remediate this. You may have to restart MicroK8s.
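For reference, a rough sketch of the two usual options (the args path is the one shown in your prompt above, and <lowercase-name> is a placeholder):

# option 1: rename the host itself to a lowercase name, then restart MicroK8s
sudo hostnamectl set-hostname <lowercase-name>
microk8s stop && microk8s start

# option 2: keep the hostname but override what the kubelet registers
echo '--hostname-override=<lowercase-name>' | sudo tee -a /var/snap/microk8s/current/args/kubelet
microk8s stop && microk8s start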
In the API server logs I also see this:
3월 24 16:09:26 user-X299-WU8 microk8s.daemon-apiserver[28351]: W0324 16:09:26.965240 28351 dispatcher.go:170] Failed calling webhook, failing open admission.juju.is: failed calling webhook "admission.juju.is": Post "https://modeloperator.kubeflow.svc:17071/k8s/admission/fd8e2ea1-14ce-4661-878b-2db12f81b868?timeout=30s": dial tcp 10.152.183.250:17071: connect: connection refused
This reminds me of issue: https://github.com/ubuntu/microk8s/issues/1520 but I am not sure about the connection.
@knkski any idea what that might be?
@ktsakalozos coredns isn't running, that could be the reason why.
The node is also in NotReady state. But calico is reported to be up.
@balchua Changed the machine name with the --hostname-override argument, and restarted microk8s.
Hi, is finance-ai-test network resolvable by the node?
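Something like this on the node itself would confirm it (hostname taken from your override above; these are just the standard checks):

# check DNS / hosts resolution of the new node name from the host
getent hosts finance-ai-test
ping -c 1 finance-ai-test
# if it does not resolve, an /etc/hosts entry pointing it at the node IP is a common workaround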
@kwonmha: This may be due to the issue that is fixed in #2086. The easiest way to fix this if that is the case is to run:
sudo snap remove microk8s --purge && sudo snap install microk8s --classic
Otherwise, if you don't want to reinstall MicroK8s, you can manually run the kubectl delete commands contained in that PR to clean up the webhook.
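In case it helps, a rough sketch of that manual cleanup, assuming the offending object is the admission.juju.is webhook seen in the API server log above; the exact resource names are in the PR:

# find the stale juju admission webhook configuration
microk8s kubectl get mutatingwebhookconfigurations
# delete it (name below is a placeholder; use the one returned above)
microk8s kubectl delete mutatingwebhookconfiguration <stale-juju-webhook>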
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I can't enable knative. It waits for istio to be ready but istio doesn't seem to be ready. I tried to disable and enable istio again but it gives another error.
This is the tarball. inspection-report-20210323_204727.tar.gz
I'm facing so many bugs on microk8s...