open-cluster-management-io / multicloud-operators-subscription

Enables multicluster application delivery.
https://open-cluster-management.io/getting-started/integration/app-lifecycle/
Apache License 2.0

How to debug Application lifecycle management (maybe worth a troubleshooting doc) #130

Open jichenjc opened 2 years ago

jichenjc commented 2 years ago

I followed https://open-cluster-management.io/getting-started/integration/app-lifecycle/

and am now at the last step:

kubectl get subscriptions.apps --context ${CTX_MANAGED_CLUSTER}

but nothing shows up:

# kubectl get subscriptions.apps --context  kind-cluster1
No resources found in default namespace.

I checked the pod status and it seems fine. The logs show something suspicious, but I'm not sure it's the root cause:

# kubectl get pods -A --context kind-hub
NAMESPACE                     NAME                                                       READY   STATUS    RESTARTS   AGE
kube-system                   coredns-74ff55c5b-85lp5                                    1/1     Running   0          29m
kube-system                   coredns-74ff55c5b-wtzc4                                    1/1     Running   0          29m
kube-system                   etcd-hub-control-plane                                     1/1     Running   0          29m
kube-system                   kindnet-lj9tm                                              1/1     Running   0          29m
kube-system                   kube-apiserver-hub-control-plane                           1/1     Running   0          29m
kube-system                   kube-controller-manager-hub-control-plane                  1/1     Running   0          29m
kube-system                   kube-proxy-7c2xx                                           1/1     Running   0          29m
kube-system                   kube-scheduler-hub-control-plane                           1/1     Running   0          29m
local-path-storage            local-path-provisioner-78776bfc44-qx7d8                    1/1     Running   0          29m
open-cluster-management-hub   cluster-manager-placement-controller-c4bc6cbd8-dh7md       1/1     Running   2          28m
open-cluster-management-hub   cluster-manager-registration-controller-76c4ffd996-fmj5c   1/1     Running   0          28m
open-cluster-management-hub   cluster-manager-registration-webhook-dcc694f68-tx2fb       1/1     Running   0          28m
open-cluster-management-hub   cluster-manager-work-webhook-65f7565896-gt6kl              1/1     Running   0          28m
open-cluster-management       cluster-manager-5d85887bb5-g9h4x                           1/1     Running   0          28m
open-cluster-management       multicluster-operators-appsub-summary-7589c9f5b4-gtvtx     1/1     Running   0          19m
open-cluster-management       multicluster-operators-channel-89ddbdbdf-z29gg             1/1     Running   0          19m
open-cluster-management       multicluster-operators-placementrule-7d4f6ddfcb-ss988      1/1     Running   0          19m
open-cluster-management       multicluster-operators-subscription-87f4fc96b-8ftlk        1/1     Running   1          17m

# kubectl logs multicluster-operators-channel-89ddbdbdf-z29gg -n open-cluster-management --context kind-hub

{"level":"info","ts":"2022-03-21T09:13:21.086Z","logger":"controllers.channel","caller":"channel/channel_controller.go:188","msg":"Starting channel reconcile loop for dev/dev-helmrepo","channel-reconcile":"dev/dev-helmrepo"}
{"level":"info","ts":"2022-03-21T09:13:21.088Z","logger":"controllers.channel","caller":"channel/channel_controller.go:563","msg":"No MultiClusterHub Resource found","channel-reconcile":"dev/dev-helmrepo"}
{"level":"info","ts":"2022-03-21T09:13:21.297Z","logger":"controllers.channel","caller":"channel/channel_controller.go:527","msg":"The channel dev/dev-helmrepo is not in the ACM Namespace , skipping...","channel-reconcile":"dev/dev-helmrepo"}
{"level":"info","ts":"2022-03-21T09:13:21.297Z","logger":"controllers.channel","caller":"channel/channel_controller.go:244","msg":"Finish channel reconcile loop for dev/dev-helmrepo","channel-reconcile":"dev/dev-helmrepo"}
mikeshng commented 2 years ago

We have some docs here:

https://github.com/open-cluster-management-io/multicloud-operators-subscription/blob/main/docs/troubleshooting_guidence.md

I suggest first checking the log of the multicluster-operators-subscription- pod on the hub side, then checking the application-manager pod on the managed cluster side.

I will also try to reproduce this issue.
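
The two checks suggested above could be sketched in shell roughly as follows. This is only a sketch under assumptions: the deployment name multicluster-operators-subscription is inferred from the pod name in the listing earlier in this thread, the context names are supplied by the caller, and all helper function names are hypothetical. The application-manager pod is located by name across all namespaces rather than assuming which namespace it runs in.

```shell
# Hypothetical helpers; nothing runs until a function is called.

# Print logs of the hub-side subscription controller.
# $1: kubectl context of the hub cluster (e.g. kind-hub).
# Assumption: the deployment is named after the pod prefix seen in this thread.
hub_subscription_logs() {
  kubectl logs deployment/multicluster-operators-subscription \
    -n open-cluster-management --context "$1"
}

# List application-manager pods on a managed cluster, in any namespace.
# $1: kubectl context of the managed cluster (e.g. kind-cluster1).
find_application_manager() {
  kubectl get pods -A --context "$1" | grep application-manager
}

# Read JSON log lines on stdin and drop the ones at "info" level,
# so that warnings and errors stand out.
non_info_lines() {
  grep -v '"level":"info"' || true
}

# Example usage (not executed here):
#   hub_subscription_logs kind-hub | non_info_lines
#   find_application_manager kind-cluster1
```

The filter assumes the controller emits the same JSON log format as the channel controller output quoted earlier; if it logs in another format, read the unfiltered output instead.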

mikeshng commented 2 years ago
$ kubectl  get appsub -o yaml
...
  status:
    lastUpdateTime: "2022-03-21T14:18:23Z"
    phase: PropagationFailed
    reason: no matches for kind "PlacementDecision" in version "cluster.open-cluster-management.io/v1beta1"

The PlacementDecision API is provided by the clusteradm tool while installing the foundational components of OCM. It seems like clusteradm hasn't been updated to install the new version of the API. The workaround for now is:

kubectl apply -f hack/test/placementdecisions.crd.yaml
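
To confirm whether the workaround above took effect, one option is to check that the CRD now exists and then re-read the appsub status. The snippet below is a hedged sketch: the function name appsub_status_summary is hypothetical, and the CRD name placementdecisions.cluster.open-cluster-management.io is assumed from the kind and API group in the error message above.

```shell
# Read `kubectl get appsub -o yaml` output on stdin and print only the
# phase and reason lines, with leading indentation removed.
appsub_status_summary() {
  awk '$1 == "phase:" || $1 == "reason:" { sub(/^[ \t]+/, ""); print }'
}

# Example usage (not executed here):
#   kubectl get crd placementdecisions.cluster.open-cluster-management.io
#   kubectl get appsub -o yaml | appsub_status_summary
```

If the first command reports the CRD as not found, the PropagationFailed reason shown above would be expected to persist.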
jichenjc commented 2 years ago

OK, this is very weird: I didn't do anything, but the subscriptions.apps resource now shows up.

However, it has been around 20 hours since I posted the question, yet the age of the subscriptions.apps resource is only 3h, which is really weird:

# kubectl get subscriptions.apps --context  kind-cluster1
NAME        STATUS       AGE     LOCAL PLACEMENT   TIME WINDOW
nginx-sub   Subscribed   3h10m   true

Update:

the pods are also 3h+ old:

# kubectl get pod --context kind-cluster1
NAME                                                   READY   STATUS    RESTARTS   AGE
nginx-ingress-2e6e1-controller-7ff9cdcd47-c6znp        1/1     Running   0          3h33m
nginx-ingress-2e6e1-default-backend-6686b997db-4zr5r   1/1     Running   0          3h33m
jichenjc commented 2 years ago

I will try this again and see whether the same pattern applies.