kubernetes-sigs / cluster-api-provider-nested

Cluster API Provider for Nested Clusters
Apache License 2.0

🐛 Fix syncer panic when the vc lister returns an error #324

Closed wondywang closed 1 year ago

wondywang commented 1 year ago

As a developer, I hit a syncer panic today. The output looks something like this:

E1013 03:34:17.540825       1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 566 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x197d200, 0x2dc1670})
        /workspace/source/src/sigs.k8s.io/cluster-api-provider-nested/virtualcluster/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0x85
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc01eff6590})
        /workspace/source/src/sigs.k8s.io/cluster-api-provider-nested/virtualcluster/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x75
panic({0x197d200, 0x2dc1670})
        /usr/local/go/src/runtime/panic.go:1038 +0x215
sigs.k8s.io/cluster-api-provider-nested/virtualcluster/pkg/util/cluster.(*Cluster).Stop(0x1960880)
        /workspace/source/src/sigs.k8s.io/cluster-api-provider-nested/virtualcluster/pkg/util/cluster/cluster.go:362 +0x19
sigs.k8s.io/cluster-api-provider-nested/virtualcluster/pkg/syncer.(*Syncer).removeCluster(0xc00070db00, {0xc01e6d2be8, 0x13})
        /workspace/source/src/sigs.k8s.io/cluster-api-provider-nested/virtualcluster/pkg/syncer/syncer.go:337 +0x1d4
sigs.k8s.io/cluster-api-provider-nested/virtualcluster/pkg/syncer.(*Syncer).syncVirtualCluster(0xc00070db00, {0xc01e6d2be8, 0x13})
        /workspace/source/src/sigs.k8s.io/cluster-api-provider-nested/virtualcluster/pkg/syncer/syncer.go:308 +0x225
sigs.k8s.io/cluster-api-provider-nested/virtualcluster/pkg/syncer.(*Syncer).processNextWorkItem(0xc00070db00)
        /workspace/source/src/sigs.k8s.io/cluster-api-provider-nested/virtualcluster/pkg/syncer/syncer.go:284 +0xf6
sigs.k8s.io/cluster-api-provider-nested/virtualcluster/pkg/syncer.(*Syncer).run(...)
        /workspace/source/src/sigs.k8s.io/cluster-api-provider-nested/virtualcluster/pkg/syncer/syncer.go:273

https://github.com/kubernetes-sigs/cluster-api-provider-nested/blob/main/virtualcluster/pkg/syncer/syncer.go#L305

func (s *Syncer) syncVirtualCluster(key string) error {
        // snip
        vc, err := s.lister.VirtualClusters(namespace).Get(name)
        if err != nil {
                if !apierrors.IsNotFound(err) {
                        return err
                }

                // Panics here: removeCluster calls vc.Stop() while the vc is not ready yet.
                s.removeCluster(key)
                return nil
        }

        switch vc.Status.Phase {
        case v1alpha1.ClusterRunning:
                return s.addCluster(key, vc)
        case v1alpha1.ClusterError:
                s.removeCluster(key)
                return nil
        default:
                klog.Infof("Cluster %s/%s not ready to reconcile", vc.Namespace, vc.Name)
                return nil
        }
}
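
To illustrate the failure mode, here is a minimal sketch of one way to make the removal path tolerate a cluster that was never started. It is not the patch in this PR: the `cluster` and `registry` types, the `stopCh` field, and the `start`, `stop`, and `remove` helpers are all hypothetical stand-ins, and the real `Cluster.Stop` and `removeCluster` may look quite different.

```go
package main

import (
	"fmt"
	"sync"
)

// cluster is a hypothetical stand-in for the syncer's per-tenant cluster object.
type cluster struct {
	stopCh chan struct{} // nil until start() has been called
	once   sync.Once
}

// start allocates the stop channel; before this call the cluster is "not ready".
func (c *cluster) start() { c.stopCh = make(chan struct{}) }

// stop closes the stop channel at most once. It tolerates a nil receiver and a
// cluster that never started, and reports whether this call closed the channel.
func (c *cluster) stop() bool {
	if c == nil || c.stopCh == nil {
		return false
	}
	closed := false
	c.once.Do(func() {
		close(c.stopCh)
		closed = true
	})
	return closed
}

// registry is a hypothetical stand-in for the syncer's map of known clusters.
type registry struct {
	mu       sync.Mutex
	clusters map[string]*cluster
}

// remove forgets the cluster stored under key and stops it if it was running.
func (r *registry) remove(key string) bool {
	r.mu.Lock()
	c, ok := r.clusters[key]
	delete(r.clusters, key)
	r.mu.Unlock()
	if !ok {
		return false
	}
	return c.stop()
}

func main() {
	running := &cluster{}
	running.start()
	r := &registry{clusters: map[string]*cluster{
		"ns/running": running,
		"ns/pending": &cluster{}, // registered but never started
		"ns/nil":     nil,        // entry with no cluster object at all
	}}
	for _, key := range []string{"ns/running", "ns/pending", "ns/nil", "ns/missing"} {
		fmt.Printf("%s stopped=%v\n", key, r.remove(key))
	}
}
```

The design choice in this sketch is to put the nil checks inside `stop` itself, so every caller on the removal path gets the same protection without repeating the guard.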

What type of PR is this? /kind bug

k8s-ci-robot commented 1 year ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: wondywang

Once this PR has been reviewed and has the lgtm label, please assign christopherhein for approval by writing /assign @christopherhein in a comment. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

- **[virtualcluster/OWNERS](https://github.com/kubernetes-sigs/cluster-api-provider-nested/blob/main/virtualcluster/OWNERS)**

Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment

k8s-ci-robot commented 1 year ago

Hi @wondywang. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.