vmware-tanzu / tanzu-framework

Tanzu Framework provides a set of building blocks to build atop of the Tanzu platform and leverages Carvel packaging and plugins to provide users with a much stronger, more integrated experience than the loose coupling and stand-alone commands of the previous generation of tools.
Apache License 2.0

ClusterBootstrapController reconcile on error happens after 10 min #2233

Open vijaykatam opened 2 years ago

vijaykatam commented 2 years ago

Bug description

ClusterBootstrapController reconciles packages on the workload cluster. After an error that occurs while kapp-controller is still coming up, the next round of reconciliation does not happen until about 10 minutes later:

2:6443/api?timeout=10s\": dial tcp 192.168.120.2:6443: connect: connection refused" "cluster-name"="clusterclass-3" "cluster-ns"="gctest" 
E0425 21:11:11.668876       1 controller.go:317] controller/cluster "msg"="Reconciler error" "error"="error creating dynamic rest mapper for remote cluster \"gctest/clusterclass-3\": Get \"https://192.168.120.2:6443/api?timeout=10s\": dial tcp 192.168.120.2:6443: connect: connection refused" "name"="clusterclass-3" "namespace"="gctest" "reconciler group"="cluster.x-k8s.io" "reconciler kind"="Cluster" 
I0425 21:11:36.718779       1 clusterbootstrap_controller.go:171] ClusterBootstrapController "msg"="Reconciling cluster" "cluster-name"="clusterclass-3" "cluster-ns"="gctest" 
I0425 21:11:36.720005       1 clusterbootstrap_controller.go:1529] ClusterBootstrapController "msg"="setting proxy and network configurations in Cluster annotation" "cluster-name"="clusterclass-3" "cluster-ns"="gctest" "tkg.tanzu.vmware.com/skip-tls-verify"="" "tkg.tanzu.vmware.com/tkg-http-proxy"="" "tkg.tanzu.vmware.com/tkg-https-proxy"="" "tkg.tanzu.vmware.com/tkg-ip-family"="" "tkg.tanzu.vmware.com/tkg-no-proxy"="" "tkg.tanzu.vmware.com/tkg-proxy-ca-cert"=""
I0425 21:11:36.838596       1 clusterbootstrap_controller.go:586] ClusterBootstrapController "msg"="created or patched the PackageInstall gctest/clusterclass-3-kapp-controller for cluster clusterclass-3"  
I0425 21:11:36.912460       1 clusterbootstrap_controller.go:695] ClusterBootstrapController "msg"="created namespace"  "namespace"="vmware-system-tkg"
I0425 21:11:36.912717       1 clusterbootstrap_controller.go:737] ClusterBootstrapController "msg"="creating or patching ServiceAccount vmware-system-tkg/tanzu-cluster-bootstrap-sa on cluster gctest/clusterclass-3"  
I0425 21:11:36.962509       1 clusterbootstrap_controller.go:783] ClusterBootstrapController "msg"="created or patched ClusterRole /tanzu-cluster-bootstrap-clusterrole on cluster gctest/clusterclass-3"  
E0425 21:11:37.069146       1 clusterbootstrap_controller.go:618] ClusterBootstrapController "msg"="unable to create or patch Package resource vmware-system-tkg/antrea.tanzu.vmware.com.0.11.3+vmware.2-tkg.2 on cluster: gctest/clusterclass-3" "error"="no matches for kind \"Package\" in version \"data.packaging.carvel.dev/v1alpha1\""  
E0425 21:11:37.069327       1 clusterbootstrap_controller.go:239] ClusterBootstrapController "msg"="unable to create or patch all the required resources for antrea.tanzu.vmware.com.0.11.3+vmware.2-tkg.2 on cluster: gctest/clusterclass-3" "error"="no matches for kind \"Package\" in version \"data.packaging.carvel.dev/v1alpha1\"" "cluster-name"="clusterclass-3" "cluster-ns"="gctest" 
E0425 21:11:37.069428       1 controller.go:317] controller/cluster "msg"="Reconciler error" "error"="no matches for kind \"Package\" in version \"data.packaging.carvel.dev/v1alpha1\"" "name"="clusterclass-3" "namespace"="gctest" "reconciler group"="cluster.x-k8s.io" "reconciler kind"="Cluster" 
I0425 21:11:46.341270       1 clusterbootstrap_controller.go:171] ClusterBootstrapController "msg"="Reconciling cluster" "cluster-name"="clusterclass-3" "cluster-ns"="gctest" 
I0425 21:11:46.342631       1 clusterbootstrap_controller.go:1529] ClusterBootstrapController "msg"="setting proxy and network configurations in Cluster annotation" "cluster-name"="clusterclass-3" "cluster-ns"="gctest" "tkg.tanzu.vmware.com/skip-tls-verify"="" "tkg.tanzu.vmware.com/tkg-http-proxy"="" "tkg.tanzu.vmware.com/tkg-https-proxy"="" "tkg.tanzu.vmware.com/tkg-ip-family"="" "tkg.tanzu.vmware.com/tkg-no-proxy"="" "tkg.tanzu.vmware.com/tkg-proxy-ca-cert"=""
I0425 21:11:46.404268       1 clusterbootstrap_controller.go:586] ClusterBootstrapController "msg"="created or patched the PackageInstall gctest/clusterclass-3-kapp-controller for cluster clusterclass-3"  
I0425 21:11:46.409213       1 clusterbootstrap_controller.go:737] ClusterBootstrapController "msg"="creating or patching ServiceAccount vmware-system-tkg/tanzu-cluster-bootstrap-sa on cluster gctest/clusterclass-3"  
I0425 21:11:46.476926       1 clusterbootstrap_controller.go:783] ClusterBootstrapController "msg"="created or patched ClusterRole /tanzu-cluster-bootstrap-clusterrole on cluster gctest/clusterclass-3"  
E0425 21:11:46.669497       1 clusterbootstrap_controller.go:618] ClusterBootstrapController "msg"="unable to create or patch Package resource vmware-system-tkg/antrea.tanzu.vmware.com.0.11.3+vmware.2-tkg.2 on cluster: gctest/clusterclass-3" "error"="no matches for kind \"Package\" in version \"data.packaging.carvel.dev/v1alpha1\""  
E0425 21:11:46.669560       1 clusterbootstrap_controller.go:239] ClusterBootstrapController "msg"="unable to create or patch all the required resources for antrea.tanzu.vmware.com.0.11.3+vmware.2-tkg.2 on cluster: gctest/clusterclass-3" "error"="no matches for kind \"Package\" in version \"data.packaging.carvel.dev/v1alpha1\"" "cluster-name"="clusterclass-3" "cluster-ns"="gctest" 
E0425 21:11:46.669646       1 controller.go:317] controller/cluster "msg"="Reconciler error" "error"="no matches for kind \"Package\" in version \"data.packaging.carvel.dev/v1alpha1\"" "name"="clusterclass-3" "namespace"="gctest" "reconciler group"="cluster.x-k8s.io" "reconciler kind"="Cluster" 
I0425 21:11:50.390358       1 clusterbootstrap_controller.go:171] ClusterBootstrapController "msg"="Reconciling cluster" "cluster-name"="clusterclass-3" "cluster-ns"="gctest" 
I0425 21:11:50.390884       1 clusterbootstrap_controller.go:1529] ClusterBootstrapController "msg"="setting proxy and network configurations in Cluster annotation" "cluster-name"="clusterclass-3" "cluster-ns"="gctest" "tkg.tanzu.vmware.com/skip-tls-verify"="" "tkg.tanzu.vmware.com/tkg-http-proxy"="" "tkg.tanzu.vmware.com/tkg-https-proxy"="" "tkg.tanzu.vmware.com/tkg-ip-family"="" "tkg.tanzu.vmware.com/tkg-no-proxy"="" "tkg.tanzu.vmware.com/tkg-proxy-ca-cert"=""
I0425 21:11:50.703974       1 clusterbootstrap_controller.go:586] ClusterBootstrapController "msg"="created or patched the PackageInstall gctest/clusterclass-3-kapp-controller for cluster clusterclass-3"  
I0425 21:11:50.711438       1 clusterbootstrap_controller.go:737] ClusterBootstrapController "msg"="creating or patching ServiceAccount vmware-system-tkg/tanzu-cluster-bootstrap-sa on cluster gctest/clusterclass-3"  
I0425 21:11:50.758395       1 clusterbootstrap_controller.go:783] ClusterBootstrapController "msg"="created or patched ClusterRole /tanzu-cluster-bootstrap-clusterrole on cluster gctest/clusterclass-3"  
E0425 21:11:50.869506       1 clusterbootstrap_controller.go:618] ClusterBootstrapController "msg"="unable to create or patch Package resource vmware-system-tkg/antrea.tanzu.vmware.com.0.11.3+vmware.2-tkg.2 on cluster: gctest/clusterclass-3" "error"="no matches for kind \"Package\" in version \"data.packaging.carvel.dev/v1alpha1\""  
E0425 21:11:50.869644       1 clusterbootstrap_controller.go:239] ClusterBootstrapController "msg"="unable to create or patch all the required resources for antrea.tanzu.vmware.com.0.11.3+vmware.2-tkg.2 on cluster: gctest/clusterclass-3" "error"="no matches for kind \"Package\" in version \"data.packaging.carvel.dev/v1alpha1\"" "cluster-name"="clusterclass-3" "cluster-ns"="gctest" 
E0425 21:11:50.869815       1 controller.go:317] controller/cluster "msg"="Reconciler error" "error"="no matches for kind \"Package\" in version \"data.packaging.carvel.dev/v1alpha1\"" "name"="clusterclass-3" "namespace"="gctest" "reconciler group"="cluster.x-k8s.io" "reconciler kind"="Cluster" 
I0425 21:21:08.496641       1 clusterbootstrap_controller.go:171] ClusterBootstrapController "msg"="Reconciling cluster" "cluster-name"="clusterclass-3" "cluster-ns"="gctest" 
I0425 21:21:08.503637       1 clusterbootstrap_controller.go:1529] ClusterBootstrapController "msg"="setting proxy and network configurations in Cluster annotation" "cluster-name"="clusterclass-3" "cluster-ns"="gctest" "tkg.tanzu.vmware.com/skip-tls-verify"="" "tkg.tanzu.vmware.com/tkg-http-proxy"="" "tkg.tanzu.vmware.com/tkg-https-proxy"="" "tkg.tanzu.vmware.com/tkg-ip-family"="" "tkg.tanzu.vmware.com/tkg-no-proxy"="" "tkg.tanzu.vmware.com/tkg-proxy-ca-cert"=""
I0425 21:21:09.816867       1 request.go:665] Waited for 1.045263735s due to client-side throttling, not priority and fairness, request: GET:https://10.80.0.1:443/apis/imagecontroller.vmware.com/v1
I0425 21:21:11.824099       1 clusterbootstrap_controller.go:586] ClusterBootstrapController "msg"="created or patched the PackageInstall gctest/clusterclass-3-kapp-controller for cluster clusterclass-3"  
I0425 21:21:11.828240       1 clusterbootstrap_controller.go:737] ClusterBootstrapController "msg"="creating or patching ServiceAccount vmware-system-tkg/tanzu-cluster-bootstrap-sa on cluster gctest/clusterclass-3"  
I0425 21:21:11.838399       1 clusterbootstrap_controller.go:783] ClusterBootstrapController "msg"="created or patched ClusterRole /tanzu-cluster-bootstrap-clusterrole on cluster gctest/clusterclass-3"  
I0425 21:21:11.935092       1 clusterbootstrap_controller.go:827] ClusterBootstrapController "msg"="created the Package CR antrea.tanzu.vmware.com.0.11.3+vmware.2-tkg.2 on cluster gctest/clusterclass-3"  
I0425 21:21:11.993243       1 clusterbootstrap_controller.go:839] ClusterBootstrapController "msg"="created or patched secret vmware-system-tkg/clusterclass-3-antrea.tanzu.vmware.com-data-values for package antrea.tanzu.vmware.com.0.11.3+vmware.2-tkg.2 on cluster gctest/clusterclass-3"  
I0425 21:21:12.045413       1 clusterbootstrap_controller.go:847] ClusterBootstrapController "msg"="created or patched the PackageInstall CR vmware-system-tkg/clusterclass-3-antrea on cluster gctest/clusterclass-3"  
I0425 21:21:12.141291       1 clusterbootstrap_controller.go:827] ClusterBootstrapController "msg"="created the Package CR vsphere-cpi.community.tanzu.vmware.com.1.23.0 on cluster gctest/clusterclass-3"  
I0425 21:21:13.207239       1 clusterbootstrap_controller.go:885] ClusterBootstrapController "msg"="patched the secret gctest/clusterclass-3-vsphere-cpi-data-values with package and cluster labels"  
I0425 21:21:13.280198       1 clusterbootstrap_controller.go:839] ClusterBootstrapController "msg"="created or patched secret vmware-system-tkg/clusterclass-3-vsphere-cpi.community.tanzu.vmware.com-data-values for package vsphere-cpi.community.tanzu.vmware.com.1.23.0 on cluster gctest/clusterclass-3"  
I0425 21:21:13.303316       1 clusterbootstrap_controller.go:847] ClusterBootstrapController "msg"="created or patched the PackageInstall CR vmware-system-tkg/clusterclass-3-vsphere-cpi on cluster gctest/clusterclass-3"  
I0425 21:21:13.397069       1 clusterbootstrap_controller.go:827] ClusterBootstrapController "msg"="created the Package CR secretgen-controller.tanzu.vmware.com.0.7.1+vmware.1-tkg.1 on cluster gctest/clusterclass-3"  
I0425 21:21:13.397148       1 clusterbootstrap_controller.go:1402] ClusterBootstrapController "msg"="no data values are provided to the ClusterBootstrapPackage.ValuesFrom field. ClusterBootstrapPackage.RefName: secretgen-controller.tanzu.vmware.com.0.7.1+vmware.1-tkg.1"  
I0425 21:21:13.397161       1 clusterbootstrap_controller.go:863] ClusterBootstrapController "msg"="no data values secret is needed for ClusterBootstrapPackage: secretgen-controller.tanzu.vmware.com.0.7.1+vmware.1-tkg.1, nothing to be created or patched on cluster gctest/clusterclass-3"  
I0425 21:21:13.444570       1 clusterbootstrap_controller.go:847] ClusterBootstrapController "msg"="created or patched the PackageInstall CR vmware-system-tkg/clusterclass-3-secretgen-controller on cluster gctest/clusterclass-3"  
I0425 21:21:13.534552       1 clusterbootstrap_controller.go:827] ClusterBootstrapController "msg"="created the Package CR metrics-server.tanzu.vmware.com.0.5.1+vmware.1-tkg.1 on cluster gctest/clusterclass-3"  
I0425 21:21:13.534580       1 clusterbootstrap_controller.go:1402] ClusterBootstrapController "msg"="no data values are provided to the ClusterBootstrapPackage.ValuesFrom field. ClusterBootstrapPackage.RefName: metrics-server.tanzu.vmware.com.0.5.1+vmware.1-tkg.1"  
I0425 21:21:13.534604       1 clusterbootstrap_controller.go:863] ClusterBootstrapController "msg"="no data values secret is needed for ClusterBootstrapPackage: metrics-server.tanzu.vmware.com.0.5.1+vmware.1-tkg.1, nothing to be created or patched on cluster gctest/clusterclass-3"  
I0425 21:21:13.583313       1 clusterbootstrap_controller.go:847] ClusterBootstrapController "msg"="created or patched the PackageInstall CR vmware-system-tkg/clusterclass-3-metrics-server on cluster gctest/clusterclass-3"  
I0425 21:21:13.583680       1 clusterbootstrap_controller.go:171] ClusterBootstrapController "msg"="Reconciling cluster" "cluster-name"="clusterclass-3" "cluster-ns"="gctest" 
I0425 21:21:13.585279       1 clusterbootstrap_controller.go:1529] ClusterBootstrapController "msg"="setting proxy and network configurations in Cluster annotation" "cluster-name"="clusterclass-3" "cluster-ns"="gctest" "tkg.tanzu.vmware.com/skip-tls-verify"="" "tkg.tanzu.vmware.com/tkg-http-proxy"="" "tkg.tanzu.vmware.com/tkg-https-proxy"="" "tkg.tanzu.vmware.com/tkg-ip-family"="" "tkg.tanzu.vmware.com/tkg-no-proxy"="" "tkg.tanzu.vmware.com/tkg-proxy-ca-cert"=""
I0425 21:21:13.795638       1 clusterbootstrap_controller.go:586] ClusterBootstrapController "msg"="created or patched the PackageInstall gctest/clusterclass-3-kapp-controller for cluster clusterclass-3"  
I0425 21:21:13.872801       1 clusterbootstrap_controller.go:737] ClusterBootstrapController "msg"="creating or patching ServiceAccount vmware-system-tkg/tanzu-cluster-bootstrap-sa on cluster gctest/clusterclass-3"  
I0425 21:21:13.882052       1 clusterbootstrap_controller.go:783] ClusterBootstrapController "msg"="created or patched ClusterRole /tanzu-cluster-bootstrap-clusterrole on cluster gctest/clusterclass-3"  
I0425 21:21:14.048499       1 clusterbootstrap_controller.go:827] ClusterBootstrapController "msg"="created the Package CR antrea.tanzu.vmware.com.0.11.3+vmware.2-tkg.2 on cluster gctest/clusterclass-3"  
I0425 21:21:14.113817       1 clusterbootstrap_controller.go:839] ClusterBootstrapController "msg"="created or patched secret vmware-system-tkg/clusterclass-3-antrea.tanzu.vmware.com-data-values for package antrea.tanzu.vmware.com.0.11.3+vmware.2-tkg.2 on cluster gctest/clusterclass-3"  
I0425 21:21:14.243194       1 clusterbootstrap_controller.go:847] ClusterBootstrapController "msg"="created or patched the PackageInstall CR vmware-system-tkg/clusterclass-3-antrea on cluster gctest/clusterclass-3"  
I0425 21:21:14.339028       1 clusterbootstrap_controller.go:827] ClusterBootstrapController "msg"="created the Package CR vsphere-cpi.community.tanzu.vmware.com.1.23.0 on cluster gctest/clusterclass-3"  
I0425 21:21:14.375288       1 clusterbootstrap_controller.go:839] ClusterBootstrapController "msg"="created or patched secret vmware-system-tkg/clusterclass-3-vsphere-cpi.community.tanzu.vmware.com-data-values for package vsphere-cpi.community.tanzu.vmware.com.1.23.0 on cluster gctest/clusterclass-3"  
I0425 21:21:14.443612       1 clusterbootstrap_controller.go:847] ClusterBootstrapController "msg"="created or patched the PackageInstall CR vmware-system-tkg/clusterclass-3-vsphere-cpi on cluster gctest/clusterclass-3"  
I0425 21:21:14.540548       1 clusterbootstrap_controller.go:827] ClusterBootstrapController "msg"="created the Package CR secretgen-controller.tanzu.vmware.com.0.7.1+vmware.1-tkg.1 on cluster gctest/clusterclass-3"  
I0425 21:21:14.540648       1 clusterbootstrap_controller.go:1402] ClusterBootstrapController "msg"="no data values are provided to the ClusterBootstrapPackage.ValuesFrom field. ClusterBootstrapPackage.RefName: secretgen-controller.tanzu.vmware.com.0.7.1+vmware.1-tkg.1"  
I0425 21:21:14.540667       1 clusterbootstrap_controller.go:863] ClusterBootstrapController "msg"="no data values secret is needed for ClusterBootstrapPackage: secretgen-controller.tanzu.vmware.com.0.7.1+vmware.1-tkg.1, nothing to be created or patched on cluster gctest/clusterclass-3"  
I0425 21:21:14.645687       1 clusterbootstrap_controller.go:847] ClusterBootstrapController "msg"="created or patched the PackageInstall CR vmware-system-tkg/clusterclass-3-secretgen-controller on cluster gctest/clusterclass-3"  
I0425 21:21:14.690268       1 clusterbootstrap_controller.go:827] ClusterBootstrapController "msg"="created the Package CR metrics-server.tanzu.vmware.com.0.5.1+vmware.1-tkg.1 on cluster gctest/clusterclass-3"  
I0425 21:21:14.690378       1 clusterbootstrap_controller.go:1402] ClusterBootstrapController "msg"="no data values are provided to the ClusterBootstrapPackage.ValuesFrom field. ClusterBootstrapPackage.RefName: metrics-server.tanzu.vmware.com.0.5.1+vmware.1-tkg.1"  
I0425 21:21:14.690398       1 clusterbootstrap_controller.go:863] ClusterBootstrapController "msg"="no data values secret is needed for ClusterBootstrapPackage: metrics-server.tanzu.vmware.com.0.5.1+vmware.1-tkg.1, nothing to be created or patched on cluster gctest/clusterclass-3"  
I0425 21:21:14.744469       1 clusterbootstrap_controller.go:847] ClusterBootstrapController "msg"="created or patched the PackageInstall CR vmware-system-tkg/clusterclass-3-metrics-server on cluster gctest/clusterclass-3"  
I0425 21:21:26.653741       1 antreaconfig_controller.go:50] AntreaConfigController "msg"="Start reconciliation"  
I0425 21:21:26.655155       1 antreaconfig_controller.go:212] AntreaConfigController "msg"="Resource antrea data values secret unchanged" "antreaconfig"={"Namespace":"gctest","Name":"clusterclass-3-antrea.tanzu.vmware.com-package"} 
I0425 21:21:26.655191       1 antreaconfig_controller.go:134] AntreaConfigController "msg"="Successfully reconciled AntreaConfig" "antreaconfig"={"Namespace":"gctest","Name":"clusterclass-3-antrea.tanzu.vmware.com-package"} 
I0425 21:21:26.655233       1 antreaconfig_controller.go:116] AntreaConfigController "msg"="Patching AntreaConfig" "antreaconfig"={"Namespace":"gctest","Name":"clusterclass-3-antrea.tanzu.vmware.com-package"} 
I0425 21:21:26.656396       1 antreaconfig_controller.go:121] AntreaConfigController "msg"="Successfully patched AntreaConfig" "antreaconfig"={"Namespace":"gctest","Name":"clusterclass-3-antrea.tanzu.vmware.com-package"} 
I0425 21:21:26.656547       1 antreaconfig_controller.go:50] AntreaConfigController "msg"="Start reconciliation"  

Expected behavior

The controller should retry the reconcile after a few seconds instead of waiting 10 minutes.

adduarte commented 2 years ago

Looking at the code, the error is returned here, which in turn returns the error (and reschedules) here.

The requeue timeout is set here, where the reconciler returns ctrl.Result{RequeueAfter: constants.RequeueAfterDuration}.

constants.RequeueAfterDuration is set to 10 seconds.

This would suggest that something else is holding up the reconciler.

adduarte commented 2 years ago

Needs to be reproduced.