Closed hamzy closed 3 weeks ago
I think we currently check the public load balancer and set the hostname accordingly. Maybe we need to check the status of all configured load balancers before setting the infrastructure as ready. https://github.com/kubernetes-sigs/cluster-api-provider-ibmcloud/blob/main/controllers/ibmpowervscluster_controller.go#L334-L346
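To make the suggestion concrete, here is a minimal Go sketch of such a readiness gate. It is not the provider's actual code: the `lbStatus` type and the `allLoadBalancersActive` function are hypothetical names invented for illustration, and the only state values assumed are the `active` and `create_pending` strings that appear in the status output in this thread.

```go
package main

import "fmt"

// lbStatus is a hypothetical, simplified stand-in for one entry under
// status.loadbalancers in the IBMPowerVSCluster resource.
type lbStatus struct {
	ID       string
	State    string
	Hostname string
}

// allLoadBalancersActive reports whether every configured load balancer has
// a status entry in the "active" state. It also returns the names that are
// missing from the status or not yet active, so the caller can log them.
func allLoadBalancersActive(configured []string, status map[string]lbStatus) (bool, []string) {
	var pending []string
	for _, name := range configured {
		s, ok := status[name]
		if !ok || s.State != "active" {
			pending = append(pending, name)
		}
	}
	return len(pending) == 0, pending
}

func main() {
	// Two load balancers are configured, but only one has a status entry,
	// and that one is still being created.
	configured := []string{"lb-public", "lb-internal"}
	status := map[string]lbStatus{
		"lb-public": {ID: "example-id", State: "create_pending", Hostname: "example.lb.appdomain.cloud"},
	}

	ready, pending := allLoadBalancersActive(configured, status)
	fmt.Println("infra ready:", ready)
	fmt.Println("still pending:", pending)
}
```

The gate iterates over the configured names rather than over the status map, so a load balancer that is configured but has no status entry yet still blocks readiness.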
@Karthik-K-N can someone fix this behaviour asap?
sure.
@dharaneeshvrd Looking at the code, we already check the status of all the load balancers in ReconcileLoadbalancer: https://github.com/Karthik-K-N/cluster-api-provider-ibmcloud/blob/8a563a0620f34ecb772a73bccb9ed7f1384c822f/cloud/scope/powervs_cluster.go#L1977-L1979
When do you think this issue would occur?
We investigated the code, and it is working as expected.
@hamzy are you running with the latest code from this repo? If you still face issues with the latest code, please share the controller logs so we can debug further. Thanks
/triage needs-information
We are running version sigs.k8s.io/cluster-api-provider-ibmcloud v0.9.0-alpha.0.0.20240913094112-c6bcd313bce0
https://github.com/openshift/installer/blob/master/cluster-api/providers/ibmcloud/go.mod#L7
Version sigs.k8s.io/cluster-api-provider-ibmcloud v0.9.0-beta.0.0.20241017140904-8a563a0620f3 in https://github.com/openshift/installer/pull/9118 also fails.
Could you please share the error and the code reference?
This is a failing run of #9118 https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_installer/9118/pull-ci-openshift-installer-master-e2e-powervs-capi-ovn/1849180644286926848/artifacts/e2e-powervs-capi-ovn/ipi-install-powervs-install/artifacts/clusterapi_output/
Search for "InfraReady: hostname =" to see there is only one LB active at the time although the other one does become active eventually.
@Karthik-K-N @hamzy let us set up a call on Monday and sort out this issue.
> InfraReady: hostname
Thanks for the reference. I will check and post an update here. Looking at the IBMPowerVSCluster resource from here, something seems wrong: the LB is in create_pending, but the load balancer is set as ready in the conditions.
FYI I pass in two LBs here: https://github.com/openshift/installer/blob/master/pkg/asset/manifests/powervs/cluster.go#L135-L162
So why under status, does it only have one?
loadbalancers:
  p-mad02-2-capi-master-qwb48-loadbalancer:
    id: r050-70cc8d92-60fa-4ed1-9c09-11f0dd4d3d6a
    state: create_pending
    hostname: 70cc8d92-eu-es.lb.appdomain.cloud
    controllercreated: true
> @Karthik-K-N @hamzy let us set up a call on Monday and sort out this issue.
Sure, we could, but the discussion on this PR seems to be sufficient?
I wrote a test PR, https://github.com/openshift/installer/pull/9145, which shows
loadbalancers:
  p-mad02-1-capi-master-pw5lj-loadbalancer:
    id: r050-50fb5ff1-a7e0-432d-bc1c-e4848ecd461b
    state: active
    hostname: 50fb5ff1-eu-es.lb.appdomain.cloud
    controllercreated: true
  p-mad02-1-capi-master-pw5lj-loadbalancer-int:
    id: r050-92f04537-ce19-4bce-9e8c-ab84c4632c4b
    state: active
    hostname: 92f04537-eu-es.lb.appdomain.cloud
    controllercreated: true
and
- type: LoadBalancerReady
  status: "True"
  severity: ""
  lasttransitiontime: "2024-10-28T17:48:55Z"
  reason: ""
  message: ""
Awesome, thank you for verifying this. I think we should be good to merge this.
/kind bug
/area provider/ibmcloud
What steps did you take and what happened: Deploy a 4.18 cluster on a PowerVS zone where LoadBalancers are slow to create. We are called with InfraReady. We then create DNS records for the LBs. However, only the public LB exists, so the cluster fails to deploy.
What did you expect to happen: We should wait for all specified LBs to become ready.