openstack-k8s-operators / openstack-operator

Meta Operator for OpenStack
https://openstack-k8s-operators.github.io/openstack-operator/
Apache License 2.0

Add labels and defaults to DataPlaneService from EnsureServices #958

Closed slagle closed 2 months ago

slagle commented 2 months ago

Fixes a reconcile loop by not modifying existing DataPlaneServices in the `CreateOrPatch` call from EnsureService. If the webhook modifies the service instead, that seems to trigger another NodeSet reconcile, which creates a loop.

Jira: OSPRH-8811

Signed-off-by: James Slagle <jslagle@redhat.com>
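One way to read the description above is "set labels and defaults when the service is first created, and leave an existing service untouched". The sketch below is a toy, standard-library-only model of that idea, not the operator's code: `Service`, `Store`, the label key, and this `EnsureService` signature are all hypothetical, and the real change goes through controller-runtime's `CreateOrPatch` against the API server.

```go
package main

import "fmt"

// Service is a toy stand-in for a DataPlaneService (hypothetical type, not the operator's).
type Service struct {
	Labels map[string]string
	Writes int // how many times this object was written to the store
}

// Store is a hypothetical in-memory substitute for the API server.
type Store map[string]*Service

// EnsureService creates the named service with the given labels when it is
// missing and leaves an existing service untouched. It returns "created" or
// "unchanged".
func (s Store) EnsureService(name string, labels map[string]string) string {
	if _, found := s[name]; found {
		return "unchanged" // existing object: nothing is written
	}
	copied := make(map[string]string, len(labels))
	for k, v := range labels {
		copied[k] = v
	}
	s[name] = &Service{Labels: copied, Writes: 1}
	return "created"
}

func main() {
	store := Store{}
	labels := map[string]string{"example.org/owner": "nodeset-a"} // made-up label
	for i := 0; i < 3; i++ {
		fmt.Println(store.EnsureService("configure-network", labels))
	}
	fmt.Println("writes:", store["configure-network"].Writes)
}
```

Calling `EnsureService` repeatedly writes the object only once, so in this model a repeated reconcile produces no further updates to an existing service.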

softwarefactory-project-zuul[bot] commented 2 months ago

Build failed (check pipeline). Post `recheck` (without a leading slash) to rerun all jobs. Make sure the cause of the failure has been resolved before you rerun jobs.

https://review.rdoproject.org/zuul/buildset/6cef151bb28740809bdb956b2c0f7cf8

- :heavy_check_mark: openstack-k8s-operators-content-provider SUCCESS in 42m 58s
- :x: podified-multinode-edpm-deployment-crc RETRY_LIMIT in 18m 18s
- :x: cifmw-crc-podified-edpm-baremetal FAILURE in 26m 29s
- :x: openstack-operator-tempest-multinode RETRY_LIMIT in 21m 48s

bshephar commented 2 months ago

/test openstack-operator-build-deploy-kuttl

rabi commented 2 months ago

> that seems to trigger another NodeSet reconcile which creates a loop.

Hmm, I'm not sure why a service update/patch would trigger a NodeSet reconciliation, since the NodeSet controller neither owns nor watches services. Something strange is definitely going on if deleting the MutatingWebhookConfiguration stops the reconcile loop, because the webhook configuration is created again after deletion.
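The reasoning here is that a controller is only re-queued by events on kinds it has registered as sources (its primary resource, resources it owns, or resources it watches). A minimal sketch of that rule, as a toy model rather than controller-runtime's API: the `Controller` type, `ShouldReconcile`, and the source list are hypothetical, and the list simply encodes the statement that the NodeSet controller does not own or watch services.

```go
package main

import "fmt"

// Controller is a toy model of a controller's event sources (hypothetical type).
type Controller struct {
	// Sources holds the kinds whose events enqueue a reconcile:
	// the primary resource plus anything owned or watched.
	Sources map[string]bool
}

// ShouldReconcile reports whether an event on the given kind would enqueue a reconcile.
func (c Controller) ShouldReconcile(kind string) bool {
	return c.Sources[kind]
}

func main() {
	// Assumption taken from the comment above: services are not among the sources.
	nodeset := Controller{Sources: map[string]bool{"OpenStackDataPlaneNodeSet": true}}
	for _, kind := range []string{"OpenStackDataPlaneNodeSet", "OpenStackDataPlaneService"} {
		fmt.Printf("%s event -> reconcile: %v\n", kind, nodeset.ShouldReconcile(kind))
	}
}
```

Under this model a service update alone would not re-queue the NodeSet, which is why the observed loop is surprising.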

gibizer commented 2 months ago

recheck

slagle commented 2 months ago

> > that seems to trigger another NodeSet reconcile which creates a loop.
>
> Hmm, I'm not sure why a service update/patch would trigger a NodeSet reconciliation, since the NodeSet controller neither owns nor watches services. Something strange is definitely going on if deleting the MutatingWebhookConfiguration stops the reconcile loop, because the webhook configuration is created again after deletion.

I deleted the webhook from the CSV, so it does not come back, unless you delete/recreate the Subscription.

Either way, I agree this is unexplained behavior. I didn't think a webhook should trigger another reconcile loop, but it seems to, based on the observations.

rabi commented 2 months ago

> I deleted the webhook from the CSV, so it does not come back, unless you delete/recreate the Subscription.

If you delete the webhook with `oc delete`, it comes back, yet that also seems to stop the reconcile loop even though the webhook reappears. Also, if you patch a service manually (rather than from the NodeSet controller, as we do), the mutating webhook is called and the fields are defaulted, but that does not trigger a NodeSet reconcile, and hence there is no loop.

openshift-ci[bot] commented 2 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bshephar, jpodivin, slagle

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:
- ~~[OWNERS](https://github.com/openstack-k8s-operators/openstack-operator/blob/main/OWNERS)~~ [bshephar,jpodivin,slagle]

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.

slagle commented 2 months ago

/cherrypick 18.0.0-proposed

openshift-cherrypick-robot commented 2 months ago

@slagle: new pull request created: #964

In response to [this](https://github.com/openstack-k8s-operators/openstack-operator/pull/958#issuecomment-2243613157):

>/cherrypick 18.0.0-proposed

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

rabi commented 2 months ago

> I agree this is unexplained behavior. I didn't think a webhook should trigger another reconcile loop, but it seems to, based on the observations.

https://github.com/openstack-k8s-operators/openstack-operator/pull/968