fao89 closed this 2 months ago
Build failed (check pipeline). Post `recheck` (without leading slash) to rerun all jobs. Make sure the failure cause has been resolved before you rerun jobs.
https://review.rdoproject.org/zuul/buildset/9ecb43a0662c4e08ad923064118b6e09
- :heavy_check_mark: openstack-k8s-operators-content-provider SUCCESS in 1h 29m 17s
- :x: podified-multinode-edpm-deployment-crc RETRY_LIMIT in 20m 52s
- :heavy_check_mark: cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 13m 57s
- :x: openstack-operator-tempest-multinode RETRY_LIMIT in 23m 24s
/test openstack-operator-build-deploy-kuttl
recheck
I'm fine with it; tested against HCI. I'll do some extra runs to be sure it fixes the issue, as it seems unpredictable, even though it showed up on each run I did.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: fao89, jpodivin
The full list of commands accepted by this bot can be found here.
The pull request process is described here
LGTM, but where is the blocker jira for this? Are we not still in blocker-only mode?
> LGTM, but where is the blocker jira for this? Are we not still in blocker-only mode?
it was uncovered by OSPCIX-352 (tempest starts to run before the nodeset is ready)
> LGTM, but where is the blocker jira for this? Are we not still in blocker-only mode?

> it was uncovered by OSPCIX-352 (tempest starts to run before the nodeset is ready)
I had looked at that. If one creates a new deployment after updating/patching the nodeset and then checks the nodeset status, they won't hit the issue. The problem in that job's workflow is that we check the status before creating a new deployment.
rabi closed this now
Sorry, I clicked the wrong button.
actually, they check the status after the deployment: https://github.com/openstack-k8s-operators/architecture/blob/main/automation/vars/default.yaml#L73
```
2024-07-02 14:06:31,591 p=24790 u=zuul n=ansible | TASK [kustomize_deploy : Apply generated content for examples/va/hci/deployment _raw_params=oc apply -f {{ _cr }}] ***
2024-07-02 14:06:31,591 p=24790 u=zuul n=ansible | Tuesday 02 July 2024 14:06:31 -0400 (0:00:00.073) 0:42:23.561 **********
2024-07-02 14:06:32,072 p=24790 u=zuul n=ansible | changed: [localhost]
2024-07-02 14:06:32,093 p=24790 u=zuul n=ansible | TASK [kustomize_deploy : Run Wait Conditions for examples/va/hci/deployment _raw_params={{ wait_condition }}] ***
2024-07-02 14:06:32,094 p=24790 u=zuul n=ansible | Tuesday 02 July 2024 14:06:32 -0400 (0:00:00.502) 0:42:24.064 **********
2024-07-02 14:06:32,750 p=24790 u=zuul n=ansible | changed: [localhost] => (item=oc -n openstack wait osdpns openstack-edpm --for condition=Ready --timeout=40m)
2024-07-02 14:06:32,768 p=24790 u=zuul n=ansible | TASK [kustomize_deploy : Stop after applying CRs if requested msg=Failing on demand {{ cifmw_deploy_architecture_stopper }}] ***
2024-07-02 14:06:32,768 p=24790 u=zuul n=ansible | Tuesday 02 July 2024 14:06:32 -0400 (0:00:00.674) 0:42:24.738 **********
```
> actually, they check the status after the deployment: https://github.com/openstack-k8s-operators/architecture/blob/main/automation/vars/default.yaml#L73
Then it could be that they are checking too quickly, before the nodeset has reconciled (after the event from the deployment), or maybe the query[1] we're doing does not show the deployment because we're using an empty context (context.Background()).
/cherry-pick 18.0.0-proposed
@fao89: new pull request created: #916
OSPRH-8397