openshift / oadp-operator

delete the existing backuprepositories prior to a new test #1521

Open weshayutin opened 2 weeks ago

weshayutin commented 2 weeks ago

Why the changes were made

Delete the backuprepository for the cirros-test namespace if one is found. The e2e tests use a new BSL for every test run, but the backuprepository can be stale and prevent backups from passing, so we need to delete the backuprepository as well.
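A minimal sketch of the cleanup described above, not the code from this PR. It uses an in-memory stand-in for the cluster, so `FakeCluster`, its methods, and the repository names are hypothetical. The `volume_namespace` field is modeled on what I understand to be the BackupRepository's `spec.volumeNamespace`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupRepository:
    """Minimal stand-in for a Velero BackupRepository object."""
    name: str
    namespace: str         # namespace the object lives in (the OADP namespace)
    volume_namespace: str  # namespace whose volumes the repository backs up


class FakeCluster:
    """Hypothetical in-memory store standing in for the Kubernetes API."""

    def __init__(self, repos):
        self._repos = {(r.namespace, r.name): r for r in repos}

    def list_backup_repositories(self, namespace):
        return [r for r in self._repos.values() if r.namespace == namespace]

    def delete_backup_repository(self, namespace, name):
        # Deleting a missing object is treated as a no-op ("if found").
        self._repos.pop((namespace, name), None)


def delete_repos_for_volume_namespace(cluster, oadp_namespace, volume_namespace):
    """Delete every repository that targets volume_namespace; return deleted names."""
    deleted = []
    for repo in cluster.list_backup_repositories(oadp_namespace):
        if repo.volume_namespace == volume_namespace:
            cluster.delete_backup_repository(repo.namespace, repo.name)
            deleted.append(repo.name)
    return deleted


# Illustrative object names only; real repository names will differ.
cluster = FakeCluster([
    BackupRepository("cirros-test-bsl-a-kopia", "openshift-adp", "cirros-test"),
    BackupRepository("mysql-persistent-bsl-a-kopia", "openshift-adp", "mysql-persistent"),
])
deleted = delete_repos_for_volume_namespace(cluster, "openshift-adp", "cirros-test")
remaining = [r.name for r in cluster.list_backup_repositories("openshift-adp")]
print(deleted, remaining)
```

Running the function a second time returns an empty list, which is the "if found" behavior: a test run that finds no stale repository simply proceeds.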

How to test the changes made

make test-e2e with virt settings

openshift-ci[bot] commented 2 weeks ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: weshayutin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[tests/e2e/OWNERS](https://github.com/openshift/oadp-operator/blob/master/tests/e2e/OWNERS)~~ [weshayutin]

Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment

mateusoliveira43 commented 1 week ago

Is this a VIRT only problem?

weshayutin commented 1 week ago

> Is this a VIRT only problem?

It's exposed with Virt simply because we have so many tests using the same namespace. It should also be a problem with DM tests that use the namespace. There is nothing bad about removing the backuprepo object in between tests.

weshayutin commented 1 week ago

/retest

weshayutin commented 1 week ago

    Timed out waiting for the catalog source oo-www55 to become ready after 10 minutes.
    [2024-09-18T23:16:44.309Z] Catalogsource state at timeout is "TRANSIENT_FAILURE"
    [2024-09-18T23:16:44.311Z] Catalogsource image used is "registry.build05.ci.openshift.org/ci-op-qq00cxzj/pipeline@sha256:0e8b8e0a00673610f6f256513df87cdafbd2a7d973ed5c5f1e6ea2c318964667"
    [2024-09-18T23:16:44.314Z] All retry attempts failed
    [2024-09-18T23:16:44.316Z] Script Completed Execution With Failures !
    {"component":"entrypoint","error":"wrapped process failed: exit status 1","file":"sigs.k8s.io/prow/pkg/entrypoint/run.go:84","func":"sigs.k8s.io/prow/pkg/entrypoint.Options.internalRun","level":"error","msg":"Error executing test process","severity":"error","time":"2024-09-18T23:16:44Z"}
    error: failed to execute wrapped command: exit status 1
    ---
    Link to step on registry info site: https://steps.ci.openshift.org/reference/optional-operators-subscribe
    Link to job on registry info site: https://steps.ci.openshift.org/job?org=openshift&repo=oadp-operator&branch=master&test=e2e-test-aws&variant=4.15
    {"level":"info","msg":"Reporting job state 'failed' with reason 'executing_graph:step_failed:utilizing_lease:executing_test:utilizing_ip_pool:executing_test:executing_multi_stage_test'","time":"2024-09-18T23:26:48Z"}

https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_oadp-operator/1521/pull-ci-openshift-oadp-operator-master-4.15-e2e-test-aws/1836520913516892160/artifacts/ci-operator.log

https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_oadp-operator/1521/pull-ci-openshift-oadp-operator-master-4.15-e2e-test-kubevirt-aws/1836520913571418112/artifacts/ci-operator.log

https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_oadp-operator/1521/pull-ci-openshift-oadp-operator-master-4.16-e2e-test-aws/1836520913604972544/artifacts/ci-operator.log

https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_oadp-operator/1521/pull-ci-openshift-oadp-operator-master-4.16-e2e-test-kubevirt-aws/1836520913646915584/artifacts/ci-operator.log

kaovilai commented 1 week ago

/retest

kaovilai commented 1 week ago

would simplify to just delete every backupRepository in OADP namespace

wouldn't this be too much cleanup and not representative of a real world usage?

weshayutin commented 1 week ago

> would simplify to just delete every backupRepository in OADP namespace
>
> wouldn't this be too much cleanup and not representative of a real world usage?

We can and should have multiple b/r on the same namespace, and we can write those tests. None of the current tests are designed for that atm imho. Each test is meant to be an initial backup. I am rewriting this to kill all the backuprepositories found.
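A rough sketch of the "kill all the backuprepositories found" idea, assuming the cleanup runs before every test so that each test behaves like an initial backup. It is not the PR's implementation. The dict-based state, the helper names, and the repository and test names are all hypothetical.

```python
def delete_all_backup_repositories(repos_by_namespace, oadp_namespace):
    """Drop every repository recorded for oadp_namespace; return the removed names."""
    removed = list(repos_by_namespace.get(oadp_namespace, []))
    repos_by_namespace[oadp_namespace] = []
    return removed


def run_tests_with_cleanup(repos_by_namespace, oadp_namespace, tests):
    """Wipe repositories before each test and record what each test found on entry."""
    seen_on_entry = {}
    for test_name, test in tests:
        delete_all_backup_repositories(repos_by_namespace, oadp_namespace)
        seen_on_entry[test_name] = list(repos_by_namespace[oadp_namespace])
        test(repos_by_namespace[oadp_namespace])
    return seen_on_entry


def fake_backup_test(repo_name):
    """Build a test that, like a first backup, creates its own repository."""
    def test(repos):
        repos.append(repo_name)
    return test


# Hypothetical starting state: one stale repository left by an earlier run.
state = {"openshift-adp": ["stale-repo-from-previous-run"]}
tests = [
    ("virt-backup", fake_backup_test("cirros-test-repo")),
    ("datamover-backup", fake_backup_test("mysql-persistent-repo")),
]
seen = run_tests_with_cleanup(state, "openshift-adp", tests)
print(seen)
print(state)
```

Every test starts with an empty repository list, so no test can be affected by a stale object. The trade-off raised in the thread still applies: this arrangement never exercises repository reuse across backups, which would need its own dedicated tests.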

kaovilai commented 1 week ago

We actually run multiple backups on the same BSL, so backup repositories are currently reused. I am ok with this change; we can explicitly test reuse in future tests.

weshayutin commented 5 days ago

hrm.. I wonder if deleting the backuprepository is causing:

  Backup Item Operations:
    Operation for persistentvolumeclaims mysql-persistent/mysql:
      Backup Item Action Plugin:  velero.io/csi-pvc-backupper
      Operation ID:               du-f9a5ee7a-3f6b-4243-b05f-2e3d7a4a2115.9f553335-75da-416211efa
      Items to Update:
                             datauploads.velero.io openshift-adp/mysql-datamover-e2e-7f25cc20-7787-11ef-b207-0a580a81e42a-qhrnf
      Phase:                 Failed
      Operation Error:       data path backup failed: Failed to run data path service for DataUpload mysql-datamover-e2e-7f25cc20-7787-11ef-b207-0a580a81e42a-qhrnf: Data path for data upload mysql-datamover-e2e-7f25cc20-7787-11ef-b207-0a580a81e42a-qhrnf failed: Failed to run kopia backup: Failed to upload the kopia snapshot for si default@default:snapshot-data-upload-download/kopia/mysql-persistent/mysql: permission denied
      Progress description:  Failed

weshayutin commented 5 days ago

/retest

openshift-ci[bot] commented 5 days ago

@weshayutin: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/4.15-e2e-test-aws | a762df0d76734a49d3f1d6e58ab2721e6d04c0d7 | link | true | `/test 4.15-e2e-test-aws` |
| ci/prow/4.16-e2e-test-aws | a762df0d76734a49d3f1d6e58ab2721e6d04c0d7 | link | true | `/test 4.16-e2e-test-aws` |
| ci/prow/4.14-e2e-test-aws | a762df0d76734a49d3f1d6e58ab2721e6d04c0d7 | link | true | `/test 4.14-e2e-test-aws` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).