Open akutz opened 1 year ago
Hi @yi0909 and @dilyar85,
Maybe we should file one or two more issues to at least try addressing the two flakes we are already aware of? It's been four runs, and this job keeps hitting these flakes:
Attempt #1 failed with:
E1204 20:57:56.551859 10512 contentsource_controller.go:394] controllers/ContentSource "msg"="error in reconciling the provider ref" "error"="ContentLibraryProvider.vmoperator.vmware.com \"dummy-cl\" not found" "name"="dummy-cs"
E1204 20:57:56.552069 10512 controller.go:317] controller/contentsource "msg"="Reconciler error" "error"="ContentLibraryProvider.vmoperator.vmware.com \"dummy-cl\" not found" "name"="dummy-cs" "namespace"="" "reconciler group"="vmoperator.vmware.com" "reconciler kind"="ContentSource"
I1204 20:57:56.552223 10512 contentsource_controller.go:444] controllers/ContentSource "msg"="Received reconcile request" "name"="dummy-cs-new"
I1204 20:57:56.552328 10512 contentsource_controller.go:418] controllers/ContentSource "msg"="Reconciling ContentSource deletion" "name"="dummy-cs-new"
I1204 20:57:56.560937 10512 contentsource_controller.go:415] controllers/ContentSource "msg"="Finished Reconciling ContentSource Deletion" "name"="dummy-cs-new"
I1204 20:57:56.561124 10512 contentsource_controller.go:444] controllers/ContentSource "msg"="Received reconcile request" "name"="dummy-cs"
I1204 20:57:56.561238 10512 contentsource_controller.go:418] controllers/ContentSource "msg"="Reconciling ContentSource deletion" "name"="dummy-cs"
I1204 20:57:56.567905 10512 contentsource_controller.go:415] controllers/ContentSource "msg"="Finished Reconciling ContentSource Deletion" "name"="dummy-cs"
I1204 20:57:56.569568 10512 contentsource_controller.go:444] controllers/ContentSource "msg"="Received reconcile request" "name"="dummy-cs-new"
I1204 20:57:56.569769 10512 contentsource_controller.go:444] controllers/ContentSource "msg"="Received reconcile request" "name"="dummy-cs"
STEP: Creating a temporary namespace
STEP: Destroying temporary namespace
------------------------------
• Failure [10.302 seconds]
Integration tests
/home/runner/work/vm-operator/vm-operator/test/builder/test_suite.go:251
Reconcile ContentSource
/home/runner/work/vm-operator/vm-operator/controllers/contentlibrary/contentsource/contentsource_controller_intg_test.go:165
when ContentSource and ContentLibraryProvider exists
/home/runner/work/vm-operator/vm-operator/controllers/contentlibrary/contentsource/contentsource_controller_intg_test.go:166
when a new ContentSource with duplicate vm images is created
/home/runner/work/vm-operator/vm-operator/controllers/contentlibrary/contentsource/contentsource_controller_intg_test.go:279
should reconcile and generate a new VirtualMachineImage object [It]
/home/runner/work/vm-operator/vm-operator/controllers/contentlibrary/contentsource/contentsource_controller_intg_test.go:319
Timed out after 10.001s.
Expected
<int>: 3
to equal
<int>: 2
/home/runner/work/vm-operator/vm-operator/controllers/contentlibrary/contentsource/contentsource_controller_intg_test.go:326
------------------------------
I1204 20:57:56.676854 10512 logr.go:249] "msg"="Stopping and waiting for non leader election runnables"
I1204 20:57:56.676931 10512 logr.go:249] "msg"="Stopping and waiting for leader election runnables"
I1204 20:57:56.677043 10512 controller.go:240] controller/contentsource "msg"="Shutdown signal received, waiting for all workers to finish" "reconciler group"="vmoperator.vmware.com" "reconciler kind"="ContentSource"
I1204 20:57:56.677134 10512 controller.go:242] controller/contentsource "msg"="All workers finished" "reconciler group"="vmoperator.vmware.com" "reconciler kind"="ContentSource"
I1204 20:57:56.677180 10512 logr.go:249] "msg"="Stopping and waiting for caches"
I1204 20:57:56.677823 10512 logr.go:249] "msg"="Stopping and waiting for webhooks"
I1204 20:57:56.677917 10512 logr.go:249] "msg"="Wait completed, proceeding to shutdown the manager"
Summarizing 1 Failure:
[Fail] Integration tests Reconcile ContentSource when ContentSource and ContentLibraryProvider exists when a new ContentSource with duplicate vm images is created [It] should reconcile and generate a new VirtualMachineImage object
/home/runner/work/vm-operator/vm-operator/controllers/contentlibrary/contentsource/contentsource_controller_intg_test.go:326
Ran 3 of 3 Specs in 21.390 seconds
FAIL! -- 2 Passed | 1 Failed | 0 Pending | 0 Skipped
--- FAIL: TestContentSource (21.41s)
FAIL
coverage: 3.3% of statements in ./controllers/..., ./pkg/..., ./webhooks/...
FAIL github.com/vmware-tanzu/vm-operator/controllers/contentlibrary/contentsource 21.653s
Attempt #2 failed with:
------------------------------
• Failure [10.330 seconds]
Integration tests
/home/runner/work/vm-operator/vm-operator/test/builder/test_suite.go:251
Reconcile ContentSource
/home/runner/work/vm-operator/vm-operator/controllers/contentlibrary/contentsource/contentsource_controller_intg_test.go:165
when ContentSource and ContentLibraryProvider exists
/home/runner/work/vm-operator/vm-operator/controllers/contentlibrary/contentsource/contentsource_controller_intg_test.go:166
when a new ContentSource with duplicate vm images is created
/home/runner/work/vm-operator/vm-operator/controllers/contentlibrary/contentsource/contentsource_controller_intg_test.go:279
should reconcile and generate a new VirtualMachineImage object [It]
/home/runner/work/vm-operator/vm-operator/controllers/contentlibrary/contentsource/contentsource_controller_intg_test.go:319
Timed out after 10.000s.
Expected
<int>: 3
to equal
<int>: 2
/home/runner/work/vm-operator/vm-operator/controllers/contentlibrary/contentsource/contentsource_controller_intg_test.go:326
------------------------------
I1204 21:28:11.588278 10617 logr.go:249] "msg"="Stopping and waiting for non leader election runnables"
I1204 21:28:11.588433 10617 logr.go:249] "msg"="Stopping and waiting for leader election runnables"
I1204 21:28:11.588729 10617 controller.go:240] controller/contentsource "msg"="Shutdown signal received, waiting for all workers to finish" "reconciler group"="vmoperator.vmware.com" "reconciler kind"="ContentSource"
I1204 21:28:11.588917 10617 controller.go:242] controller/contentsource "msg"="All workers finished" "reconciler group"="vmoperator.vmware.com" "reconciler kind"="ContentSource"
I1204 21:28:11.589065 10617 logr.go:249] "msg"="Stopping and waiting for caches"
I1204 21:28:11.590610 10617 logr.go:249] "msg"="Stopping and waiting for webhooks"
I1204 21:28:11.591016 10617 logr.go:249] "msg"="Wait completed, proceeding to shutdown the manager"
Summarizing 1 Failure:
[Fail] Integration tests Reconcile ContentSource when ContentSource and ContentLibraryProvider exists when a new ContentSource with duplicate vm images is created [It] should reconcile and generate a new VirtualMachineImage object
/home/runner/work/vm-operator/vm-operator/controllers/contentlibrary/contentsource/contentsource_controller_intg_test.go:326
Ran 3 of 3 Specs in 21.310 seconds
FAIL! -- 2 Passed | 1 Failed | 0 Pending | 0 Skipped
--- FAIL: TestContentSource (21.34s)
FAIL
coverage: 3.3% of statements in ./controllers/..., ./pkg/..., ./webhooks/...
FAIL github.com/vmware-tanzu/vm-operator/controllers/contentlibrary/contentsource 21.568s
Attempt #3 failed with:
•I1204 21:45:02.627894 14199 response.go:42] vmoperator-controller-manager/default-validate-vmoperator-vmware-com-v1alpha1-virtualmachinepublishrequest/8466f652-e290-4531-ba96-7585709a161b/dummy-vmpub "msg"="validation denied" "code"=422 "reason"="spec.target: Invalid value: v1alpha1.VirtualMachinePublishRequestTarget{Item:v1alpha1.VirtualMachinePublishRequestTargetItem{Name:\"dummy-item\", Description:\"\"}, Location:v1alpha1.VirtualMachinePublishRequestTargetLocation{Name:\"alternate-cl\", APIVersion:\"imageregistry.vmware.com/v1alpha1\", Kind:\"ContentLibrary\"}}: field is immutable"
•I1204 21:45:02.646661 14199 response.go:42] vmoperator-controller-manager/default-validate-vmoperator-vmware-com-v1alpha1-virtualmachinepublishrequest/86b9dd67-a2e8-461b-8459-4fb6f464ff5c/dummy-vmpub "msg"="validation denied" "code"=422 "reason"="spec.source.name: Not found: \"dummy-vm\""
STEP: Creating a temporary namespace
------------------------------
• Failure in Spec Setup (BeforeEach) [0.020 seconds]
Integration tests
/home/runner/work/vm-operator/vm-operator/test/builder/test_suite.go:251
Invoking Delete
/home/runner/work/vm-operator/vm-operator/webhooks/virtualmachinepublishrequest/validation/virtualmachinepublishrequest_validator_intg_test.go:21
when delete is performed [BeforeEach]
/home/runner/work/vm-operator/vm-operator/webhooks/virtualmachinepublishrequest/validation/virtualmachinepublishrequest_validator_intg_test.go:174
should allow the request
/home/runner/work/vm-operator/vm-operator/webhooks/virtualmachinepublishrequest/validation/virtualmachinepublishrequest_validator_intg_test.go:175
Unexpected error:
<*errors.StatusError | 0xc000b53680>: {
ErrStatus: {
TypeMeta: {Kind: "", APIVersion: ""},
ListMeta: {
SelfLink: "",
ResourceVersion: "",
Continue: "",
RemainingItemCount: nil,
},
Status: "Failure",
Message: "admission webhook \"default.validating.virtualmachinepublishrequest.vmoperator.vmware.com\" denied the request: spec.source.name: Not found: \"dummy-vm\"",
Reason: "spec.source.name: Not found: \"dummy-vm\"",
Details: nil,
Code: 422,
},
}
admission webhook "default.validating.virtualmachinepublishrequest.vmoperator.vmware.com" denied the request: spec.source.name: Not found: "dummy-vm"
occurred
/home/runner/work/vm-operator/vm-operator/webhooks/virtualmachinepublishrequest/validation/virtualmachinepublishrequest_validator_intg_test.go:162
------------------------------
I1204 21:45:02.651259 14199 logr.go:249] "msg"="Stopping and waiting for non leader election runnables"
I1204 21:45:02.651369 14199 logr.go:249] "msg"="Stopping and waiting for leader election runnables"
I1204 21:45:02.651467 14199 logr.go:249] "msg"="Stopping and waiting for caches"
I1204 21:45:02.651751 14199 logr.go:249] "msg"="Stopping and waiting for webhooks"
I1204 21:45:02.652300 14199 logr.go:249] controller-runtime/webhook "msg"="shutting down webhook server"
I1204 21:45:02.654116 14199 logr.go:249] "msg"="Wait completed, proceeding to shutdown the manager"
Summarizing 1 Failure:
[Fail] Integration tests Invoking Delete [BeforeEach] when delete is performed should allow the request
/home/runner/work/vm-operator/vm-operator/webhooks/virtualmachinepublishrequest/validation/virtualmachinepublishrequest_validator_intg_test.go:162
Ran 5 of 5 Specs in 13.425 seconds
FAIL! -- 4 Passed | 1 Failed | 0 Pending | 0 Skipped
I am currently on the fourth attempt; fingers crossed!
I've noticed a number of new flaky tests in the content source and vmpubreq controller integration tests, and they seem to stem from poorly implemented Ginkgo tests. For example, from the first run of the integration test job for PR #26:
Re-running just the IT job usually clears up flakes like the ones above. I believe these are occurring because our tests are racy, and once people start creating PRs in this project, we will see these errors more frequently. When that happens:
This is usually enough to fix things. However, I want to set a goal that we run Ginkgo with the -p flag, which runs a suite's specs in parallel. This would very quickly expose all of the issues we have related to the way we've constructed our tests.
This issue tracks the need to enable parallelism for our test suites.
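To make the "Expected 3 to equal 2" style of failure concrete, here is a minimal, self-contained sketch of one common way racy specs produce it: two specs write to the same shared store, and an assertion counts everything in that store rather than only what its own spec created. This is an assumption about the failure mode, not a confirmed diagnosis of the flakes above. The `store` type, the `owner` field, and the names `spec-a` and `spec-b` are all hypothetical and do not come from the vm-operator code.

```go
package main

import (
	"fmt"
	"sync"
)

// image is a hypothetical stand-in for an object created by a spec.
type image struct {
	owner string // hypothetical label identifying the spec that created it
	name  string
}

// store is a hypothetical stand-in for state shared by every spec,
// such as a single test API server.
type store struct {
	mu     sync.Mutex
	images []image
}

// create records an image on behalf of the given spec.
func (s *store) create(owner, name string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.images = append(s.images, image{owner: owner, name: name})
}

// countAll counts every image, including ones created by other specs.
func (s *store) countAll() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.images)
}

// countOwnedBy counts only the images created by the given spec.
func (s *store) countOwnedBy(owner string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	n := 0
	for _, img := range s.images {
		if img.owner == owner {
			n++
		}
	}
	return n
}

func main() {
	s := &store{}
	var wg sync.WaitGroup

	// Two "specs" run concurrently against the same store.
	wg.Add(2)
	go func() { // spec-a expects to see exactly two images
		defer wg.Done()
		s.create("spec-a", "img-1")
		s.create("spec-a", "img-2")
	}()
	go func() { // spec-b leaves one image behind
		defer wg.Done()
		s.create("spec-b", "img-1")
	}()
	wg.Wait()

	// An unscoped assertion in spec-a sees spec-b's leftover object.
	fmt.Println("unscoped count:", s.countAll())
	// Scoping the query to spec-a's own objects is stable.
	fmt.Println("scoped count:", s.countOwnedBy("spec-a"))
}
```

If the real tests follow a similar pattern, scoping each assertion to objects the spec itself created (or giving each spec uniquely named resources) would be one way to make them safe to run in parallel.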