Open mquhuy opened 5 months ago
/retitle E2E Fixture flaky with multiple parallel tests
BMO normal ("ironic") tests do not suffer from this issue.
Are you sure about this? I hope it is just an issue with the test-mode but I would not rule out some other concurrency issue in BMO. It may just be less frequent or harder to spot in other tests. Every now and then we have unexplained timeouts while deprovisioning also in CAPM3 e2e tests so who knows :shrug:
At least I have not seen this happen in the ironic tests, but I guess I should change the wording. Thank you for the notice xD
/triage accepted
This is not visible on the CI because currently we are setting `GINKGO_NODES` to 1.
What steps did you take and what happened: In recent weeks we've seen the GitHub-based fixture tests failing randomly. This seems to have started after we introduced parallel tests in BMO E2E, and it could be reproduced by running the tests with a higher `GINKGO_NODES` value. (For `GINKGO_NODES=2`, tests do not always fail.) We have disabled the parallel test for fixture in https://github.com/metal3-io/baremetal-operator/pull/1543, but we believe this points to an issue with the BMO test-mode (which is used in the fixture test).
In BMO normal ("ironic") tests, a similar failure was observed when the number of tests running in parallel was too high for the local machine (e.g. 3 threads with 3 VMs of 2 vCPUs each, on an environment with 8 CPUs).
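Since the over-subscription example above pins the failure to CPU budget, one way to pick a workable parallelism level is to derive `GINKGO_NODES` from the host's CPU count. A minimal sketch, using the numbers from this thread (8 host CPUs, 2 vCPUs per test VM); the `RESERVED` headroom for BMO, ironic, and the host OS is an illustrative assumption, not something taken from the BMO Makefile:

```shell
# Hedged sketch: estimate a safe GINKGO_NODES value from the host CPU budget.
# HOST_CPUS and VCPUS_PER_VM mirror the example in this thread; RESERVED is
# an assumed headroom for BMO, ironic, and the host itself.
HOST_CPUS=8        # on a real machine: HOST_CPUS=$(nproc)
VCPUS_PER_VM=2
RESERVED=4
SAFE_NODES=$(( (HOST_CPUS - RESERVED) / VCPUS_PER_VM ))
echo "GINKGO_NODES=${SAFE_NODES}"   # prints GINKGO_NODES=2
```

With these numbers the sketch lands on 2 parallel nodes, which matches the observation that 3 threads on an 8-CPU machine already triggered failures.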
What did you expect to happen: The fixture tests should pass as long as the number of parallel threads stays within what the machine running the tests can handle.
Environment:
/kind bug