Hyperfoil / Horreum

Benchmark results repository service
https://horreum.hyperfoil.io/
Apache License 2.0

Broker test failures #1608

Closed johnaohara closed 3 months ago

johnaohara commented 4 months ago

Describe the bug

In some environments the test suite fails with the following errors:

[ERROR] Failures: 
[ERROR]   AlertingServiceTest.testLabelsChange:710 expected: not <null>
[ERROR]   DatasetServiceTest.testDatasetLabelChanged:188->withExampleSchemas:365->lambda$testDatasetLabelChanged$47:194->BaseServiceTest.withExampleDataset:597->lambda$testDatasetLabelChanged$46:205->waitForUpdate:396 expected: not <null>
[ERROR]   DatasetServiceTest.testSchemaAfterData:257 expected: not <null>
[ERROR]   SchemaServiceTest.testValidateRun:87 expected: not <null>

This is caused by messages that should be processed by the AMQ broker not being emitted to the broker, so the subsequent async processing is not happening during the test(s).

This does not happen in all environments: CI appears to be unaffected, and so does a Mac with M2 hardware. However, F39 on x86_64 appears to be affected.
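To illustrate why the failures surface as `expected: not <null>`, here is a minimal sketch. It assumes the test helper waits for an asynchronous update with a timeout; `AwaitUpdateSketch` and `awaitUpdate` are hypothetical names, and the real `waitForUpdate` helper in Horreum may be implemented differently. If the message never reaches the broker, nothing is ever delivered and the wait simply returns `null`.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not Horreum's actual test code.
class AwaitUpdateSketch {

    // Waits up to timeoutMillis for an event; returns null if none arrives.
    static <T> T awaitUpdate(BlockingQueue<T> events, long timeoutMillis)
            throws InterruptedException {
        return events.poll(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> events = new LinkedBlockingQueue<>();

        // Nothing was emitted, so the async consumer never publishes an update.
        String missing = awaitUpdate(events, 100);
        System.out.println("without message: " + missing);

        // Once an update is published, the same wait succeeds.
        events.add("dataset-updated");
        String present = awaitUpdate(events, 100);
        System.out.println("with message: " + present);
    }
}
```

A test that asserts the returned value is not null would fail in the first case without any exception from the messaging layer, which matches the shape of the errors listed above.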

johnaohara commented 4 months ago

I have opened an issue in the Quarkus project: https://github.com/quarkusio/quarkus/issues/40118

lampajr commented 4 months ago

I think it is happening from time to time on our CI as well (in all cases, simply retriggering the job was enough). It looks like the same issue I reported in https://github.com/Hyperfoil/Horreum/issues/1557 (I will close that one to avoid duplication).

johnaohara commented 4 months ago

I searched for existing issues but missed #1557, sorry for the noise

lampajr commented 4 months ago

> I searched for existing issues but missed #1557, sorry for the noise

No worries at all. I did not have time to check what the root cause could have been, so thanks a lot for doing that, as it seems you have already found the possible culprit :pray:

willr3 commented 4 months ago

Adding to this as I saw additional intermittent failures when running tests on CI

Error:  Failures: 
Error:    SchemaServiceTest.testCreateSchemaAfterRunWithArrayData:307->lambda$testCreateSchemaAfterRunWithArrayData$5:308->lambda$testCreateSchemaAfterRunWithArrayData$4:312 expected: <2> but was: <0>
Error:    SchemaServiceTest.testCreateSchemaAfterRunWithMultipleSchemas:342->lambda$testCreateSchemaAfterRunWithMultipleSchemas$7:343->lambda$testCreateSchemaAfterRunWithMultipleSchemas$6:347 expected: <1> but was: <0>
Error:    SchemaServiceTest.testCreateSchemaAfterRunWithObjectData:376->lambda$testCreateSchemaAfterRunWithObjectData$9:377->lambda$testCreateSchemaAfterRunWithObjectData$8:381 expected: <1> but was: <0>
Error:    SchemaServiceTest.testValidateRun:91 expected: not <null>
[INFO] 
Error:  Tests run: 132, Failures: 4, Errors: 0, Skipped: 0

johnaohara commented 3 months ago

The root cause of these failing tests is the broker not having sufficient resources and applying back-pressure to the tests, see: https://github.com/quarkusio/quarkus/issues/40118#issuecomment-2117821729

We need to ensure that the CI system has sufficient resources for the broker to function, or fail the test if the env does not meet minimum requirements.
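As a rough sketch of the "fail the test if the env does not meet minimum requirements" option, a pre-flight check could run before the suite starts. This thread does not say which resource the broker was short of, so disk usage is used here purely as an illustration; `BrokerResourceCheck`, its methods, and the 0.90 threshold are all hypothetical and not taken from Horreum or Quarkus.

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical pre-flight check; names and threshold are assumptions.
class BrokerResourceCheck {

    // Assumed limit: fraction of the disk that may be in use.
    static final double MAX_DISK_USAGE = 0.90;

    // Fraction of the file store already in use, between 0.0 and 1.0.
    static double diskUsage(long totalBytes, long usableBytes) {
        if (totalBytes <= 0) {
            throw new IllegalArgumentException("totalBytes must be positive");
        }
        return 1.0 - (double) usableBytes / totalBytes;
    }

    static boolean hasEnoughDisk(long totalBytes, long usableBytes, double maxUsage) {
        return diskUsage(totalBytes, usableBytes) <= maxUsage;
    }

    // Throws early with a descriptive message instead of letting tests time out.
    static void requireResources(Path dir, double maxUsage) throws IOException {
        FileStore store = Files.getFileStore(dir);
        long total = store.getTotalSpace();
        long usable = store.getUsableSpace();
        if (!hasEnoughDisk(total, usable, maxUsage)) {
            throw new IllegalStateException(String.format(
                    "Disk usage %.0f%% exceeds the assumed limit of %.0f%%",
                    diskUsage(total, usable) * 100, maxUsage * 100));
        }
    }

    public static void main(String[] args) throws IOException {
        // Report the state of the current directory's file store without failing.
        FileStore store = Files.getFileStore(Paths.get(".").toAbsolutePath());
        boolean ok = hasEnoughDisk(store.getTotalSpace(), store.getUsableSpace(), MAX_DISK_USAGE);
        System.out.println("enough disk for the assumed limit: " + ok);
    }
}
```

Calling something like `requireResources` from a test setup hook would turn a confusing `expected: not <null>` into an explicit message about the environment. The actual check would need to match whatever limit the broker really enforces.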

johnaohara commented 3 months ago

"Fixed" in https://github.com/Hyperfoil/Horreum/pull/1713. I think this is the best we can do atm in CI.