rh-messaging-qe / claire

3 stars 2 forks source link

Tests often fail on StatefulSet timeout, without revealing useful information up-front #8

Open jiridanek opened 1 year ago

jiridanek commented 1 year ago
io.brokerqe.claire.exception.WaitException: Timeout after 60000 ms waiting for StatefulSet to be ready
    at io.brokerqe.claire.TestUtils.waitFor(TestUtils.java:173)
    at io.brokerqe.claire.TestUtils.waitFor(TestUtils.java:124)
    at io.brokerqe.claire.ResourceManager.waitForBrokerDeployment(ResourceManager.java:463)
    at io.brokerqe.claire.ResourceManager.waitForBrokerDeployment(ResourceManager.java:452)
    at io.brokerqe.claire.ResourceManager.waitForBrokerDeployment(ResourceManager.java:448)
    at io.brokerqe.claire.configuration.BrokerConfigurationTests.verifyResourceUpdates(BrokerConfigurationTests.java:287)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
    at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
2023-11-01T20:51:52.479Z &amp#27;[34mINFO &amp#27;[0;39m [TestSeparator:30] &amp#27;[34m----------------------------------------------------------------------------&amp#27;[0m
2023-11-01T20:51:52.480Z &amp#27;[34mINFO &amp#27;[0;39m [TestSeparator:31] &amp#27;[33m[56/96] Started: io.brokerqe.claire.configuration.BrokerConfigurationTests.verifyResourceUpdates&amp#27;[0m
2023-11-01T20:51:52.480Z &amp#27;[31mWARN &amp#27;[0;39m [AbstractSystemTests:283] [brkconfig-tests-6b8] Stripping broker name from cfg-broker-verifyresourceupdates to cfg-broker-verifyresourceu due to length limitations
2023-11-01T20:51:52.497Z &amp#27;[34mINFO &amp#27;[0;39m [ResourceManager:276] Created ActiveMQArtemis ActiveMQArtemis() in namespace brkconfig-tests-6b8
2023-11-01T20:51:52.497Z &amp#27;[34mINFO &amp#27;[0;39m [ResourceManager:456] [brkconfig-tests-6b8] Waiting 60s for creation of broker cfg-broker-verifyresourceu
2023-11-01T20:51:52.497Z &amp#27;[39mDEBUG&amp#27;[0;39m [TestUtils:128] Waiting for StatefulSet to be ready
2023-11-01T20:52:28.738Z &amp#27;[34mINFO &amp#27;[0;39m [ResourceManager:456] [brkconfig-tests-6b8] Waiting 60s for creation of broker cfg-broker-verifyresourceu
2023-11-01T20:52:28.739Z &amp#27;[34mINFO &amp#27;[0;39m [ResourceManager:460] [brkconfig-tests-6b8] Reloading existing broker cfg-broker-verifyresourceu, sleeping for some time
2023-11-01T20:52:33.739Z &amp#27;[39mDEBUG&amp#27;[0;39m [TestUtils:128] Waiting for StatefulSet to be ready
2023-11-01T20:53:33.753Z &amp#27;[34mINFO &amp#27;[0;39m [OperatorTestDataCollector:50] Error detected will gather! data from namespace: brkconfig-tests-6b8
2023-11-01T20:53:33.753Z &amp#27;[39mDEBUG&amp#27;[0;39m [OperatorTestDataCollector:55] [brkconfig-tests-6b8] Gathering debug data for failed io.brokerqe.claire.configuration.BrokerConfigurationTests#verifyResourceUpdates into /home/jenkins/agent/workspace/claire/downstream/latest/operator/ocp-4.10/claire/operator-suite/test-logs/2023-11-01_19:55:30/io.brokerqe.claire.configuration.BrokerConfigurationTests.verifyResourceUpdates/brkconfig-tests-6b8
2023-11-01T20:53:36.723Z &amp#27;[39mDEBUG&amp#27;[0;39m [ExecutorOperator:77] [brkconfig-tests-6b8] cfg-broker-verifyresourceu-ss-0 Running command: /bin/bash -c find /amq/extra/ -type f
2023-11-01T20:53:37.072Z &amp#27;[34mINFO &amp#27;[0;39m [TestSeparator:38] &amp#27;[33mFinished: io.brokerqe.claire.configuration.BrokerConfigurationTests.verifyResourceUpdates&amp#27;[0m
2023-11-01T20:53:37.073Z &amp#27;[34mINFO &amp#27;[0;39m [TestSeparator:39] &amp#27;[34m----------------------------------------------------------------------------&amp#27;[0m

another failure in different test

io.brokerqe.claire.exception.WaitException: Timeout after 1320000 ms waiting for Drain pod to finish
    at io.brokerqe.claire.TestUtils.waitFor(TestUtils.java:173)
    at io.brokerqe.claire.TestUtils.waitFor(TestUtils.java:124)
    at io.brokerqe.claire.AbstractSystemTests.waitForScaleDownDrainer(AbstractSystemTests.java:236)
    at io.brokerqe.claire.AbstractSystemTests.doArtemisScale(AbstractSystemTests.java:207)
    at io.brokerqe.claire.scalability.ScalabilityTests.simpleScalabilityTest(ScalabilityTests.java:43)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
    at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
jiridanek commented 1 year ago

The idea is that when TestUtils.waitFor or ResourceManager.waitForBrokerDeployment fail, they should try to surface some basic status information about the object they were waiting for.

Maybe, log expected state and actual state, or just actual state if expected state is impossible to determine.

This way, additional exception can be added into the exception chain, with some brief info and a prompt to look into logs for more details.

jiridanek commented 1 year ago
jiridanek commented 11 months ago

Lots of various stability fixes since, last one https://github.com/rh-messaging-qe/claire/commit/d0225fbd98f939bf99419c19716fdf3a52549d55