Closed JonasKunz closed 2 years ago
The race-condition here is that MockTracer.createRealTracer(reporter) starts at least one background thread which reads from the ReporterConfiguration, for example the ApmServerHealthChecker while we at the same time try to configure the mocks.
This is a long standing source of flakiness. Kudos for finding out the root cause 👏
Cloud we disable the background tasks, such as health checking within MockTracer.createRealTracer
? Another option would be to block inside MockTracer.createRealTracer
until the background tasks are finished.
This is a long standing source of flakiness
In that case I will invest the time and try implementing your suggested fixes.
In case of a regression, this specific issue can be reproduced reliably via the following steps:
ReporterConfiguration
to the ApmServerClient: try {
Thread.sleep(50);
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
String secretToken = reporterConfiguration.getSecretToken();
@Test
public void provokeRaceCondition() {
while (true) {
MockReporter reporter = new MockReporter();
ElasticApmTracer tracer = MockTracer.createRealTracer(reporter);
for (int i = 0; i < 100_000; i++) {
doReturn(0L).when(tracer.getConfig(ReporterConfiguration.class)).getMetricsIntervalMs();
}
tracer.stop();
}
}
Test method
All tests which use
MockTracer.createRealTracer(reporter)
and do mocking of theReporterConfiguration
are affected.The concrete example from the logs would be all tests from
MicrometerMetricsReporterTest
.Test logs
Click to expand
``` 15:23:58 [ERROR] Tests run: 27, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 5.221 s <<< FAILURE! - in co.elastic.apm.agent.micrometer.MicrometerMetricsReporterTest 15:23:58 [ERROR] testExclusionOfFailedGauge_singleGauge Time elapsed: 0.217 s <<< ERROR! 15:23:58 org.mockito.exceptions.misusing.WrongTypeOfReturnValue: 15:23:58 15:23:58 Long cannot be returned by getApiKey() 15:23:58 getApiKey() should return String 15:23:58 *** 15:23:58 If you're unsure why you're getting above error read on. 15:23:58 Due to the nature of the syntax above problem might occur because: 15:23:58 1. This exception *might* occur in wrongly written multi-threaded tests. 15:23:58 Please refer to Mockito FAQ on limitations of concurrency testing. 15:23:58 2. A spy is stubbed using when(spy.foo()).then() syntax. It is safer to stub spies - 15:23:58 - with doReturn|Throw() family of methods. More in javadocs for Mockito.spy() method. 15:23:58 15:23:58 at co.elastic.apm.agent.micrometer.MicrometerMetricsReporterTest.setUp(MicrometerMetricsReporterTest.java:76) 15:23:58 at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 15:23:58 at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 15:23:58 at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 15:23:58 at java.base/java.lang.reflect.Method.invoke(Method.java:566) 15:23:58 at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727) 15:23:58 at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60) 15:23:58 at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131) 15:23:58 at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156) 15:23:58 at org.junit.jupiter.engine.extension.TimeoutExtension.interceptLifecycleMethod(TimeoutExtension.java:128) 15:23:58 at org.junit.jupiter.engine.extension.TimeoutExtension.interceptBeforeEachMethod(TimeoutExtension.java:78) 15:23:58 at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103) 15:23:58 at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93) 15:23:58 at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106) 15:23:58 at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64) 15:23:58 at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45) 15:23:58 at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37) 15:23:58 at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92) 15:23:58 at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86) 15:23:58 at org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.invokeMethodInExtensionContext(ClassBasedTestDescriptor.java:520) 15:23:58 at org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.lambda$synthesizeBeforeEachMethodAdapter$23(ClassBasedTestDescriptor.java:505) 15:23:58 at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeBeforeEachMethods$3(TestMethodTestDescriptor.java:174) 15:23:58 at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeBeforeMethodsOrCallbacksUntilExceptionOccurs$6(TestMethodTestDescriptor.java:202) 15:23:58 at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) 15:23:58 at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeBeforeMethodsOrCallbacksUntilExceptionOccurs(TestMethodTestDescriptor.java:202) 15:23:58 at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeBeforeEachMethods(TestMethodTestDescriptor.java:171) 15:23:58 at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:134) 15:23:58 at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:68) 15:23:58 at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151) 15:23:58 at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) 15:23:58 at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141) 15:23:58 at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137) 15:23:58 at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139) 15:23:58 at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) 15:23:58 at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138) 15:23:58 at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95) 15:23:58 at java.base/java.util.ArrayList.forEach(ArrayList.java:1541) 15:23:58 at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41) 15:23:58 at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155) 15:23:58 at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) 15:23:58 at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141) 15:23:58 at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137) 15:23:58 at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139) 15:23:58 at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) 15:23:58 at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138) 15:23:58 at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95) 15:23:58 at java.base/java.util.ArrayList.forEach(ArrayList.java:1541) 15:23:58 at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41) 15:23:58 at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155) 15:23:58 at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) 15:23:58 at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141) 15:23:58 at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137) 15:23:58 at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139) 15:23:58 at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) 15:23:58 at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138) 15:23:58 at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95) 15:23:58 at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.submit(SameThreadHierarchicalTestExecutorService.java:35) 15:23:58 at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:57) 15:23:58 at org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:54) 15:23:58 at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:220) 15:23:58 at org.junit.platform.launcher.core.DefaultLauncher.lambda$execute$6(DefaultLauncher.java:188) 15:23:58 at org.junit.platform.launcher.core.DefaultLauncher.withInterceptedStreams(DefaultLauncher.java:202) 15:23:58 at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:181) 15:23:58 at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:128) 15:23:58 at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invokeAllTests(JUnitPlatformProvider.java:150) 15:23:58 at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invoke(JUnitPlatformProvider.java:124) 15:23:58 at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) 15:23:58 at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) 15:23:58 at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) 15:23:58 at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) ```Additional context
The root cause seems to be a race-condition within Mockito, even when using the
doReturn
-syntax.In this concrete example the problem is caused by these two lines :
The race-condition here is that
MockTracer.createRealTracer(reporter)
starts at least one background thread which reads from theReporterConfiguration
, for example theApmServerHealthChecker
while we at the same time try to configure the mocks.The
ApmServerHealthChecker
creates a background request to the APM-server on a different Thread, where it reads the configuration from theReporterConfiguration
(this line):The race-condition occurs when the
getApiKey()
call is performed before thegetMetricsIntervalMs()
call, but afterdoReturn(0L).when(tracer.getConfig(ReporterConfiguration.class))
. In this case Mockito thinks that we are trying to stubgetApiKey()
.I have the suspicion that the
when()
part of mockito is actually not thread safe for our case.