open-telemetry / opentelemetry-java-instrumentation

OpenTelemetry auto-instrumentation and instrumentation libraries for Java
https://opentelemetry.io
Apache License 2.0
1.96k stars, 860 forks

Reduce flakiness of Resource tests #12252

Closed jaydeluca closed 2 months ago

jaydeluca commented 2 months ago

This test is 43% flaky

Looking at the failures, the cause is that the host.arch attribute is not reliable.

This change moves the Security Manager tests to their own test suite so that they cannot alter the environment and cause issues with other tests.

laurit commented 2 months ago

I don't think this is a good way to fix this issue. The reason this test is flaky (it actually fails reliably on the first execution and passes on retry) is that it depends on the order in which the tests are run. If https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/2d5775ae8a4d5c2d630292889e3f1d02ac1a2ab9/instrumentation/resources/library/src/test/java/io/opentelemetry/instrumentation/resources/HostResourceTest.java#L40 is run first, it will poison the environment for this test. Even if you ignore host.arch, you'll next run into the same issue with os.description (at least that is what happened when I disabled HostResourceTest.SecurityManagerEnabled). Perhaps we should move all the security manager tests to a separate suite so they can't interfere with the main suite. I can reliably reproduce this by running ./gradlew :instrumentation:resources:library:clean and ./gradlew :instrumentation:resources:library:test --no-build-cache -PtestJavaVersion=8
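
To illustrate the kind of order dependence described above, here is a minimal, self-contained sketch. It assumes the poisoning happens through a value that is computed once and cached in static state; the thread does not confirm this mechanism. The names OrderDependentCacheDemo, CachedArch, restrictedRead and normalRead are hypothetical, and the restricted environment is simulated with a reader that returns null rather than with a real SecurityManager.

```java
import java.util.function.Supplier;

class OrderDependentCacheDemo {

  /** Hypothetical stand-in for a provider that computes an attribute once and caches it statically. */
  static final class CachedArch {
    private static String cached;
    private static boolean initialized;

    // The first caller decides what every later caller will see.
    static synchronized String get(Supplier<String> reader) {
      if (!initialized) {
        cached = reader.get();
        initialized = true;
      }
      return cached;
    }

    // Clears the cache; used here only so each scenario starts from a clean state.
    static synchronized void reset() {
      cached = null;
      initialized = false;
    }
  }

  // Simulates a read in a restricted environment (assumption: the value is unavailable there).
  static String restrictedRead() {
    return null;
  }

  // Simulates a normal read; "amd64" is a fixed illustrative value, not read from the JVM.
  static String normalRead() {
    return "amd64";
  }

  // The "restricted" test runs first, so the later normal test sees the poisoned cache.
  static String runRestrictedFirst() {
    CachedArch.reset();
    CachedArch.get(OrderDependentCacheDemo::restrictedRead);
    return CachedArch.get(OrderDependentCacheDemo::normalRead);
  }

  // The normal test runs first, so the cache holds the expected value.
  static String runNormalFirst() {
    CachedArch.reset();
    CachedArch.get(OrderDependentCacheDemo::normalRead);
    return CachedArch.get(OrderDependentCacheDemo::restrictedRead);
  }

  public static void main(String[] args) {
    System.out.println("restricted first: " + runRestrictedFirst()); // prints "restricted first: null"
    System.out.println("normal first: " + runNormalFirst()); // prints "normal first: amd64"
  }
}
```

In this sketch the outcome depends only on which call reaches the cache first. If the real tests behave the same way, that would be consistent with the failure appearing on the first execution, and it is why running the security manager tests in a separate suite keeps them from affecting the main suite.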