MarathonLabs / marathon

Cross-platform test runner
https://docs.marathonlabs.io
GNU General Public License v2.0
584 stars 122 forks source link

retryPerTestQuota doesn't work #918

Closed dlykoits closed 7 months ago

dlykoits commented 7 months ago

Describe the bug The retry is configured with:

retryStrategy {
            fixedQuota {
                retryPerTestQuota = 1
                totalAllowedRetryQuota = 100
            }
        }

But according to the report: test(testDeeplinkWithActionToLiveTvChannel) starts 3 times(2 fails), see final output:

Device pool omni:
    248 passed, 0 failed, 0 ignored tests
    Flakiness overhead: 0H 01m 48s
    Raw: 248 passed, 5 failed, 0 ignored, 0 incomplete tests
    Failed tests:
        ***.DeferredCustomDeeplinkTest#testCustomDeeplinkPlayTvShowSeasonEpisode[CRICKET_EULA_POSITIVE] failed 1 time(s)
        ***.MobileCustomDeeplinkTest#testCustomDeeplinkWithActionToLiveTvChannel[CLEARED_APP] failed 1 time(s)
        ***.MobileDeeplinkTest#testDeeplinkWithActionToLiveTvChannel[CLEARED_APP] failed 2 time(s)
        ***.MobileDeeplinkTest#testDeeplinkWithActionToLiveTvChannel[RUNNING_APP_FOREGROUND_ACTIVITY] failed 1 time(s)
Total time: 0H 56m 45s

It's expected to have only one retry instead of 2, am I right or missing something?

Malinskiy commented 7 months ago

Can you please specify version used?

If you are using multiple devices then this behavior is expected. If however only one device is used then Id like to investigate more

dlykoits commented 7 months ago

Only one device is used:

image

Maybe it's related to parametrized tests, I'll try to create sample for investigation. Marathon version is 0.10.0 Full config is:

    marathon {
        debug = true
        executionStrategy = ExecutionStrategyConfiguration(ExecutionMode.ANY_SUCCESS)
        retryStrategy {
            fixedQuota {
                retryPerTestQuota = 1
                totalAllowedRetryQuota = 100
            }
        }
        testParserConfiguration = RemoteTestParserConfiguration(
            mapOf("listener" to "com.malinskiy.adam.junit4.android.listener.TestAnnotationProducer")
        )
        applicationPmClear = true
        disableWindowAnimation = true
        testApplicationPmClear = true
        autoGrantPermission = true
        uncompletedTestRetryQuota = 10
        ignoreFailures = true
        testAccessConfiguration = TestAccessConfiguration(adb = true, grpc = true, console = true)
        filteringConfiguration {
            allowlist {
                add(AnnotationFilterConfiguration(values = listOf("*.MarathonRun")))
            }
            blocklist {
                add(AnnotationFilterConfiguration(values = listOf("*.MarathonSkip")))
            }
        }
    }
Malinskiy commented 7 months ago

Can you please create a sample project or modify the existing one to reproduce this?

If you're keen to, you can also try adding a failing test to either https://github.com/MarathonLabs/marathon/blob/0b0575c069dce79d5dbbab46d9c5a9c81b4fd49e/core/src/test/kotlin/com/malinskiy/marathon/execution/strategy/impl/retry/fixedquota/FixedQuotaRetryStrategyTest.kt#L22 or https://github.com/MarathonLabs/marathon/blob/0b0575c069dce79d5dbbab46d9c5a9c81b4fd49e/core/src/test/kotlin/com/malinskiy/marathon/execution/progress/PoolProgressAccumulatorTest.kt#L23 depending on how you want to simulate this type of situation

Either one will help me solve this issue quickly. Thanks!

dlykoits commented 7 months ago

Interesting thing, that I can't repro it at the test project, I'll try again... But maybe I can provide some logs or details. Do you need it? As I see, the retries do not happen immediately, but after some other tests are completed. Is this expected?

Malinskiy commented 7 months ago

No execution order is guaranteed except for a weak one via sorting strategy.

Unless this is reproducible I can't propose a fix unfortunately

dlykoits commented 7 months ago

I found the way to repro: Please check the branch I've added class TestSuite and when I run ./gradlew marathonDebugAndroidTest -> it causes one extra retry. In our project we have multiple TestSuites.

Malinskiy commented 7 months ago

Turns out you've found a bug in AOSP, namely when test is specified via test suite then it will be executed more than one time:

➜ adb shell am instrument -w -r -e log true -e listener com.malinskiy.adam.junit4.android.listener.TestAnnotationProducer com.example.test/androidx.test.runner.AndroidJUnitRunner
INSTRUMENTATION_STATUS: class=com.example.TestWithWrongName
INSTRUMENTATION_STATUS: current=1
INSTRUMENTATION_STATUS: id=AndroidJUnitRunner
INSTRUMENTATION_STATUS: numtests=2
INSTRUMENTATION_STATUS: stream=
com.example.TestWithWrongName:
INSTRUMENTATION_STATUS: test=testAlwaysFailing
INSTRUMENTATION_STATUS_CODE: 1
INSTRUMENTATION_STATUS: com.malinskiy.adam.junit4.android.listener.TestAnnotationProducer.v4=[64Lorg.junit.Test(34Lexpected=class org.junit.Test$None9Ltimeout=0), 199Lkotlin.Metadata(26LbytecodeVersion=[I@e5879eb33Ldata1=[Ljava.lang.String;@793b34833Ldata2=[Ljava.lang.String;@c3375e111LextraInt=4812LextraString=6Lkind=126LmetadataVersion=[I@cd9ef0612LpackageName=)]
INSTRUMENTATION_STATUS_CODE: 2
INSTRUMENTATION_STATUS: class=com.example.TestWithWrongName
INSTRUMENTATION_STATUS: current=1
INSTRUMENTATION_STATUS: id=AndroidJUnitRunner
INSTRUMENTATION_STATUS: numtests=2
INSTRUMENTATION_STATUS: stream=.
INSTRUMENTATION_STATUS: test=testAlwaysFailing
INSTRUMENTATION_STATUS_CODE: 0
INSTRUMENTATION_STATUS: class=com.example.TestWithWrongName
INSTRUMENTATION_STATUS: current=2
INSTRUMENTATION_STATUS: id=AndroidJUnitRunner
INSTRUMENTATION_STATUS: numtests=2
INSTRUMENTATION_STATUS: stream=
INSTRUMENTATION_STATUS: test=testAlwaysFailing
INSTRUMENTATION_STATUS_CODE: 1
INSTRUMENTATION_STATUS: com.malinskiy.adam.junit4.android.listener.TestAnnotationProducer.v4=[64Lorg.junit.Test(34Lexpected=class org.junit.Test$None9Ltimeout=0), 198Lkotlin.Metadata(26LbytecodeVersion=[I@442c9c732Ldata1=[Ljava.lang.String;@d4bbf433Ldata2=[Ljava.lang.String;@e908f1d11LextraInt=4812LextraString=6Lkind=126LmetadataVersion=[I@c95f59212LpackageName=)]
INSTRUMENTATION_STATUS_CODE: 2
INSTRUMENTATION_STATUS: class=com.example.TestWithWrongName
INSTRUMENTATION_STATUS: current=2
INSTRUMENTATION_STATUS: id=AndroidJUnitRunner
INSTRUMENTATION_STATUS: numtests=2
INSTRUMENTATION_STATUS: stream=.
INSTRUMENTATION_STATUS: test=testAlwaysFailing
INSTRUMENTATION_STATUS_CODE: 0
INSTRUMENTATION_RESULT: stream=

Time: 0.304

OK (2 tests)

INSTRUMENTATION_CODE: -1

What essentially happens is marathon parses tests and receives the same test twice:

I 23:48:08.233 [main] <com.malinskiy.marathon.Marathon> Scheduling 2 tests
D 23:48:08.233 [main] <com.malinskiy.marathon.Marathon> com.example.TestWithWrongName#testAlwaysFailing, com.example.TestWithWrongName#testAlwaysFailing

Since tests are executed via a Queue this means that there is now an additional retry in the queue. For transparency when retry is scheduled it will be added to the Queue. In this case what happens is:

  1. Queue contains test X two times, i.e. [X, X]
  2. When first attempt fails, retry strategy kicks in and another retry is added, so queue is again [X, X]
  3. In total we get 3 attempts

Fix is in #919 Thanks for spending time to create a reproducer, it helped make the fixing process smooth.

Cheers!

dlykoits commented 7 months ago

Hi @Malinskiy
When are you going to release the fix and new version?