Flank / flank

:speedboat: Massively parallel Android and iOS test runner for Firebase Test Lab
https://firebase.community/
Apache License 2.0

Flakes attribute never set correctly in FullJunitReport.xml #2504

Closed AaronMT closed 1 week ago

AaronMT commented 5 months ago
<testsuite name="Pixel2.arm-30-en_US-portrait" tests="462" failures="1" flakes="0" errors="0" skipped="0" time="10277.484" timestamp="2024-06-23T06:02:53" hostname="localhost">
    <testcase name="verifyButtonTest" classname="org.app.verifyButtonTestTest" time="0.222" flaky="true">
     ...
</testsuite>

Shouldn't the flakes attribute be 1 in this case? This is with v23.10.1.

Our Flank config specifies num-flaky-test-attempts: 1 under the gcloud section and full-junit-result: true under the flank section.
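
For anyone who wants to check their own reports for the same mismatch, here is a minimal sketch. It assumes the flakes attribute on a testsuite is meant to equal the number of its testcase elements carrying flaky="true", which is the expectation in this report and not something confirmed against Flank's source. The function name find_flake_mismatches, the sample report, and the otherTest case are hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down report modelled on the snippet above.
SAMPLE_REPORT = """<testsuites>
  <testsuite name="Pixel2.arm-30-en_US-portrait" tests="2" failures="0" flakes="0" errors="0" skipped="0">
    <testcase name="verifyButtonTest" classname="org.app.verifyButtonTestTest" time="0.222" flaky="true"/>
    <testcase name="otherTest" classname="org.app.OtherTest" time="0.278"/>
  </testsuite>
</testsuites>"""


def find_flake_mismatches(report_xml: str) -> list[tuple[str, int, int]]:
    """Return (suite name, declared flakes, counted flaky testcases) for each inconsistent suite."""
    root = ET.fromstring(report_xml)
    mismatches = []
    # iter() also yields the root itself when the root is a <testsuite>.
    for suite in root.iter("testsuite"):
        declared = int(suite.get("flakes", "0"))
        # Assumption: a testcase counts as flaky when it carries flaky="true".
        counted = sum(1 for case in suite.iter("testcase") if case.get("flaky") == "true")
        if declared != counted:
            mismatches.append((suite.get("name", ""), declared, counted))
    return mismatches


if __name__ == "__main__":
    for name, declared, counted in find_flake_mismatches(SAMPLE_REPORT):
        print(f"{name}: flakes={declared}, but {counted} testcase(s) are marked flaky")
```

To run it against a real report, read the contents of FullJunitReport.xml into a string and pass that to the function; an empty result means every suite's flakes value matches its flaky testcases.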

SelaseKay commented 1 month ago

Hi @AaronMT , thanks for the report. I'm able to reproduce this issue. I'm looking into it.

SelaseKay commented 1 month ago

Hi @AaronMT , I was able to reproduce this once but for some weird reason, I'm not able to reproduce it anymore. Are you still experiencing this issue?

AaronMT commented 1 month ago

Yes, I still see this issue on 23.10.1. Here's an example report from our CI from yesterday.

<testcase name="verifyReaderModeControlsTest" classname="org.mozilla.fenix.ui.ReaderViewTest" time="0.275" flaky="true">
<failure>
java.security.ProviderException: Keystore operation failed at ...
</failure>
<webLink>
...
</webLink>
</testcase>

Here flaky is set to true, but the top-level testsuite has the flakes attribute set to 0:

<testsuite name="Pixel2.arm-30-en_US-portrait" tests="486" failures="0" flakes="0" errors="0" skipped="0" time="10298.759" timestamp="2024-10-14T21:30:46" hostname="localhost">

SelaseKay commented 1 month ago

Are you able to reproduce this locally?

SelaseKay commented 1 month ago

Can you provide a sample app and test apk?

AaronMT commented 1 month ago

As Flank (via Test Lab) generates these artifacts, I'm not sure what you mean by reproducing locally.

Here's a recent arm64-v8a debug build from our CI and the no-arch test APK:

https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/M4IjtgpLTWuZaUg43Q0ByA/artifacts/public/build/target.arm64-v8a.apk

https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/TvdzLJF3Tiij13O8gqe9dA/artifacts/public/build/target.noarch.apk

SelaseKay commented 1 month ago

I apologise for the confusion. By reproducing locally, I meant executing Flank on your own machine to initiate Test Lab runs, rather than in CI.

SelaseKay commented 1 month ago

Kindly provide the device specifications you have in your flank.yml.

SelaseKay commented 1 month ago

Seems to work fine locally:

<?xml version='1.0' encoding='UTF-8' ?>
<testsuites>
  <testsuite name="Pixel2.arm-30-en_US-portrait" tests="45" failures="11" flakes="1" errors="0" skipped="0" time="575.163" timestamp="2024-10-17T20:08:37" hostname="localhost">
    <testcase name="testExperimentUnenrolledViaSecretMenu" classname="org.mozilla.fenix.experimentintegration.GenericExperimentIntegrationTest" time="50.552">
      <failure>androidx.test.espresso.NoActivityResumedException: No activities in stage RESUMED. Did you forget to launch the activity. (test.getActivity() or similar)?
    at dalvik.system.VMStack.getThreadStackTrace(Native Method)
    at java.lang.Thread.getStackTrace(Thread.java:1736)
      ......
    </testcase>
    <testcase name="settingsTest" classname="org.mozilla.fenix.screenshots.ComposeMenuScreenShotTest" time="10.278" flaky="true">
      <failure>FAILED
    </failure>
......
    </testcase>
  </testsuite>
</testsuites>

AaronMT commented 1 month ago

gcloud:
  results-bucket: ...
  record-video: true
  timeout: 15m
  async: false
  num-flaky-test-attempts: 1

  app: /app/path
  test: /test/path

  auto-google-login: false
  use-orchestrator: true
  environment-variables:
    clearPackageData: true
  performance-metrics: true

  test-targets:
    - notPackage org.mozilla.fenix.screenshots
    - notPackage org.mozilla.fenix.syncintegration
    - notPackage org.mozilla.fenix.experimentintegration

  device:
    - model: Pixel2.arm
      version: 30
      locale: en_US

flank:
  project: ...
  max-test-shards: 100
  num-test-runs: 1
  output-style: compact
  full-junit-result: true

Does this have anything to do with full-junit-result?

SelaseKay commented 1 month ago

Do you experience this issue when you run Flank in your local environment with the above configuration?

AaronMT commented 1 month ago

Hello again. Yes, I was able to reproduce this with a local Flank run using the same configuration as above.

My run had a flaky test. The top-level testsuite and the flaky testcase are below:

<testsuite name="MediumPhone.arm-34-en_US-portrait" tests="498" failures="0" flakes="0" errors="0" skipped="0" time="11272.089" timestamp="2024-10-23T09:46:12" hostname="localhost">
<testcase name="verifyCFRAfterBlockingTheCookieBanner" classname="org.mozilla.fenix.ui.CookieBannerBlockerTest" time="38.066" flaky="true">
<failure>
java.lang.AssertionError: UiSelector[CONTAINS_TEXT=Less distractions, less cookies tracking you on this site.] does not exist
    at org.junit.Assert.fail(Assert.java:89)
    at org.junit.Assert.assertTrue(Assert.java:42)
    at org.mozilla.fenix.helpers.MatcherHelper.assertUIObjectExists(MatcherHelper.kt:100)
    at org.mozilla.fenix.ui.robots.BrowserRobot.verifyCookieBannerBlockerCFRExists(BrowserRobot.kt:833)
    at org.mozilla.fenix.ui.CookieBannerBlockerTest$verifyCFRAfterBlockingTheCookieBanner$1$4.invoke(CookieBannerBlockerTest.kt:60)
    at org.mozilla.fenix.ui.CookieBannerBlockerTest$verifyCFRAfterBlockingTheCookieBanner$1$4.invoke(CookieBannerBlockerTest.kt:58)
    at org.mozilla.fenix.ui.robots.NavigationToolbarRobot$Transition.enterURLAndEnterToBrowser(NavigationToolbarRobot.kt:226)
    at org.mozilla.fenix.ui.CookieBannerBlockerTest$verifyCFRAfterBlockingTheCookieBanner$1.invoke(CookieBannerBlockerTest.kt:58)
    at org.mozilla.fenix.ui.CookieBannerBlockerTest$verifyCFRAfterBlockingTheCookieBanner$1.invoke(CookieBannerBlockerTest.kt:45)
    at org.mozilla.fenix.helpers.AppAndSystemHelper.runWithCondition(AppAndSystemHelper.kt:686)
    at org.mozilla.fenix.ui.CookieBannerBlockerTest.verifyCFRAfterBlockingTheCookieBanner(CookieBannerBlockerTest.kt:45)
</failure>
<webLink>
https://console.firebase.google.com/project/moz-fenix/testlab/histories/bh.66b7091e15d53d45/matrices/7524624672444898348/executions/bs.a2c09c06d295e45c/testcases/2
</webLink>
</testcase>

SelaseKay commented 1 month ago

Hi @AaronMT, after some investigation, I've found that the issue is related to the max-test-shards property. If you omit this property so that it uses its default value, the behavior is as expected (flakes reflects the correct value).
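
To compare runs with and without the property, one option is to generate a second copy of the config that omits max-test-shards. The sketch below is only an illustration: drop_key is a hypothetical helper, the sample config is a shortened copy of the one posted above, and the filter works on plain text lines (it assumes the key sits on a single line with a scalar value, and it is not a YAML parser).

```python
# Shortened copy of the configuration from this thread; elided values stay elided.
SAMPLE_FLANK_YML = """\
gcloud:
  num-flaky-test-attempts: 1

flank:
  project: ...
  max-test-shards: 100
  num-test-runs: 1
  output-style: compact
  full-junit-result: true
"""


def drop_key(config_text: str, key: str = "max-test-shards") -> str:
    """Return config_text without any line that sets `key`.

    Assumption: the key occupies exactly one line ("key: value").
    """
    kept = [
        line
        for line in config_text.splitlines()
        if not line.strip().startswith(f"{key}:")
    ]
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Print the variant config; redirect it to a separate file to use it.
    print(drop_key(SAMPLE_FLANK_YML), end="")
```

Running Flank once with each config and comparing the flakes attribute in the two FullJunitReport.xml files should show whether the property is the trigger in a given setup.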

AaronMT commented 3 weeks ago

Thanks for investigating. Out of curiosity, are you planning a new release?