Monitor flaky tests

MrGreenTea commented 7 months ago

Since this port, the Playwright tests seem to have become somewhat flaky. We added the `retries: 3` setting to playwright.config.ts to hide some of the flakiness. One goal of this issue is to remove that setting and make the tests stable again.

We should make sure to monitor this as flaky tests suck. See this action for example: https://github.com/WithSecureOpenSource/flaky-test-ci

MrGreenTea commented 7 months ago

We could start building a history by saving the JUnit reports from Playwright as artifacts in the CI.
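
To make the idea of a history concrete, here is a minimal sketch of how one saved JUnit report could be turned into per-test outcomes. It assumes the usual JUnit layout of `testcase` elements with optional `failure`, `error`, or `skipped` children. The function name `read_outcomes` and the test "index page title" in the sample are hypothetical and only serve as illustration.

```python
import xml.etree.ElementTree as ET


def read_outcomes(junit_xml: str) -> dict[str, str]:
    """Map 'classname::name' to 'passed', 'failed', or 'skipped'."""
    root = ET.fromstring(junit_xml)
    outcomes: dict[str, str] = {}
    # iter() finds <testcase> at any depth, so <testsuites> and <testsuite> roots both work.
    for case in root.iter("testcase"):
        test_id = f"{case.get('classname', '')}::{case.get('name', '')}"
        if case.find("failure") is not None or case.find("error") is not None:
            outcomes[test_id] = "failed"
        elif case.find("skipped") is not None:
            outcomes[test_id] = "skipped"
        else:
            outcomes[test_id] = "passed"
    return outcomes


# Hand-written sample report, not real CI output.
SAMPLE = """<testsuites>
  <testsuite name="index.spec.ts" tests="2" failures="1">
    <testcase classname="index.spec.ts" name="index page screenshot">
      <failure message="Screenshot comparison failed"/>
    </testcase>
    <testcase classname="index.spec.ts" name="index page title"/>
  </testsuite>
</testsuites>"""

if __name__ == "__main__":
    for test_id, outcome in read_outcomes(SAMPLE).items():
        print(test_id, outcome)
```

Collecting these dictionaries for every CI run, ordered by time, would give the per-test pass/fail sequence that a flakiness report needs.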

MrGreenTea commented 7 months ago

On the branch is a small script to download all previous artifacts. It needs requests to be installed into your virtual environment.

python download_artifacts.py https://api.github.com/repos/YoungVision-eV/homepage-astro/actions/artifacts $GITHUB_TOKEN playwright-results ($BRANCH_NAME or ALL)

You have to generate a personal access token first and use it. The branch name is optional and defaults to main.
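
For anyone reading along, the selection step can be sketched roughly like this. It is not the actual script from the branch, only an illustration of the name and branch filter. It assumes the JSON returned by the artifacts endpoint has an `artifacts` list whose entries carry `name`, `expired`, `archive_download_url`, and `workflow_run.head_branch`. The function name `select_artifacts` and the URLs in the sample are made up.

```python
def select_artifacts(payload: dict, name: str, branch: str = "main") -> list[str]:
    """Return download URLs of unexpired artifacts matching name and branch."""
    urls = []
    for artifact in payload.get("artifacts", []):
        # Skip other artifacts and ones whose archive has already expired.
        if artifact.get("name") != name or artifact.get("expired"):
            continue
        head_branch = (artifact.get("workflow_run") or {}).get("head_branch")
        # "ALL" disables the branch filter, mirroring the command-line argument.
        if branch != "ALL" and head_branch != branch:
            continue
        urls.append(artifact["archive_download_url"])
    return urls


# Hand-written sample payload with placeholder URLs.
SAMPLE_PAYLOAD = {
    "artifacts": [
        {"name": "playwright-results", "expired": False,
         "archive_download_url": "https://example.invalid/1.zip",
         "workflow_run": {"head_branch": "main"}},
        {"name": "playwright-results", "expired": False,
         "archive_download_url": "https://example.invalid/2.zip",
         "workflow_run": {"head_branch": "feature"}},
        {"name": "playwright-results", "expired": True,
         "archive_download_url": "https://example.invalid/3.zip",
         "workflow_run": {"head_branch": "main"}},
    ]
}

if __name__ == "__main__":
    print(select_artifacts(SAMPLE_PAYLOAD, "playwright-results"))
    print(select_artifacts(SAMPLE_PAYLOAD, "playwright-results", "ALL"))
```

The real download would then fetch each URL with the personal access token in the request headers.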

After that you can pip install flaky-tests-detection and run the flakiness-report.sh script to generate the top 5 flaky tests.

The script was copied and adapted from here: https://github.com/WithSecureOpenSource/flaky-test-ci/blob/main/download_artifacts.py

MrGreenTea commented 7 months ago

Perhaps another solution for visual comparisons would help? Options like Percy and lost-pixel should be evaluated.

As I understand it, Percy, for example, does not take a screenshot locally but sends the whole DOM, so it could be more stable because the visual comparison happens in their cloud.

MrGreenTea commented 6 months ago

I would like some kind of continuous monitoring. Apparently there is no flakiness in the last 7 days, but with a window of 14 days there is:

❯ flaky --junit-files=reports --grouping-option=days --window-size=1 --window-count=14 --top-n=5

Top 5 flaky tests based on latest window exponential weighted moving average fliprate score
index.spec.ts::index page screenshot --- score: 0.06917
about-us.spec.ts::About Us page screenshot --- score: 0.04001
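
As a rough illustration of what such a score means: a flip rate counts how often a test changes between pass and fail within a window, and an exponentially weighted moving average combines the windows so that recent ones count more. The sketch below is my own simplified version, not the code of flaky-tests-detection, and the smoothing factor `alpha` is an arbitrary assumption.

```python
def fliprate(outcomes: list[bool]) -> float:
    """Fraction of consecutive runs in which the result changed."""
    if len(outcomes) < 2:
        return 0.0
    flips = sum(a != b for a, b in zip(outcomes, outcomes[1:]))
    return flips / (len(outcomes) - 1)


def ewma(values: list[float], alpha: float = 0.5) -> float:
    """Exponentially weighted moving average; later values weigh more."""
    if not values:
        return 0.0
    score = values[0]
    for value in values[1:]:
        score = alpha * value + (1 - alpha) * score
    return score


if __name__ == "__main__":
    # One list of pass/fail results per window, oldest window first (made-up data).
    windows = [
        [True, False, True, True],   # two flips in three transitions
        [True, True, True, True],    # stable
        [True, True, True, True],    # stable
    ]
    per_window = [fliprate(w) for w in windows]
    print(per_window, ewma(per_window))
```

With a scheme like this, a test that flipped a while ago still carries a small, decaying score, while a test that never flipped inside the observed windows scores zero.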

MrGreenTea commented 6 months ago

What's the goal of this issue? When can we close it?

MrGreenTea commented 6 months ago

Current status with 14 day history and 3 day windows

> flaky --junit-files=reports --grouping-option=runs --window-size=3 --window-count=14 --top-n=5

No flaky tests.

MrGreenTea commented 6 months ago

The "real" flakiness is pretty much masked by our usage of the retries option.

See this issue: https://github.com/microsoft/playwright/issues/29446
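
One way to see the masked flakiness would be to count tests that only passed after a retry. The sketch below assumes a report from Playwright's JSON reporter with nested `suites`, each holding `specs` with `tests`, and each test holding one `results` entry per attempt with a `status` field. These field names reflect my understanding of that format and should be checked against a real report. The function name `passed_on_retry` and the sample data are made up.

```python
def passed_on_retry(report: dict) -> list[str]:
    """List 'file::title' for tests that failed at least once and then passed."""
    found: list[str] = []

    def walk(suite: dict) -> None:
        for spec in suite.get("specs", []):
            for test in spec.get("tests", []):
                results = test.get("results", [])
                # More than one attempt with a passing final attempt means a retry hid a failure.
                if len(results) > 1 and results[-1].get("status") == "passed":
                    file_name = spec.get("file", suite.get("file", ""))
                    found.append(f"{file_name}::{spec.get('title', '')}")
        for child in suite.get("suites", []):
            walk(child)

    for suite in report.get("suites", []):
        walk(suite)
    return found


# Hand-written sample report, not real CI output.
SAMPLE_REPORT = {
    "suites": [
        {"file": "index.spec.ts", "specs": [
            {"title": "index page screenshot", "file": "index.spec.ts", "tests": [
                {"results": [{"status": "failed", "retry": 0},
                             {"status": "passed", "retry": 1}]}]},
        ], "suites": [
            {"file": "index.spec.ts", "specs": [
                {"title": "nested stable test", "tests": [
                    {"results": [{"status": "passed", "retry": 0}]}]},
            ]},
        ]},
    ]
}

if __name__ == "__main__":
    print(passed_on_retry(SAMPLE_REPORT))
```

Tracking this list per run would show flakiness that a plain pass/fail history no longer shows once retries are enabled.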

Han2-Ro commented 5 months ago

https://github.com/YoungVision-eV/website/blob/f3504d1738d453f7e24f3133c531e7f71cd3c0e9/tests/support-us.spec.ts#L8 I think this line is a frequent cause of retries. Maybe using forceLoadImages instead could be a solution?

MrGreenTea commented 5 months ago

That could be. When I look at the failures, it seems that the found element isn't attached to the DOM (yet). forceLoadImages can have problems of its own, though, because it uses Locator.all(), which does not wait for matching elements and can give unreliable results in rare cases while the page is still changing.

MrGreenTea commented 2 months ago

https://github.com/mojoaxel/awesome-regression-testing

YoungVision-eV / website

Monitor flaky tests #21