microsoft / playwright

Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
https://playwright.dev
Apache License 2.0

[Question] Skip on retry #17652

Closed · JackSpa89 closed this issue 1 year ago

JackSpa89 commented 1 year ago

Hello,

I have a question. In our test suite we have some tests whose beforeEach hook is flaky (due to a frontend issue). The test itself is supposed to be skipped under certain conditions that we only know after the beforeEach hook has run. The problem is that we have the retry mechanism enabled: if the beforeEach hook fails on the first attempt but succeeds on the second, and the test is then supposed to be skipped, Playwright still seems to mark the test as failed. I could move the beforeEach logic into the tests themselves, and that would "fix" it, but we have about 10 tests with the same preconditions, so that does not seem like the best way.

So is there any way to still mark the test as skipped when the first beforeEach attempt fails? Or is this a bug on Playwright's side? Any help would be appreciated :)
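
To sketch roughly what our setup looks like (simplified; the page, selectors and names below are just placeholders, not our real code):

// shared preconditions for ~10 tests (sketch)
import { test, expect } from '@playwright/test';

let featureAvailable: boolean;

test.beforeEach(async ({ page }) => {
  // This navigation is the part that is occasionally flaky in our frontend.
  await page.goto('/dashboard');
  await expect(page.getByRole('heading', { name: 'Dashboard' })).toBeVisible();
  // The skip condition is only known once the hook has finished.
  featureAvailable = await page.getByTestId('new-feature').isVisible();
});

test('one of ~10 tests with the same precondition', async ({ page }) => {
  test.skip(!featureAvailable, 'feature not available on this page');
  // ... actual assertions against page ...
});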

Thanks and BR

aslushnikov commented 1 year ago

To illustrate this:

// a.spec.ts

import { test } from '@playwright/test';
import fs from 'fs';
import path from 'path';

test.beforeEach(() => {
  // Fail only on the very first run: the 'flag' file marks that the hook has
  // already run once, so any retry will find the file and pass.
  const filename = path.join(__dirname, 'flag');
  if (fs.existsSync(filename))
    return;
  fs.writeFileSync(filename, 'foo', 'utf-8');
  throw new Error('failure');
});

test('should work', async ({ }) => {
  test.skip();
});

Running this with npx playwright test yields a failed test right now; should it yield skipped instead?

We'll discuss this tomorrow with the team.

JackSpa89 commented 1 year ago

Hi @aslushnikov,

thanks for answering. That is also a question worth discussing, sure, but my point was really about the case where the beforeEach hook fails on the first attempt but passes on the second. To illustrate:

import { test, expect } from '@playwright/test';

test.beforeEach(async () => {
    const random = Math.random();
    expect(random).toBeGreaterThan(0.5); // some flaky step
});

test('should work', async ({ }) => {
    test.skip();
});

In this case:

  1. first attempt: beforeEach fails
  2. retry: beforeEach passes -> the test reaches test.skip()

Then the test is marked as failed, but I would expect the status to be skipped, since the test generally works (it is just flaky).

In this case:

  1. first attempt: beforeEach fails
  2. first retry: beforeEach fails
  3. second retry: beforeEach fails

Then the test should still be marked as failed, in my opinion, since in that case there really is something wrong with the test.

If you got my question right the first time, never mind this post :)

BR

aslushnikov commented 1 year ago

As per our team discussion, this is not important enough for us to prioritize right now, but we might change it to report the test as flaky in the future.

judithhartmann commented 1 year ago

Hey, can you reconsider the prioritization? This causes our pipelines to fail (they fail on any failed test), requiring us to re-run the entire test suite (~30 mins) because of these false-negative results.

thewilkybarkid commented 1 year ago

We've just encountered this problem (without using a beforeEach).

We sometimes have to skip (e.g. https://github.com/PREreview/prereview.org/blob/43dea250e31223d05270f3a6c915e2333582f682/integration/publishing-a-prereview.spec.ts#L2275), but we do so as late as possible. Chromium renders part of our page nondeterministically, so screenshot comparisons fail sporadically; in CI we use retries to mitigate this. In the skip cases, the test retries and reaches the skip annotation (i.e. successfully skips), but the reported result then comes from the original run rather than the retry, so it is marked as failing rather than as skipped and flaky. In cases where the test isn't skipped and it retries, it is marked as passing and flaky.
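
For reference, the shape of our setup is roughly this (simplified; the config values, selector and skip reason are placeholders, not our actual code):

// playwright.config.ts (sketch)
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Retries only in CI, to absorb the sporadic screenshot-comparison failures.
  retries: process.env.CI ? 2 : 0,
});

// integration test (sketch)
import { test, expect } from '@playwright/test';

test('publishing a PREreview', async ({ page }) => {
  await page.goto('/');
  // ... many steps ...
  // Only this far into the test do we know whether we have to skip.
  const canContinue = await page.getByRole('button', { name: 'Continue' }).isVisible();
  test.skip(!canContinue, 'this flow is not available here');
  // The flaky part: Chromium renders part of this page nondeterministically.
  await expect(page).toHaveScreenshot();
});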

D4N14L commented 1 year ago

I am also seeing this behaviour in our pipelines. It would be great if these tests could be marked as "skipped" or "flaky", as suggested above. The skip in the test is intentional, and our runs should not be failing, nor should Playwright exit with a non-zero exit code as it currently does. Would love to see this fixed. @aslushnikov, could this issue be re-prioritized?

D4N14L commented 1 year ago

To provide a bit more background on why this is an issue for us, here is how we're encountering it.

We've extended the Playwright-provided test method to add functionality through custom fixtures. These fixtures are the entry point for most of our tests: we provide Page-typed fixtures that are prepared in advance by navigating to a specific URL, performing authentication, and then returning the loaded page as a fixture to the test. From there, the test can execute against a fully prepared page. Occasionally, however, this preparation fails (usually due to navigation timeouts).

Additionally, we have some tests that check whether or not a feature is enabled on that page. These tests sometimes choose to skip execution of the rest of the test if their scenario is not enabled on that page.

The combination of these two factors unfortunately leads to this issue: some flaky failure breaks the test on the first attempt, but the second attempt makes it all the way to the point of determining that the test doesn't need to run. Retries are intended to help manage flaky failures (to the point that a test is marked flaky when it passes on a second attempt). In our scenario, marking the test as "skipped" is valid, since we only learn that the test needs to be skipped after the page has loaded, and that load can occasionally fail.
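
In rough strokes, the pattern looks like this (heavily simplified; the fixture, URLs and selectors are placeholders, not our actual code):

// fixtures.ts (sketch)
import { test as base, type Page } from '@playwright/test';

export const test = base.extend<{ preparedPage: Page }>({
  preparedPage: async ({ page }, use) => {
    // This preparation occasionally fails, usually with navigation timeouts.
    await page.goto('https://example.com/app');
    await page.getByLabel('Email').fill('user@example.com');
    await page.getByLabel('Password').fill('secret');
    await page.getByRole('button', { name: 'Sign in' }).click();
    await use(page);
  },
});

// feature.spec.ts (sketch)
// import { test } from './fixtures';

test('feature scenario', async ({ preparedPage }) => {
  // Whether the feature is enabled is only known once the prepared page has loaded.
  const enabled = await preparedPage.getByTestId('new-feature').isVisible();
  test.skip(!enabled, 'feature not enabled for this page');
  // ... rest of the test ...
});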

thewilkybarkid commented 1 year ago

We continue to see this issue frequently (e.g. we currently have 11 Dependabot PRs open, and two have failing status checks: a flaky test was successfully retried but is still marked as a failure).

hwride commented 5 months ago

This is also a problem for us, causing tests to be marked as flaky.

alectrocute commented 1 month ago

I'm still having this issue and it's driving me up a wall. Anybody figured it out?

cyrus-afshari commented 1 month ago

Kind of annoyed that, with my setup (2 retries allowed in CI runs), if the first 2 attempts fail and the 3rd one meets the skip criteria, the test is skipped, BUT the test result in the report is "Flaky". To me, a flaky result means the test passed after at least 1 retry. I don't see how failed + skipped should result in flaky. It should result in "skipped", because ultimately it was skipped lol.