[QUESTION] How do I run beforeAll just once in fully-parallel mode?

microsoft / playwright

[QUESTION] How do I run beforeAll just once in fully-parallel mode? #22520

Open Aroot42 opened 1 year ago

Aroot42 commented 1 year ago

In some test files I need to set up preconditions, e.g. create catalogue products or delete some entities in the DB. If I place the code in beforeAll/afterAll hooks, it runs once per worker, but I need it to run only once. How can I do this? Is there something like ClassInitialize and ClassCleanup in C#/MSTest?

dgozman commented 1 year ago

@Aroot42 If you need to do something just once for the entire test run, look at global setup and teardown. Let me know if that helps.

Aroot42 commented 1 year ago

@dgozman I don't think it's a good scenario if I have more than 30 test files, and half of them require some preconditions before running all the tests in the test file, which takes around 1-2 minutes each. This adds up to a total of around 30 minutes of global setup time, even if I only want to run one or two specific files or test suites. I believe this is a very bad idea.

dgozman commented 1 year ago

@Aroot42 I guess I don't really understand what you'd like to achieve.

There are three ways to have pre/post processing:

global setup/teardown for the whole test run;
worker fixtures for all tests in a single worker;
beforeAll/afterAll for all tests in a single file/describe.

If your usecase does not fit any of the above, please be more specific and provide an example code that we can run locally to understand the issue.

Aroot42 commented 1 year ago

OK, here's an example. Say I have 2 test files. In the first, all tests need 3 catalog products to be prepared, and every test uses them. If I put the product creation in a beforeAll hook, the products are created by each worker: 3 products × number of workers (3 × 5 = 15). I want the code to run only once so that only 3 products are created. The second test file contains a cleanup script that should run after all tests have completed. If I put it in an afterAll hook, the script runs whenever a worker finishes its work, so the first worker to finish runs the cleanup and the tests still running in the other workers fail. If I put both actions in global setup, then the product creation code runs even when I only run the second test file.

dgozman commented 1 year ago

@Aroot42 Thank you for the explanation, I understand the issue now. The only solution I can think of right now would be:

Put this file into a separate project "special";
Create a dependency project "setup for special" that would prepare product catalogs for it;
Removing catalogs at the end would still be problematic. I'd recommend to instead remove these catalogs during preparation phase if there are any left from previous runs.

See also #19889 for a similar scenario.
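The dependency-project layout suggested above might look like the following config fragment. This is only a sketch: the project names come from the comment, while the file names special.setup.ts and special.spec.ts and the extra "default" project are assumptions made for illustration.

```typescript
// playwright.config.ts (sketch; file names are hypothetical)
import { defineConfig } from '@playwright/test';

export default defineConfig({
  fullyParallel: true,
  projects: [
    {
      // Prepares the product catalogs. It can also delete leftovers
      // from previous runs, as recommended above.
      name: 'setup for special',
      testMatch: /special\.setup\.ts/,
    },
    {
      // The file whose tests need the catalogs; it waits for the setup project.
      name: 'special',
      testMatch: /special\.spec\.ts/,
      dependencies: ['setup for special'],
    },
    {
      // Everything else; it does not depend on the setup project.
      name: 'default',
      testIgnore: [/special\.setup\.ts/, /special\.spec\.ts/],
    },
  ],
});
```

With a layout like this, `npx playwright test --project=special` should run the setup project first, while `--project=default` skips the catalog preparation entirely.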

kangmingX commented 4 months ago

I don't understand why you don't support a feature where a single worker runs beforeAll and, once it's done, multiple workers run the test cases. I understand that you provide setup and teardown, and that there are workarounds you suggested:

global setup/teardown for the whole test run;
worker fixtures for all tests in a single worker;
beforeAll/afterAll for all tests in a single file/describe.

However, sometimes it makes more sense to organize test setup around product features/concepts.

t-mish commented 3 months ago

Upvote

marcela-meirelles commented 2 months ago

Upvote

rrezart-bosch commented 1 month ago

Upvote

vitalets commented 1 month ago

I also agree that calling beforeAll just once is a reasonable requirement in some cases. I've found a very good example in another issue https://github.com/microsoft/playwright/issues/28201#issuecomment-2022937033. It would be great to have an option for that, so developer can mark specific hooks with once: true. For example:

test.beforeAll({ title: 'setup', once: true }, async () => {
  // ...will run only once
});

Workaround

I've investigated a possible workaround. The main problem is cross-worker communication: workers are not aware of each other, so one worker does not know that beforeAll was already executed in another.

I've tried to set up such communication via lock files. When beforeAll is called for the first time, a lock directory is created. Other workers check for that lock and don't run the hook again. In my tests the approach worked (OSX, 3 workers).

Implementation:

```ts
// isFirstRun.ts
import fs from 'fs';
import path from 'path';
import os from 'os';
import crypto from 'crypto';

const FIRST_RUN_LOCKS_DIR = path.join(os.tmpdir(), 'first-run-locks');

/**
 * Returns true on the first invocation in the particular line in the code.
 */
export async function isFirstRun() {
  // tiny random delay to avoid concurrency issues during the test-run start
  await new Promise(resolve => setTimeout(resolve, 500 * Math.random()));
  // get the place of invocation by stack trace
  const stack = new Error().stack || '';
  const hash = crypto.createHash('md5').update(stack).digest('hex');
  // generate unique lock dir path
  const lockDir = path.join(FIRST_RUN_LOCKS_DIR, hash);
  if (fs.existsSync(lockDir)) return false;
  fs.mkdirSync(lockDir, { recursive: true });
  console.log(`Lock created: ${lockDir}`);
  return true;
}

/**
 * This function should be called in Playwright teardown to cleanup all locks.
 */
export default function cleanup() {
  console.log(`Cleanup...`);
  fs.rmSync(FIRST_RUN_LOCKS_DIR, { force: true, recursive: true });
}
```

Usage:

// test.ts
import { test } from '@playwright/test';
import { isFirstRun } from './isFirstRun';

test.beforeAll(async () => {
  if (!await isFirstRun()) return;
  console.log("This should be executed only once during test run");
});

In the Playwright config set global teardown to cleanup all locks:

// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  globalTeardown: require.resolve('./tests/isFirstRun'),
  // ...
});

Test: I've created a test file with beforeAll like above and 6 tests (2 passing, 4 failing). Executed on 3 workers in fullyParallel mode.

$ npx playwright test

Running 6 tests using 3 workers
Worker: 1
Worker: 0
Worker: 2
Lock created: /var/folders/8y/02_80cxs14s63wtvj3fqtkd00000gn/T/first-run-locks/49f2b0d069e1d44668ee9c2cd3b79090
beforeAll executed!
F
Worker: 3
·F
Worker: 4
F·F
Cleanup...

beforeAll was executed only once. Hope it would be helpful.
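One possible refinement of the lock-file check: the workaround tests for the lock with existsSync and then creates it, and two workers could in principle both pass the check before either creates the directory (hence the random delay). A minimal sketch of a single-step claim follows, assuming a hypothetical helper named tryAcquireLock that is not part of Playwright. It relies on non-recursive mkdirSync failing with EEXIST when the directory already exists.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

/**
 * Hypothetical helper (not part of Playwright): tries to claim a lock directory.
 * Returns true only for the caller that actually created it.
 */
function tryAcquireLock(locksDir: string, name: string): boolean {
  // Creating the parent is idempotent, so every worker may do it.
  fs.mkdirSync(locksDir, { recursive: true });
  try {
    // Without `recursive`, mkdir fails with EEXIST if the directory is
    // already there, so checking and creating happen in one step.
    fs.mkdirSync(path.join(locksDir, name));
    return true;
  } catch (err) {
    if ((err as { code?: string }).code === "EEXIST") return false;
    throw err;
  }
}

// Demo in a throwaway temp directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "first-run-locks-"));
console.log(tryAcquireLock(demoDir, "suite-a")); // true
console.log(tryAcquireLock(demoDir, "suite-a")); // false
fs.rmSync(demoDir, { recursive: true, force: true });
```

In isFirstRun() the name would be the stack-trace hash. Note that this only decides who runs the hook; workers that lose the claim still continue immediately and do not wait for the winner to finish its setup.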

vitalets commented 2 weeks ago

I've improved the workaround from my previous message to run beforeAll / afterAll only once. Instead of using lock files, I've set up cross-worker communication. It is faster and does not need lock-file cleanup.

In the new solution, each worker sends a message to the host process, asking whether this line of code was already executed. On the first request, the host process marks the line as executed and responds with isFirstRun: true. To all later messages for that line, from any worker, it responds with isFirstRun: false.

Implementation of isFirstRun.ts:

```ts
/**
 * A module to run some code in worker once, usually in beforeAll / afterAll hooks.
 * Implementation:
 * 1. hook into Playwright worker creation to intercept messages from worker
 * 2. on every invocation of isFirstRun() workers sends message to host with stack trace
 * 3. host checks if it's the first run of the particular line in the code and sends back the result
 */
import path from 'node:path';
import fs from 'node:fs';

const invocations = new Set<string>();
const callbacks = new Map<number, (isFirstRun: boolean) => void>();

let lastMessageId = 0;

if (process.env.TEST_WORKER_INDEX) {
  registerWorker();
} else {
  registerHost();
}

type CheckFirstRunMessage = {
  method: '__check_first_run__',
  params: {
    id: number,
    stack: string,
    isFirstRun?: boolean,
  }
}

/**
 * Returns true on the first invocation of the particular line in the worker code.
 */
export async function isFirstRun() {
  // Get invocation place by stack trace
  const stack = new Error().stack || '';
  return new Promise<boolean>(resolve => {
    const id = ++lastMessageId;
    callbacks.set(id, resolve);
    process.send!(buildMessage(id, stack));
  });
}

function registerHost() {
  hookWorkerProcess((workerProcess) => {
    workerProcess.on('message', (message: unknown) => {
      if (isCheckFirstRunMessage(message)) {
        const { id, stack } = message.params;
        const isFirstRun = !invocations.has(stack);
        if (isFirstRun) invocations.add(stack);
        workerProcess.send!(buildMessage(id, stack, isFirstRun));
      }
    });
  });
}

function registerWorker() {
  process.on('message', (message: unknown) => {
    if (isCheckFirstRunMessage(message)) {
      const { id, isFirstRun } = message.params;
      callbacks.get(id)?.(Boolean(isFirstRun));
      callbacks.delete(id);
    }
  });
}

// till pw 1.37 node_modules/@playwright/test/lib/runner/workerHost.js
// since pw 1.38 node_modules/playwright/lib/runner/workerHost.js
function getWorkerHostPath() {
  let pwPath = require.resolve('@playwright/test');
  let workerHostPath = `${path.dirname(pwPath)}/lib/runner/workerHost.js`;
  if (fs.existsSync(workerHostPath)) return workerHostPath;
  pwPath = require.resolve('playwright');
  return `${path.dirname(pwPath)}/lib/runner/workerHost.js`;
}

function hookWorkerProcess(fn: (workerProcess: NodeJS.Process) => void) {
  const { WorkerHost } = require(getWorkerHostPath());
  const origStart = WorkerHost.prototype.start;
  WorkerHost.prototype.start = async function (...args: any[]) {
    const result = await origStart.call(this, ...args);
    fn(this.process);
    return result;
  };
}

function buildMessage(id: number, stack: string, isFirstRun?: boolean): CheckFirstRunMessage {
  return { method: '__check_first_run__', params: { id, stack, isFirstRun } };
}

function isCheckFirstRunMessage(message: unknown): message is CheckFirstRunMessage {
  return (message as CheckFirstRunMessage)?.method === '__check_first_run__';
}
```

Usage:

// test.ts
import { test } from '@playwright/test';
import { isFirstRun } from './isFirstRun';

test.beforeAll(async () => {
  if (!await isFirstRun()) return;
  console.log("This will be executed only once during test run");
});

In playwright.config.ts, import isFirstRun.ts to set up the host part:

import { defineConfig } from '@playwright/test';
import './tests/isFirstRun'; // <- registers host part for isFirstRun checks

export default defineConfig({
  // ...
});
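To see the request/response protocol in isolation, the host's bookkeeping can be exercised without Playwright or child processes. The sketch below reuses the message shape from the implementation above; createHostHandler and buildRequest are hypothetical names, and the stack string is made up.

```typescript
type CheckFirstRunMessage = {
  method: "__check_first_run__";
  params: { id: number; stack: string; isFirstRun?: boolean };
};

/** Host side: remembers which call sites were already seen. */
function createHostHandler() {
  const invocations = new Set<string>();
  return (request: CheckFirstRunMessage): CheckFirstRunMessage => {
    const { id, stack } = request.params;
    const isFirstRun = !invocations.has(stack);
    if (isFirstRun) invocations.add(stack);
    // The reply echoes the id so the worker can match it to its pending callback.
    return { method: "__check_first_run__", params: { id, stack, isFirstRun } };
  };
}

/** Worker side: builds the request a worker would pass to process.send(). */
function buildRequest(id: number, stack: string): CheckFirstRunMessage {
  return { method: "__check_first_run__", params: { id, stack } };
}

const hostHandler = createHostHandler();
const callSite = "at tests/catalog.spec.ts:5:3"; // made-up stack string

// Two workers reach the same line; each counts its own message ids from 1.
const replyToWorkerA = hostHandler(buildRequest(1, callSite));
const replyToWorkerB = hostHandler(buildRequest(1, callSite));
console.log(replyToWorkerA.params.isFirstRun, replyToWorkerB.params.isFirstRun); // true false
```

Message ids only need to be unique within one worker, because the host answers on the channel of the worker that asked.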

DeepakSahu-Engineer commented 1 week ago

I also faced the same problem, and this needs to be fixed.

Problem Statement: In Playwright, the beforeAll hook is expected to run only once before all tests in a test suite, providing a way to perform setup tasks that need to happen once for the entire suite, regardless of the number of workers. However, when using multiple workers (e.g., workers: 5), the beforeAll hook is executed once per worker, which contradicts its intended purpose and creates several issues in scenarios where shared setup is crucial.

This behavior poses significant limitations for certain use cases:

Redundant API Calls: If the beforeAll setup involves making API calls (e.g., to create test data), running this logic multiple times across workers can result in redundant or conflicting calls.

Unique Test Data Creation: If the setup creates unique data (e.g., generating resource IDs or categories), running it across workers leads to multiple instances of the same data being created, potentially causing failures or unnecessary complexity.

Wasted Resources: Unnecessary API calls and resource creation increase the chances of race conditions, performance issues, or failures due to repeated actions that were meant to happen only once.

Current Behavior: In Playwright:

beforeAll: Runs once per worker when multiple workers are used for parallel execution. This results in redundant executions of the setup logic.

beforeEach: Runs before each individual test across all workers, as expected, ensuring that every test starts with its required preconditions.

Why This Behavior is Problematic:

Breaks the Single Setup Assumption: Developers expect beforeAll to be called only once, no matter how many workers are running. The current behavior breaks this assumption.

Test Data Collision: In cases where a setup function generates unique test data (e.g., creating a category name via API), multiple executions of beforeAll can result in conflicts or API failures if the same data cannot be created multiple times.

Inefficient Resource Utilization: Repeated calls to external services (e.g., databases, APIs) unnecessarily waste system resources, leading to performance degradation and increasing the chances of hitting rate limits or encountering failures.

Example Scenario:

test.describe('Link record page Scenarios with forms submission:', async () => {

test.beforeAll("Setup test data", async () => {
    // Calls API to create test data
    const testDataDesign = await TestDataDesign.createTestDataDesignInstance();
    await testDataDesign.prerequisiteAPICall(testDataDesignModel);

    // Generates unique test data for the entire suite
    console.log("Test data created: " + testDataDesignModel);
});

test('Test 1', async () => {
    // Use the test data created in beforeAll
});

test('Test 2', async () => {
    // Use the same test data created in beforeAll
});

});

In the example above:

If 5 workers are used, the beforeAll hook is executed 5 times, resulting in 5 different sets of test data being created, although only one set is needed for all tests. This leads to:

API failures if the API doesn't allow duplicate data.
Different test data across tests, which defeats the purpose of having a shared setup.

Proposed Solution: Modify the behavior of beforeAll to ensure it runs only once, regardless of the number of workers.

beforeAll should run once per test suite, whether the test suite is executed with one or multiple workers.

This can be achieved by creating a synchronization mechanism within Playwright's test runner to ensure that when multiple workers are used, only the first worker runs the beforeAll logic, while the rest of the workers wait for the setup to complete.

Key Advantages of the Proposed Solution:

Consistent Setup Across Workers: Ensures that all workers use the same setup data, preventing inconsistencies or conflicts.

Efficient Resource Usage: Avoids redundant API calls or resource creation, leading to more efficient test execution, especially in environments where external systems or APIs are used for test data generation.

Simplified Test Design: Developers can confidently use beforeAll for test suite-wide setup without worrying about parallel worker behavior, making it easier to manage shared data.

Counterargument and Resolution:

Some tests might need separate setup per worker: In cases where distinct setups per worker are needed, the beforeEach hook can be utilized for isolated setups, or Playwright could introduce an alternative to beforeAll, like beforeAllPerWorker.

Conclusion: To enhance flexibility and consistency in Playwright, it is critical that the beforeAll hook behaves as expected and runs only once for the entire test suite, regardless of the number of workers. This will align with common test automation principles and prevent unnecessary issues with redundant API calls, test data collisions, and wasted resources.

siurbele420 commented 1 day ago

@vitalets Do you have a version that is ES6 compatible? Tried using this on node.js typescript and it does not run and gives error. Also would using this solution share the variable data (that was created during beforeAll) across workers or should I save it in some file to be able to read it with each worker?

vitalets commented 1 day ago

> @vitalets Do you have a version that is ES6 compatible? Tried using this on node.js typescript and it does not run and gives error.

Could you clarify what you mean by ES6 compatible? Maybe ESM compatible? The error message would be very helpful.

> Also would using this solution share the variable data (that was created during beforeAll) across workers or should I save it in some file to be able to read it with each worker?

This is an interesting point. Currently, data is not shared. What is your use-case? I'm thinking about the following: create a user in beforeAll hook (once) and share created userId across all workers for testing. If I use the current approach, variables will be populated only once in some worker, that runs first:

let testUserId = '';

test.beforeAll(async () => {
  if (!await isFirstRun()) return;
  testUserId = ... // heavy operation to create user and return userId
});

test('check auth', async () => { 
  // ... actions using testUserId 
});

To support sharing of data, we can have something like cross-worker memoization. So hypothetical code is:

let testUserId = '';

test.beforeAll(async () => {
  testUserId = await crossWorkerMemo(async () => {
     ... // heavy operation to create user and return userId, runs only once
  });
});

test('check auth', async () => { 
  // ... actions using testUserId 
});

I will think about it, technically it is possible. The only requirement - shared data should be serializable, to be able to pass it between workers. I suppose it would usually be some json structure, so not a strict limitation.
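A rough sketch of what the host side of such memoization could look like, reduced to an in-process store. This is an assumption about one possible design, not an existing Playwright feature; createMemoStore and the key "test-user" are hypothetical. Results are kept as JSON text to mirror the requirement that shared data be serializable.

```typescript
/**
 * Hypothetical host-side store for a crossWorkerMemo-style helper.
 * The first request for a key runs `compute`; later requests reuse its result.
 */
function createMemoStore() {
  const cache = new Map<string, Promise<string>>();
  return function memo<T>(key: string, compute: () => Promise<T>): Promise<T> {
    let pending = cache.get(key);
    if (pending === undefined) {
      // Keep the JSON text, since only serializable data could cross process boundaries.
      pending = compute().then((value) => JSON.stringify(value));
      cache.set(key, pending);
    }
    // Every caller gets its own parsed copy once the value is ready.
    return pending.then((json) => JSON.parse(json) as T);
  };
}

const memo = createMemoStore();
let created = 0;
const createUser = async () => {
  created += 1; // stands in for the heavy operation
  return { userId: `user-${created}` };
};

// Two "workers" ask for the same key at the same time.
Promise.all([memo("test-user", createUser), memo("test-user", createUser)]).then(
  ([a, b]) => console.log(a.userId, b.userId, created), // user-1 user-1 1
);
```

Because the pending promise is cached rather than the finished value, a second caller that arrives while the first computation is still running waits for it instead of starting another one. A failed computation stays cached in this sketch; a real version would probably evict it.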

siurbele420 commented 1 day ago

@vitalets Oh sorry, I meant ESM. Taking your solution without editing, I get this error:

ReferenceError: require is not defined in ES module scope, you can use import instead
    at getWorkerHostPath

I tried adapting it by replacing 'require' in hookWorkerProcess and getWorkerHostPath but had no luck (skill issue probably :D):

/**
 * A module to run some code in worker once, usually in beforeAll / afterAll hooks.
 * Implementation:
 * 1. hook into Playwright worker creation to intercept messages from worker
 * 2. on every invocation of isFirstRun() workers sends message to host with stack trace
 * 3. host checks if it's the first run of the particular line in the code and sends back the result
 */
import fs from 'node:fs'
import path from 'node:path'

const invocations = new Set<string>()
const callbacks = new Map<number, (isFirstRun: boolean) => void>()

let lastMessageId = 0

if (process.env.TEST_WORKER_INDEX) {
  registerWorker()
} else {
  registerHost()
}

type CheckFirstRunMessage = {
  method: '__check_first_run__'
  params: {
    id: number
    stack: string
    isFirstRun?: boolean
  }
}

/**
 * Returns true on the first invocation of the particular line in the worker code.
 */
export async function isFirstRun() {
  // Get invocation place by stack trace
  const stack = new Error().stack || ''

  return new Promise<boolean>((resolve) => {
    const id = ++lastMessageId

    callbacks.set(id, resolve)
    process.send!(buildMessage(id, stack))
  })
}

function registerHost() {
  hookWorkerProcess((workerProcess) => {
    workerProcess.on('message', (message: unknown) => {
      if (isCheckFirstRunMessage(message)) {
        const { id, stack } = message.params
        const isFirstRun = !invocations.has(stack)

        if (isFirstRun) invocations.add(stack)

        workerProcess.send!(buildMessage(id, stack, isFirstRun))
      }
    })
  })
}

function registerWorker() {
  process.on('message', (message: unknown) => {
    if (isCheckFirstRunMessage(message)) {
      const { id, isFirstRun } = message.params

      callbacks.get(id)?.(Boolean(isFirstRun))
      callbacks.delete(id)
    }
  })
}

// till pw 1.37 node_modules/@playwright/test/lib/runner/workerHost.js
// since pw 1.38 node_modules/playwright/lib/runner/workerHost.js
export function getWorkerHostPath() {
  let pwPath = import.meta.resolve('@playwright/test')
  const workerHostPath = `${path.dirname(pwPath)}/lib/runner/workerHost.js`

  if (fs.existsSync(workerHostPath)) return workerHostPath

  pwPath = import.meta.resolve('playwright')

  return `${path.dirname(pwPath)}/lib/runner/workerHost.js`
}

function hookWorkerProcess(fn: (workerProcess: NodeJS.Process) => void) {
  const WorkerHost = getWorkerHostPath()
  const origStart = WorkerHost.prototype.start

  // eslint-disable-next-line @typescript-eslint/no-explicit-any
  WorkerHost.prototype.start = async function (...args: any[]) {
    const result = await origStart.call(this, ...args)

    fn(this.process)

    return result
  }
}

function buildMessage(id: number, stack: string, isFirstRun?: boolean): CheckFirstRunMessage {
  return {
    method: '__check_first_run__',
    params: { id, stack, isFirstRun },
  }
}

function isCheckFirstRunMessage(message: unknown): message is CheckFirstRunMessage {
  return (message as CheckFirstRunMessage)?.method === '__check_first_run__'
}

Then I get an error stating that start is undefined:

TypeError: Cannot read properties of undefined (reading 'start')
    at hookWorkerProcess (src/e2e/test-utils/isFirstRun.ts:87:42)
    at registerHost (src/e2e/test-utils/isFirstRun.ts:47:3)
    at src/e2e/test-utils/isFirstRun.ts:19:3
    at ModuleJob.run (node:internal/modules/esm/module_job:262:25)
    at onImport.tracePromise.__proto__ (node:internal/modules/esm/loader:485:26)
    at requireOrImport (src/node_modules/.pnpm/playwright@1.48.1/node_modules/playwright/lib/transform/transform.js:230:24)
    at loadUserConfig (src/node_modules/.pnpm/playwright@1.48.1/node_modules/playwright/lib/common/configLoader.js:94:46)
    at loadConfig (src/node_modules/.pnpm/playwright@1.48.1/node_modules/playwright/lib/common/configLoader.js:105:22)
    at loadConfigFromFileRestartIfNeeded (src/node_modules/.pnpm/playwright@1.48.1/node_modules/playwright/lib/common/configLoader.js:273:10)
    at runTests (src/node_modules/.pnpm/playwright@1.48.1/node_modules/playwright/lib/program.js:199:18)
    at t.<anonymous> (src/node_modules/.pnpm/playwright@1.48.1/node_modules/playwright/lib/program.js:54:7)
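The TypeError appears to come from the adapted hookWorkerProcess: getWorkerHostPath() returns a path string, and that string is assigned to WorkerHost, so WorkerHost.prototype is undefined. In addition, import.meta.resolve returns a file:// URL string, which fs.existsSync does not treat as a file path. One way to keep the original require-based code in an ES module is createRequire from node:module. The sketch below shows the two building blocks; toFsPath and makeRequire are hypothetical helper names, and a stand-in URL replaces import.meta.url so the snippet also runs outside an ES module.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";
import { createRequire } from "node:module";
import { fileURLToPath, pathToFileURL } from "node:url";

/** Converts a file:// URL string to a filesystem path; plain paths pass through. */
function toFsPath(resolved: string): string {
  return resolved.startsWith("file:") ? fileURLToPath(resolved) : resolved;
}

/** Builds a CommonJS-style require; in an ES module, pass import.meta.url. */
function makeRequire(moduleUrl: string) {
  return createRequire(moduleUrl);
}

// Stand-in for import.meta.url (assumption: the module lives in the current directory).
const hereUrl = pathToFileURL(path.resolve("isFirstRun.ts")).href;
const nodeRequire = makeRequire(hereUrl);

// fs functions expect a path, so convert the URL string before calling them.
console.log(fs.existsSync(toFsPath(hereUrl)));

// nodeRequire(...) returns the module's exports, unlike a bare path string.
const loadedPath = nodeRequire("node:path") as typeof path;
console.log(loadedPath.sep === path.sep);
```

In isFirstRun.ts this would mean creating the function once with createRequire(import.meta.url) and then using it for both require.resolve(...) and require(getWorkerHostPath()), as in the CommonJS version. The workaround still depends on Playwright's internal workerHost.js, which is not a public API and may move between versions.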

About the use case of sharing data across workers: I normally create a unique worker for a specific test with some things already set up. It also carries various other user data as a class instance, but that could easily become JSON if you come up with a solution. Otherwise I would need to save a JSON file and read it in each worker.
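The JSON-file fallback mentioned here could be sketched as follows. The helper names writeSharedData and readSharedData and the file layout are assumptions for illustration; the data has to be plain JSON, so a class instance would need to be converted first.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

/** Writes JSON-serializable data to `file` via a temp file and a rename. */
function writeSharedData(file: string, data: unknown): void {
  const tmp = `${file}.${process.pid}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(data));
  // Renaming within one directory swaps the file in as a whole,
  // so a reader is unlikely to see a half-written file.
  fs.renameSync(tmp, file);
}

/** Reads the data back, or returns undefined if nothing has been written yet. */
function readSharedData<T>(file: string): T | undefined {
  if (!fs.existsSync(file)) return undefined;
  return JSON.parse(fs.readFileSync(file, "utf8")) as T;
}

// Demo in a throwaway temp directory.
const sharedDir = fs.mkdtempSync(path.join(os.tmpdir(), "shared-data-"));
const sharedFile = path.join(sharedDir, "user.json");
writeSharedData(sharedFile, { userId: "user-1" });
console.log(readSharedData<{ userId: string }>(sharedFile)?.userId); // user-1
fs.rmSync(sharedDir, { recursive: true, force: true });
```

Combined with isFirstRun(), the worker that gets true would write the file and the others would read it. Since the other workers return from the hook immediately, they would need to poll until the file exists before using the data.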