[Feature] Split shards via test timing data

ffluk3 commented 1 year ago

Playwright has some great default behavior around sharding tests across multiple workers. It would be super helpful if sharding could accept a --timing-data file, similar to CircleCI's test-splitting logic, so that Playwright can split tests internally along timing boundaries. This would let a run complete as fast as possible by distributing tests more evenly across multiple machines or workers.

michaelhays commented 1 year ago

This is the one thing that has made me hesitate to switch from Cypress to Playwright despite the many advantages.

Cypress Cloud solves this with their Smart Orchestration (specifically, the "Load Balancing" strategy).

kp-abhishek-agrawal commented 1 year ago

This is one of the major limitations of Playwright.

PsiKai commented 9 months ago

I would love to see this implemented in Playwright. It is a fairly common feature among other test platforms and tooling.

ofirpardo-artlist commented 5 months ago

I'd also love to see this feature. I thought of creating a PR with the following additions:

Add a '--timing-file' flag, e.g. --timing-file="report.json"
The file contains the duration and ID of each test
Based on the durations, we build groups whose total durations are as close to each other as possible, then assign tests to those groups by test ID.

So far I have tried extracting this data from the JSON reporter's output, which seems to work, but it's a bit messy because of the arbitrarily deep nesting of suites that you need to walk. I'm not sure whether to generate a separate, smaller timing file from the JSON report, or to use the JSON report file directly and extract the data in the Playwright runner. I also thought of creating a completely new reporter that only includes the test name, test ID, and duration.
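As an illustration of the suite-walking problem described above, here is a minimal sketch of flattening a nested JSON report into a list of timings. The interfaces are an assumed, simplified subset of the JSON reporter's output, and `collectTimings` is a hypothetical helper, not code from the PR. As a simplification, it takes the last attempt of each test and sums those durations per spec.

```typescript
// Assumed minimal subset of the JSON reporter's output shape.
interface ReportResult { duration: number }
interface ReportTest { results: ReportResult[] }
interface ReportSpec { id: string; title: string; tests: ReportTest[] }
interface ReportSuite { title: string; specs?: ReportSpec[]; suites?: ReportSuite[] }
interface Report { suites: ReportSuite[] }

interface TimingEntry { id: string; title: string; duration: number }

// Hypothetical helper: walk suites recursively and emit one entry per spec.
function collectTimings(report: Report): TimingEntry[] {
  const entries: TimingEntry[] = [];

  const visit = (suite: ReportSuite): void => {
    for (const spec of suite.specs ?? []) {
      let total = 0;
      for (const test of spec.tests) {
        // Use the last attempt only, so retries are not counted twice.
        const last = test.results[test.results.length - 1];
        if (last) total += last.duration;
      }
      entries.push({ id: spec.id, title: spec.title, duration: total });
    }
    // Suites can nest to any depth (files, describe blocks, ...).
    for (const child of suite.suites ?? []) visit(child);
  };

  for (const suite of report.suites) visit(suite);
  return entries;
}
```

The recursion is what hides the nesting: the caller only sees a flat array that can be written out as a small timing file.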

Based on my testing so far, we are able to create a pretty good balance for any number of shards. I believe more details are required, but please let me know whether this is something that could be implemented; otherwise I'd just create a custom external solution instead.

Thanks.

ofirpardo-artlist commented 5 months ago

A small scale example, current implementation:

New balance based on json report:

In most cases since the shards are running in parallel, what matters is when the last shard finishes running. Having 1 shard at 35 seconds and 1 shard at 1.5 minutes is as good as having 2 shards running for 1.5 minutes.

How the new shard balancing works:

You point to a JSON report file
From the report we extract the duration of each test, keyed by test ID
Based on the total number of shards, we map the tests by duration
Test groups are created based on the new mapping
If there are new tests that are not recorded in the report, they are split evenly using the current implementation

This currently works for any number of shards.
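The thread does not spell out the exact balancing algorithm, so the following is only a sketch of one common approach: greedy "longest first" assignment, where each test goes to the shard with the smallest running total. The `balanceShards` function and its types are hypothetical names for illustration, and handling of tests without timing data is left out.

```typescript
interface TimedTest { id: string; duration: number } // duration in seconds
interface Shard { tests: string[]; total: number }

// Hypothetical helper: greedy longest-first assignment of tests to shards.
function balanceShards(tests: TimedTest[], shardCount: number): Shard[] {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError("shardCount must be a positive integer");
  }
  const shards: Shard[] = Array.from({ length: shardCount }, () => ({ tests: [], total: 0 }));

  // Place the slowest tests first so the short ones can fill the gaps.
  const sorted = [...tests].sort((a, b) => b.duration - a.duration);
  for (const test of sorted) {
    // Pick the shard that is currently the least loaded.
    const target = shards.reduce((min, s) => (s.total < min.total ? s : min));
    target.tests.push(test.id);
    target.total += test.duration;
  }
  return shards;
}

const example = balanceShards(
  [
    { id: "a", duration: 90 },
    { id: "b", duration: 35 },
    { id: "c", duration: 30 },
    { id: "d", duration: 25 },
  ],
  2,
);
// Shard 1 gets "a" (90 s); shard 2 gets "b", "c", "d" (35 + 30 + 25 = 90 s).
console.log(example.map((s) => s.total)); // [ 90, 90 ]
```

Since the shards run in parallel, the wall-clock time of the run is the largest `total`, which is the number this kind of assignment tries to keep small. The greedy approach does not always find the optimal split, but it is simple and works for any shard count.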

agoldis commented 3 months ago

We have implemented something similar for Currents https://docs.currents.dev/guides/pw-parallelization/playwright-orchestration

sfrique commented 1 month ago

@ofirpardo-artlist how did you end up doing this? We would greatly benefit from this.

ofirpardo-artlist commented 2 weeks ago

@sfrique Since Playwright closed my PR without any real explanation, I use patch-package (https://www.npmjs.com/package/patch-package) to apply the changes on the Playwright side: https://pastebin.com/vTy3k0uU

I can just upload the patch file if it will be easier to read perhaps.

I've created a custom reporter: https://pastebin.com/HpFxQKmZ

This gives me a JSON file containing, for each passed test, the test ID, test name, and test duration (technically this can be changed to include failed/flaky tests, but I don't think that's a good approach).
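For readers who cannot open the pastebin link, here is a rough sketch of what such a reporter could look like. It is not the author's reporter. The `TestLike` and `ResultLike` interfaces are local stand-ins (an assumption) for the `TestCase` and `TestResult` types that Playwright passes to a reporter's `onTestEnd` hook, and writing the output to disk is omitted to keep the example self-contained.

```typescript
// Local stand-ins for the fields of Playwright's TestCase / TestResult used here.
interface TestLike { id: string; title: string }
interface ResultLike { status: string; duration: number } // duration in ms

interface TimingEntry { id: string; title: string; duration: number }

// Hypothetical reporter: records timing for passed tests only.
class TimingReporter {
  private readonly entries = new Map<string, TimingEntry>();

  // Called once per test attempt; a later passing retry overwrites earlier data.
  onTestEnd(test: TestLike, result: ResultLike): void {
    if (result.status !== "passed") return;
    this.entries.set(test.id, { id: test.id, title: test.title, duration: result.duration });
  }

  // Serialized timing data; a real reporter would write this to a file when the run ends.
  serialize(): string {
    return JSON.stringify(Array.from(this.entries.values()), null, 2);
  }
}
```

Skipping failed tests keeps aborted or timed-out runs from skewing the durations, which matches the reasoning given above for recording passed tests only.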

Then, with the patch applied, you can just point the timing file to that report, e.g. npx playwright test --timing-file=/results.json

Personally, I also merge multiple reports every few CI runs to get a better average, so the timing data keeps itself up to date: https://pastebin.com/6xe0uHbk
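The merge script itself is only available at the link above, so the following is an independent sketch of the idea: combine several timing reports and average the duration per test ID. `mergeTimings` and the `TimingEntry` shape (ID, name, duration) are assumptions based on the fields described in this comment.

```typescript
interface TimingEntry { id: string; title: string; duration: number }

// Hypothetical helper: average each test's duration across several reports.
function mergeTimings(reports: TimingEntry[][]): TimingEntry[] {
  const acc = new Map<string, { title: string; sum: number; count: number }>();

  for (const report of reports) {
    for (const entry of report) {
      const seen = acc.get(entry.id);
      if (seen) {
        seen.sum += entry.duration;
        seen.count += 1;
        seen.title = entry.title; // keep the most recent title
      } else {
        acc.set(entry.id, { title: entry.title, sum: entry.duration, count: 1 });
      }
    }
  }

  // Tests missing from some reports are averaged over the runs they appear in.
  return Array.from(acc.entries()).map(([id, v]) => ({
    id,
    title: v.title,
    duration: v.sum / v.count,
  }));
}
```

Averaging over a few runs smooths out one-off slow runs, so a single noisy CI job does not reshuffle the shards.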

And I also use multiple shards so my merge.config.ts file looks like this:

export default {
  reporter: [
    ['./custom-reporter.ts', { outputFile: 'report.json', outputPath: 'e2e/blob-report' }],
  ],
};

Feel free to ask any questions.

microsoft / playwright

[Feature] Split shards via test timing data #17969