Make running Playwright to update snapshots easier

Problem

Playwright is very heavy to run. Any time we change versions of Playwright, to run the visual regression tests locally (and update snapshots accurately), you're forced to download a rather large Playwright-specific docker image (which incidentally also tends to come from a slow host, depending on where in the world you are). This makes it difficult for some contributors to run, especially if your development computer isn't a huge multi-core powerhouse.

Description

Originally I thought about the alternative approach I've shared below (opening a new branch and PR), but that would require maintaining a few more new triggers and workflows to clean up branches and such. Instead, using git diff --binary might be a good approach for creating a shareable blob that could be applied by any contributor, and not require any special permissions or use any more resources than we currently use already.

To do that, update the existing Playwright CI jobs to run Playwright with -u so that snapshots are enabled. Then, instead of only relying on a non-zero exit code, check for a non-zero exit code and whether snapshots changed. If either happened, consider the run a failure, and trigger the existing failure job.

However, instead of uploading the test results zip like we do now, we can instead upload a gzipped git patch blob:

git add .
git diff --staged --binary | gzip -f > /tmp/snapshot_diff.patch.gz

Upload /tmp/snapshot_diff.patch.gz as an artefact of the workflow run, and link to it directly from the comment. Add instructions for how to use it:

gunzip -k -c snapshot_diff.patch.gz | git apply -

gunzip looks to be available on macOS (and is usually available on most Linux distros), but we can skip compression if we don't think it's that important.

This would improve the situation for contributors and prevent needing to run the heavy visual regression tests locally if you're unable to for any reason.

Alternatives

Let's leverage CI more heavily for running Playwright. We need to make sure we use this responsibly (to prevent exhausting our free CI resources), but we could add a new GitHub Workflow accessible to core contributors (those with commit access to the monorepo) that runs the Playwright test suite with -u enabled (so snapshots get updated), and opens a PR targeting the PR that caused changes.

We could consolidate this into our existing playwright run, actually, by changing the playwright runs to always pass -u in CI, and then instead of relying on just a non-zero exit code from the test runs, check for a non-zero exit code and see if snapshots updated. If either happened, then consider it a failed run. We can leave the comments we already leave, but if there are snapshot changes, then also open (or update) a secondary branch targeting the original PR's branch with a single commit containing the updated snapshots. The flow should go:

Run Playwright tests with `-u`

If snapshot changes:
    Create a local branch `${ PR branch }__playwright_ci_snapshots`
    Commit the snapshot changes `git add . && git commit -m 'Update snapshots'`
    Force push the branch
    If a PR does not exist yet for this new branch:
        Open the PR and save the link

If snapshot changes or non-zero exit code:
    Consider it a failure
    Leave the failed test comment
    If snapshot changes:
        include the link to the PR with the snapshot updates

An additional step needs to be added for when a PR is merged, to check for branches and PRs of the pattern origin/${ PR branch }__playwright_ci_snapshots and close those PRs & delete the branches on origin.

Additional context

Coming out of thinking about https://github.com/WordPress/openverse/issues/4514 and @madewithkode's experience with the playwright tests recently.

Hi @sarayourfriend your proposed solution makes a lot of sense and seems like an interesting problem that I'd love to tackle(assuming no one else gets to it before I'm done with https://github.com/WordPress/openverse/issues/4514).

I'm seeking a bit more clarity regarding the step to apply the generated patch after it's downloaded. From what I observed from the current structure of the artefacts uploaded after a failure(after they're unzipped), each folder of the diff has three files in the form - abc-file-expected.png, abc-file-diff.png and lastly abc-file-actual.png, the current process to resolve using these files(after confirming that diffs make sense of course) is to first of all:

rename abc-file-actual.png to abc-file-linux.png as this is what the corresponding file would be named in the codebase and then
replace existing abc-file-linux.png in your working branch withe the renamed one.

My question is, will the proposed change do away with these extra manual process and just have gunzip -k -c snapshot_diff.patch.gz | git apply - handle everything perfectly. This of course would mean that the contents of snapshot_diff.patch.gz when unzipped would exactly match the files to be replaced and require no further manual renames/processing. I hope this makes sense.

P.S:

git add .
git diff --staged --binary | gzip -f > /tmp/snapshot_diff.patch.gz

I am aware that the above commands in theory, should definitely give the expected result, I am just trying to make sure there are no other side effects or workflows that act on the generated data in between in order to get them to look like the final uploaded artefact that we currently have.

There aren't any other workflows to process snapshots when they are updated following the normal process (passing --update-snapshots or -u to Playwright). The process @dhruvkb described to you using the test results zip is a workaround/hack to actually running Playwright with -u, which tells it to update the snapshots in place.

When running playwright -u, instead of generating the test results zip, it replaces existing snapshots with the new one from the test run if they don't match. So when running git add ., it stages all these snapshots, and git diff --staged --binary creates a patch to be applied that includes the snapshots in place.

I am just trying to make sure there are no other side effects or workflows that act on the generated data in between in order to get them to look like the final uploaded artefact that we currently have.

That is correct, the changes created by playwright -u are meant to be committed directly without further modification.

Comes down to the difference between that test results zip, which is the Playwright trace file we've configured both Playwright test suites to generate when tests fail, and the process of updating snapshots.

That being said, we should upload both the trace and the new gzipped patch file. The trace is still useful for stepping through true unexpected failures, especially if something only fails in CI (rare, but it's happened before).

An outstanding question to me is whether Playwright generates traces for tests where the snapshot updated. It won't consider those test runs failures because -u basically tells it to transform snapshot failures as updates. That's why the exit code detail matters: -u would turn failing snapshots into updated snaphots and a non-zero exit code to 0 if the only "failures" were snapshots. But does that mean the test results zip won't include the trace for those tests? I'll test this out locally next week, it should be easy enough to verify either way.

Oh, I also just learned about trace.playwright.dev (https://playwright.dev/docs/trace-viewer#viewing-remote-traces)! It's pretty cool! It would be nifty to update the Playwright bot comment to create a link to that with the uploaded artefact if that's possible :open_mouth:. Separate issue from this specific one, but it would help make understanding Playwright failures a tiny bit more accessible. The trace viewer still requires a bit of learning to use effectively in my experience, but just opening it in our development environment is a non-trivial endeavour.

Thank you for the clarifications @sarayourfriend. And yeah, checked out trace.playwright.dev and I co-agree it is cool! I am going to go ahead and try my hands on this.

WordPress / openverse