americanexpress / jest-image-snapshot

✨ Jest matcher for image comparisons. Most commonly used for visual regression testing.
Apache License 2.0

Advice for handling visual discrepancies between Intel and Apple Silicon images #302

Open ReDrUm opened 2 years ago

ReDrUm commented 2 years ago

Does anyone have advice on how best to handle the visual discrepancies between an image produced on an Intel machine vs. one produced on an Apple Silicon machine for visual regression testing?

I'm encountering differences in how drop shadows are rendered, enough to trip up jest-image-snapshot unless I set the pixel tolerance excessively high.

My setup runs Chromium via Puppeteer on a multi-arch Docker image running Debian 12. It seems like shadow rendering differences are common.
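
For reference, the tests are shaped roughly like this (a minimal sketch; the URL, viewport, and story id below are placeholders rather than my actual setup):

const puppeteer = require('puppeteer');
const { toMatchImageSnapshot } = require('jest-image-snapshot');

expect.extend({ toMatchImageSnapshot });

it('renders the component with its drop shadow', async () => {
  // launch headless Chromium and capture the page under test
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.setViewport({ width: 1280, height: 720 });
  await page.goto('http://localhost:6006/iframe.html?id=card--default'); // placeholder URL
  const screenshot = await page.screenshot();
  await browser.close();

  // this is the comparison that trips on the shadow rendering differences
  expect(screenshot).toMatchImageSnapshot();
});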

Has anyone tackled this problem successfully yet?

unional commented 2 years ago

There are also discrepancies between macOS and Ubuntu.

The font weight rendered on Ubuntu is lighter compared to macOS. This causes the tests to fail, as I use macOS locally while the CI runs on Ubuntu.

unional commented 2 years ago

Here is a comparison. Above is from macOS, below is from Ubuntu:

[image: side-by-side font rendering comparison, macOS (top) vs Ubuntu (bottom)]

reslys commented 1 year ago

Hi, any updates on this issue? I'm also encountering discrepancies between Intel and Apple Silicon images and still have no idea how to handle them.

unional commented 1 year ago

For me, what I ended up doing was to create a different set of snapshots per OS/platform.

dongzoolee commented 1 year ago

I've just ended up setting a failureThreshold option. 😢

expect(screenshot).toMatchImageSnapshot({
  failureThreshold: 0.005,
  failureThresholdType: 'percent'
});

reference: The LogRocket Blog Article

joriswitteman commented 1 year ago

I run the tests in an Ubuntu container using Docker for deterministic results.

Dockerfile:

FROM mcr.microsoft.com/playwright:v1.35.1-jammy

# Set the work directory for the script that follows
WORKDIR /test

# Copy visual-testing package.json
COPY package.json ./

# Install dependencies
RUN yarn

# Copy current source directory
COPY . .

My yarn scripts:

  "scripts": {
    "test": "yarn stop && docker run --name visual-testing --network host --add-host=host.docker.internal:host-gateway -v ${PWD}/baseline-snapshots:/test/baseline-snapshots -v ${PWD}/failure-diffs:/test/failure-diffs --rm visual-testing yarn container:wait-then-test",
    "container:run-test": "yarn test-storybook --stories-json --ci --url http://host.docker.internal:6006",
    "container:wait-then-test": "yarn container:wait-for-storybook && yarn container:run-test",
    "container:wait-for-storybook": "yarn wait-on -i 5000 -t 600000 http://host.docker.internal:6006"
  },

programmer24601 commented 1 year ago

We've opted for this pragmatic approach:

if (process.platform === 'darwin') {
  // local development machines are macOS: allow a tiny tolerance for platform rendering differences
  expect(pngBuffer).toMatchImageSnapshot({
    failureThreshold: 0.00009,
    failureThresholdType: 'percent'
  });
} else {
  // CI is not darwin, so it still enforces an exact match
  expect(pngBuffer).toMatchImageSnapshot();
}

Not ideal, but since CI isn't darwin nothing should slip through undetected.

unional commented 1 year ago

https://github.com/justland/just-web-react/actions/runs/5792587155/job/15699083890

Expected image to match or be a close match to snapshot but was 0.06825086805555555% different from snapshot (629 differing pixels).

I have a case where the snapshot generated locally on Ubuntu in WSL doesn't match the one generated in CI.

process.platform is linux in both cases.... 🤷

ronilitman commented 1 year ago

For me, what I ended up doing was to create a different set of snapshots per OS/platform.

How did you do that?

unional commented 1 year ago

How did you do that?

I do this:

customSnapshotsDir: `${process.cwd()}/__snapshots__/${process.platform}`
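
Wired into a Jest setup file via configureToMatchImageSnapshot, that looks roughly like the sketch below (the diff directory is optional and just an example):

// jest.setup.js (sketch): register a matcher whose snapshot dir is keyed by platform
const { configureToMatchImageSnapshot } = require('jest-image-snapshot');

const toMatchImageSnapshot = configureToMatchImageSnapshot({
  customSnapshotsDir: `${process.cwd()}/__snapshots__/${process.platform}`,
  // optional: keep diff output per platform as well
  customDiffDir: `${process.cwd()}/__image_snapshot_diffs__/${process.platform}`,
});

expect.extend({ toMatchImageSnapshot });

Each platform then writes and reads its own baseline images, so macOS and Ubuntu runs never compare against each other's snapshots.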

OnyxWest commented 9 months ago

The rendering discrepancy causes text layout to shift and wrap depending on whether I run the tests locally (Apple Silicon) or in an Ubuntu CI pipeline, to the point that the difference between images is greater than 30%. This makes setting a custom failure threshold unhelpful, since it would no longer catch smaller changes, which IMO is the whole point of image snapshot testing.

I'm now attempting to run the tests locally in a Docker container, but the issue I run into is that it maxes out the CPU and causes tests to start timing out (even with a generous test timeout).