aslushnikov opened this issue 3 years ago
I think https://github.com/americanexpress/jest-image-snapshot provides a nice suite of options for various VRT scenarios. Test scenarios vary widely, depending on the context (testing components, whole pages, text-heavy or not, etc).
Besides blurring, which helps a lot with antialiasing, it would be nice if multiple image comparison algorithms (e.g. SSM) were possible. Alternative image comparison algorithms could be left to userland if they can be plugged into toMatchSnapshot via a common interface.
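To make the "common interface" idea concrete, here is a purely hypothetical sketch of what a pluggable comparator could look like (this is not an existing Playwright API; all names are illustrative):

```ts
// Hypothetical: a userland comparator that a test runner could call
// instead of its built-in pixel-by-pixel comparison.
export interface ImageComparator {
  compare(
    expected: Buffer, // PNG bytes of the stored snapshot
    actual: Buffer,   // PNG bytes of the freshly taken screenshot
    options?: { threshold?: number },
  ): { pass: boolean; diffImage?: Buffer; message?: string };
}
```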
Besides blurring, which helps a lot with antialiasing, it would be nice if multiple image comparison algorithms (e.g. SSM) were possible.
@florianbepunkt What's SSM? Is it structural similarity measurement (SSIM)?
@aslushnikov Yes, typo.
Also a UI for reviewing the diffs would be great.
@kevinmpowell What's the one that you find most handy? Is it a "slider" diff like here:
I actually prefer the pixel highlighting (like Playwright already does), but organize all the failing tests in a UI so I can see what failed without having to poke around three different images.
Also being able to A/B toggle the baseline and the test image is nice in some cases.
Slider is rarely useful for me. An onion-skin (transparency overlay) would be more useful.
@aslushnikov Why is toMatchSnapshot() not available in the documentation?
It cannot be found in the API list.
And the article that was in 1.13, https://playwright.dev/python/docs/1.13.0/test-snapshots, is no longer available for 1.14.
Thanks for thinking about Visual Regression testing. That's important!
On a related note: it would be great if tests could be run cross-platform. Currently the OS platform name is baked into the snapshot filename, so our CI tests sometimes fail due to a name mismatch. https://github.com/microsoft/playwright/issues/7575
support for blur in matching snapshot to counteract antialiasing
It would be nice if we could choose whether such image filters are applied before the snapshot is saved or only when doing the comparison. I would prefer the first option, as it keeps the diff small when creating new snapshots, even for images that change randomly / are flaky.
Please allow an auto-generated filename when toMatchSnapshot has no name input, similar to how toMatchSnapshot works in Jest.
E.g.
// foo.spec.ts
toMatchSnapshot() // => foo.spec.ts.snap (default extension customizable in playwright.config.ts)
When you have a lot of screenshot assertions in one file, this avoids writing a lot of filename inputs.
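A sketch of the requested behavior (hypothetical — today toMatchSnapshot needs an explicit name, and the auto-derived filenames in the comments are illustrative):

```ts
// foo.spec.ts — hypothetical auto-naming, mirroring Jest's toMatchSnapshot behavior
import { test, expect } from '@playwright/test';

test('renders the dashboard', async ({ page }) => {
  await page.goto('https://example.com/dashboard'); // illustrative URL
  expect(await page.screenshot()).toMatchSnapshot(); // => e.g. foo-spec-ts-renders-the-dashboard-1.png
  expect(await page.screenshot()).toMatchSnapshot(); // => e.g. foo-spec-ts-renders-the-dashboard-2.png
});
```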
Thanks for thinking about this. A blur feature is something that would help us; we had something similar before with Puppeteer that helped us do comparisons on animated pages. In addition, something that can be really useful is being able to ignore specific parts of the screen, especially those parts where we have more dynamic data (videos/images).
Blur would help us greatly. Also, the slider view would be incredible as well.
We're also really interested in these improvements. We had to disable visual tests for now because they are randomly failing because a few pixels are off, even when increasing the threshold. Blur should help here hopefully.
I suggest solving the biggest pain point, which is how to store this stuff in a git repo so it doesn't blow up in size (i.e. store only the last snapshot). Git LFS kinda works, but it's painful. Maybe something else would work better? For reference: https://github.com/americanexpress/jest-image-snapshot/issues/92
It would be great if these snapshot dirs were automatically marked in git to only store the last revision.
We're using Git LFS, what's your issue with it? Once we had it set up for everyone (we're using Mac, Windows and Linux), it worked fine. We're storing all images in the repo using Git LFS (*.png), so there's no work involved when adding snapshots to new tests either.
The only issue I have is comparing the image diffs in VS Code when committing new images, as the old image is not shown in the diff view. The diff works fine in the GitLab merge request view though, so that's not a big issue.
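For reference, the Git LFS setup described above boils down to a single .gitattributes rule (this matches the *.png tracking mentioned in the comment; narrow the pattern to your snapshot directories if you prefer):

```
# Store all PNGs (including Playwright snapshots) via Git LFS so the repo stays small
*.png filter=lfs diff=lfs merge=lfs -text
```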
Hi @aslushnikov! This has been pushed to the next version a few times now; could you please add this to the roadmap (if there is one) so we can have a rough estimate of when it is coming?
I need to implement some visual tests soon™️ and it would be great if I didn't need another tool for that. I need to know whether there will be improvements here in 2 months or 2 years though.
Hey @z0n, there's no roadmap. My guesstimate is that we'll have all the pieces together by summer 2022; the priority of VRT keeps rising.
We're using Git LFS, what's your issue with it?
It works for me, but I wanted to use it at one company that had poor infra and it didn't work well with Jenkins, so I couldn't easily bypass it.
Also, Git LFS behaved weirdly with rebases, and people had a lot of trouble with it when jumping between branches, if I remember correctly.
It works, but the experience is suboptimal.
Hey folks! Here's an update on screenshots and blurring.
I see lots of you requested a "blur" option to pre-blur images before comparison. While I imagine it can help with certain issues, it's a very big hammer, so I wonder if we can do a more delicate job.
Many folks mentioned that they want pre-blur to avoid snapshot failures due to a few pixel differences.
New options have landed on tip-of-tree: pixelCount and pixelRatio. These are supposed to help in these cases. Please give them a try and let me know if you still need pre-blur!
$ npm i @playwright/test@next
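A minimal sketch of how the new options are meant to be used, based on the option names announced above (they may be renamed in later releases; the URL and values are illustrative):

```ts
import { test, expect } from '@playwright/test';

test('landing page is visually stable', async ({ page }) => {
  await page.goto('https://example.com');
  expect(await page.screenshot()).toMatchSnapshot('landing.png', {
    pixelCount: 50,    // tolerate up to 50 differing pixels...
    pixelRatio: 0.001, // ...or up to 0.1% of all pixels, whichever threshold fits your case
  });
});
```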
Thank you for improving visual regression features, @aslushnikov!
You may find the implementation experience of gemini-testing project useful. Some pointers:
Several years ago we had great success using Gemini for visual regression testing. We used Gemini's built-in web UI (either Gemini GUI or html-reporter, I don't remember which) to choose which changed images were worth committing to Git. And during PR review we used GitHub's built-in image diff. We had very few false positives in image diffing. Unfortunately, the false-positive rate was not zero, mostly due to subtle browser timing/random fluctuations.
Gemini is deprecated now, replaced by Hermione from the same authors. I haven't used it, but it seems to use the same approach for image diffing. The core is in the looks-same and gemini-core libraries.
Thanks @shamrin for the pointers! I'll read your links in more detail later to get a better understanding, but so far we already do all of these:
- pixelmatch uses color difference in the YIQ color space
- pixelmatch uses the same algorithm, based on the same whitepaper, to ignore anti-aliasing

Hey! @aslushnikov I updated @florianbepunkt's original port of jest-image-snapshot to the Playwright test runner here: https://github.com/ayroblu/playwright-image-snapshot. Basically it looks very similar to Playwright's existing golden.ts compare API, as you can see in matcher.ts.
The main benefit is that it uses SSIM. I also updated how the diff is done, so it's similar to pixelmatch's greyscale background, which is super useful.
expect(await page.screenshot()).toMatchImageSnapshot(test.info(), [
name,
"1-initial-load.png",
]);
Would love to have this SSIM option ported to Playwright Test, as TestInfo is not exposed implicitly, which makes the API usage a bit ugly. Made a PR #12258.
I'm also hoping not to need to supply a file name by default; it seems unnecessary.
For the record: docker integration depends on global fixtures, so moving them forward.
Hi, @aslushnikov! Is it possible that in one of the next releases you will implement a "slider" diff in the HTML report? There are cases where the slider is more convenient than the pixel-highlighting method, especially when the lengths of the expected and actual screenshots differ.
Would it be possible to implement one more tab in the report, by analogy with Diff/Actual/Expected? Or you could display all three states on one tab in the report (as shown in the attachments of this comment).
@bezyakina not sure for 1.21 (we're about to finalize this version), but still possible! It all depends on how much our users need it.
So could you please file this separately in our bug tracker as a feature request? The more likes/upvotes it collects, the higher its priority will be for us, and the faster we'll implement it!
@bezyakina not sure for 1.21 (we're about to finalize this version), but still possible! It all depends on how much our users need it.
So could you please file this separately in our bug tracker as a feature request? The more likes/upvotes it collects, the higher its priority will be for us, and the faster we'll implement it!
Thanks for your reply, I created a new feature request: https://github.com/microsoft/playwright/issues/13176
Hey there! Not sure if it would be better to open another feature request, but https://github.com/jz-jess/RobotEyes has an interesting feature: it can ignore an array of UI elements in the image comparison by blurring those elements, which helps achieve a higher-fidelity comparison (+95%). RobotEyes uses ImageMagick in the background, which is a really powerful tool for image comparison. The idea is to ignore data-bearing elements on the screen before the comparison is done. Without that, you would have to set a different tolerance for each web page in the application, since each one can have a different number of UI elements with data. I've seen comments about blur, but they don't seem to be related to this... Thank you.
@AllanMedeiros you can use the mask API to mask elements on the screenshot. This should help!
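A sketch of that approach (the selectors are illustrative), masking dynamic regions before the comparison so they no longer affect the diff:

```ts
import { test, expect } from '@playwright/test';

test('page without dynamic media', async ({ page }) => {
  await page.goto('https://example.com');
  const screenshot = await page.screenshot({
    // Masked elements are painted over with a solid box in the captured image
    mask: [page.locator('video'), page.locator('.ad-banner')],
  });
  expect(screenshot).toMatchSnapshot('masked-page.png');
});
```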
Would it be possible to produce visual diffs even when the snapshot sizes differ (Sizes differ; expected image...)? Right now it seems that Playwright VRT refuses to produce visual diffs for such snapshots.
The Storybook storyshots addon at least has this feature, though it might come from pixelmatch, I'm not sure. It's very convenient, as often there are whitespace changes and you can easily see that some padding has appeared somewhere.
Hi @aslushnikov, can you add an option for ignoring diffs where there is just a slight shift in the location of pixels? I don't want to use threshold or maxDiffPixels for this, because those options would also cause the tests to ignore actual regressions. Here is an example of a diff that I would like to ignore:
Actual image:
Expected image:
The diff:
Thank you very much.
Hi @aslushnikov
For now, Playwright cannot compare the reference screenshot with the actual one if they don't have equal resolutions.
This is not a big problem, but when I try to update such a screenshot I also hit an error about the resolution mismatch:
Error: Image sizes do not match.
It is necessary to find such a screenshot and delete it, then run the update again. This is not very convenient, for example after updating the browser version, when many screenshots can be rendered differently.
@KirProkopchik different browsers and machines will produce different results. Have you tried running e2e tests inside a Docker image? We've been doing this for a few months and the results have been quite stable.
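For anyone who wants to try that, a rough sketch of running the suite inside the official Playwright image (pin the tag to your Playwright version; paths are illustrative):

```
# Identical fonts and rendering libraries on every machine/CI runner
docker run --rm -v "$(pwd)":/work -w /work \
  mcr.microsoft.com/playwright:v1.20.0-focal \
  npx playwright test
```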
I agree that adding support for blurring would be extremely helpful. We are making use of maxDiffPixelRatio, but it is a guessing game to find a value that counters antialiasing/rendering differences without hiding actual issues.
For example, I am currently having to set the value to 0.0005 to catch issues like this one (a real issue):
While wanting to skip/ignore false positives like this:
I worry that with just an arbitrary ratio of differing pixels, we will be adjusting that value over time, as certain screenshots are more impacted by antialiasing and rendering imperfections.
We use Playwright for Ionic Framework, and test against Chrome/Firefox/Safari to verify correct Android and iOS design implementations for the web components.
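For context, setting that ratio project-wide looks roughly like this (a sketch; the 0.0005 value is the guess mentioned above, not a recommendation):

```ts
// playwright.config.ts
import type { PlaywrightTestConfig } from '@playwright/test';

const config: PlaywrightTestConfig = {
  expect: {
    toMatchSnapshot: {
      // Every toMatchSnapshot() call tolerates this fraction of differing pixels
      maxDiffPixelRatio: 0.0005,
    },
  },
};

export default config;
```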
We have an interesting problem that's probably a common case when doing visual regression testing: we're taking a screenshot of an element (selected with a locator) that has a non-integer height. This results in an interesting problem where (at least when the device pixel ratio is 1) depending on what is on the rest of the page, sometimes the screenshot has a different height, or includes one extra row of the background color outside the element. I think this could even happen for elements that have an integer height, but that are positioned around elements that don't.
I think the most useful case for visual regression testing is for individual elements, and this does make it very hard to test those if you have any sub-pixel heights (or widths) on the page.
We have an interesting problem that's probably a common case when doing visual regression testing: we're taking a screenshot of an element (selected with a locator) .... I think the most useful case for visual regression testing is for individual elements, and this does make it very hard to test those if you have any sub-pixel heights (or widths) on the page.
I experienced this issue and my workaround was to take an image of the entire page and then crop the image to select the desired element before comparing.
const dimensions = await element.boundingBox();
expect(await page.screenshot({
  type: 'jpeg',
  clip: dimensions!, // boundingBox() may return null; guard or assert as appropriate
})).toMatchSnapshot(`${name}.jpeg`, {});
With the release of 1.21, Playwright now has the "Slider Diff View", which is great for comparing visual changes on the .toMatchSnapshot() assertion.
I'm curious how others plan to incorporate this into their software development workflow! It seems like the biggest piece missing from Playwright now is the ability to approve changes outside of running the app locally. (This is where backstopjs is still useful). Have others come up with a way to create some type of workflow on the PR that allows teams to easily review and approve changes?
To be clear, I'm not necessarily saying that this should be part of Playwright.
Is there any way to take one screenshot of the entire page? We have many cases of "long pages", and for now we are instructing our test script to repeatedly scroll down the page and take a snapshot until we reach the end of the page. Then we compare all the screenshots.
Is there an easier way to do it? If not, do you intend to add this kind of option?
Thanks in advance!
@anduingaiden Please see: https://playwright.dev/docs/screenshots#full-page-screenshots
@anduingaiden Please see: https://playwright.dev/docs/screenshots#full-page-screenshots
It's clear that I hadn't explored this part of the documentation yet. Sorry about that, and thank you very much.
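For reference, the fullPage option replaces the scroll-and-stitch approach with a single capture (a minimal sketch; the URL is illustrative):

```ts
import { test, expect } from '@playwright/test';

test('long page', async ({ page }) => {
  await page.goto('https://example.com/long-page');
  // One screenshot of the whole scrollable page instead of many viewport-sized ones
  expect(await page.screenshot({ fullPage: true })).toMatchSnapshot('long-page.png');
});
```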
Another example of where text rendering causes flake and blurring could've helped —
Hi @pastelsky, are those screenshots taken on different operating systems? In a like-for-like environment, you shouldn't see flake from text rendering differences.
Different operating systems can have different default fonts, and (which appears to be the case here) different text rendering approaches.
Hey folks! if you have examples of PNG screenshots that are taken on the same browser and same OS yet are different due to anti-aliasing issues, could you please attach the "expected", "actual" and "diff" images here?
This information will help with our experiments with fighting browser rendering non-determinism.
I'm curious how others plan to incorporate this into their software development workflow! It seems like the biggest piece missing from Playwright now is the ability to approve changes outside of running the app locally. (This is where backstopjs is still useful). Have others come up with a way to create some type of workflow on the PR that allows teams to easily review and approve changes?
I have come up with a system where the devs can comment on the PR to run a CI task that reruns the tests with the --update-snapshots flag and pushes any changes to the PR branch. But this requires rerunning the entire test suite again, which is pretty slow, considering the test report from the last run already has the accepted new snapshots in it.
It would be nice to have some kind of "accept snapshots" command we could run that takes the output of a test run where the snapshot comparison failed and updates the snapshots from it. Even if it needs some kind of special report format, that would speed up this part of the workflow considerably.
@gselsidi thank you for the sample!
I'll try to get some more as they come along, but I noticed the above occurs when taking snapshots of individual elements as opposed to the whole page. For the whole page I'm able to use a 0.0001 pixel ratio.
Linking this here in case it applies:
I had different screenshots with antialiased fonts between my Arch Linux laptop and Ubuntu 20.04 in Docker (used by default by GitHub Actions). The following Chromium flags helped me get identical screenshots:
--font-render-hinting=none
--disable-skia-runtime-opts
--disable-font-subpixel-positioning
--disable-lcd-text
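If it helps anyone, these flags can be passed through the test config (a sketch; they are Chromium-specific and only affect Chromium-based projects):

```ts
// playwright.config.ts
import type { PlaywrightTestConfig } from '@playwright/test';

const config: PlaywrightTestConfig = {
  use: {
    launchOptions: {
      args: [
        '--font-render-hinting=none',
        '--disable-skia-runtime-opts',
        '--disable-font-subpixel-positioning',
        '--disable-lcd-text',
      ],
    },
  },
};

export default config;
```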
We have similar issues with WebKit on macOS around emojis. I'm not sure if we can provide further information to make debugging/fixing the issue easier?
It looks like mask is not available to configure at the PlaywrightTestConfig level?
Playwright Test has a built-in toMatchSnapshot() method to power Visual Regression Testing (VRT). However, VRT is still challenging due to variances in the host environments. There's a bunch of measures we can take right away to drastically improve the experience in @playwright/test:
- a docker test fixture to run browsers inside a docker image
- blur in matching snapshots to counteract antialiasing

Interesting context: