pedrovgs / Shot

Screenshot testing library for Android
Apache License 2.0
1.17k stars 113 forks source link

Tests fail when mixing up x86 and Apple Silicon (M1 and the like) #265

Open cristan opened 2 years ago

cristan commented 2 years ago

When a screenshot is created at an x86_64 emulator and is then tested on an arm64-v8a emulator on Apple Silicon, I expect the test to succeed. Instead, it fails, even though both PNGS look exactly the same (example attached). Oddly enough, even the diff looks exactly the same as the 2 other issues.

I expect this test to succeed considering the images look exactly the same.

Original screenshot (x86_64) Original screenshot (x86_64)

New screenshot (arm64-v8a) New screenshot (arm64-v8a)

Diff Diff

Using an x86_64 emulator through Rosetta is not an option: it crashes when I try to start it: The emulator process for AVD emulatorName has terminated.

A possible solution which would work for now is to define a threshold how much the images have to be the same before the test fails. Setting it to 99% should be plenty.

sergio-sastre commented 2 years ago

I believe this is something to be expected if running on different architectures due to hardware rendering. You could try to enable software accelerated: myView.setLayerType(View.LAYER_TYPE_SOFTWARE, null)

more on that here: https://developer.android.com/guide/topics/graphics/hardware-accel.html

moreover, the Shot documentation recommends to configure your emulators using the gpu mode swiftshader_indirect

cristan commented 2 years ago

I use shot to test Jetpack Compose screens and I don't think I can change the layer type here. I already have the gpu mode set to swiftshader_indirect on both platforms.

andreroggeri commented 2 years ago

The images are a bit different if you check with another tool (I used ImageMagick)

diff

I had the same issue with pixel emulator vs pixel device, and in my case setting a low tolerance was ok

cristan commented 2 years ago

@andreroggeri Does shot support setting a tolerance? The only thing I could find is #28 and it is essentially solved as a won't fix.

andreroggeri commented 2 years ago

Yes it does.

On your build.gradle you can configure shot

shot {
    tolerance = 0.1 // 0,1% tolerance
}
cristan commented 2 years ago

That works :D The biggest difference in pixels between platforms in my tests is 1.4%, so I set the tolerance now to 2% and this fixes my tests :D

pedrovgs commented 2 years ago

Hi @cristan thanks for reporting. I'm afraid we depend on an API we can't control to take the screenshot. It depends on the emulator/device where the screenshot is taken to render the bitmap. This means if you use different hardware for your emulator the screenshot taken can change too 😢 I'm afraid the only solution we can provide is to use tolerance param as @sergio-sastre pointed out. I've sent a PR updating the docs because I totally forgot to add the docs about this config param: https://github.com/pedrovgs/Shot/pull/270

adrinieto commented 2 years ago

We're having the same problem, but tolerance doesn't solve the problem. In our test suite we have some tests that fail with a 7% pixel difference, and one with 18%. Setting a tolerance so high is not reasonable.

I've been experimenting and I think the tolerance implementation can be improved. I've been looking at the differences at a pixel level, and the pixels are almost the same.

The current implementation expects that the pixels in both images are equals, and has a global tolerance to check the percentage of different pixels. It would be great if we could add a tolerance for each RGB channel, to check not for equal pixels but for almost equal pixels.

One example, these 2 images have 18.5% different pixels. Looking at the background of the first button we see that there is a difference of 1 in two of the three RGB channels. It happens the same in the rest of the image, this very small difference is causing that 18.5% of the image is reported to be different.

x86_64

R: 32 G: 39 B: 210

arm64-v8a

R: 31 G: 38 B: 210

Diff (18.51% different pixels)

Online version (select "Ignore nothing"): https://www.textcompare.org/image/?id=61fc025597948c5db943d97d

This tool uses Resemble.js library that already has this feature, by default it ignores almost equal pixels.

https://github.com/rsmbl/Resemble.js/blob/master/resemble.js#L147-L155

I think it would be easy to add this feature, we need to add a new colorTolerance (for example) and change this filter:

https://github.com/pedrovgs/Shot/blob/7bd9a4b18e5f31bc07c329d6f9c93da47769b91b/core/src/main/scala/com/karumi/shot/screenshots/ScreenshotsComparator.scala#L53-L54

pedrovgs commented 2 years ago

After thinking about this issue and checking @adrinieto 's implementation I think we can add this workaround to our project for now. It's far from being the perfect solution but is good enough for now. We'll check some details in the original PR and we'll try to see if we can add color tolerance to the project. Thank you all for your comments 😃

primoz-ivancic-ng commented 2 years ago

After thinking about this issue and checking @adrinieto 's implementation I think we can add this workaround to our project for now.

Any news on this? Is the current workaround available anywhere? I'm working on an OS library for some UI components and I'm getting way above 10% diff on some images that are otherwise the same - mostly where there is transparency involved. The (main) source of issues here is the M1 vs x86 diff.

zaaaach commented 1 year ago

Just want to add that I'm getting the same issues when running my tests on a CICD platform (in this case, Bitrise). When using a merge tool and looking at the diff, it seems the pixels on a border/color transition are typically what is flagged as different. Often even with the diff they are completely undetectable with the human eye.

NaSOS commented 1 year ago

After thinking about this issue and checking @adrinieto 's implementation I think we can add this workaround to our project for now. It's far from being the perfect solution but is good enough for now. We'll check some details in the original PR and we'll try to see if we can add color tolerance to the project. Thank you all for your comments 😃

Another alternative would be to investigate solutions followed by other frameworks, maybe they could be applied here as well.

Take, for example, swift-snapshot-testing#628.

tdrhq commented 1 year ago

At Facebook, the way we solved issues like this was to just generate screenshots on the CI. So developers never recorded screenshots, and we're always recording screenshots in a clean identical environment. If screenshots changed, we just notified them on their diff.

(If you want this workflow, consider using something like Screenshotbot. Open source at https://github.com/screenshotbot/screenshotbot-oss)