omnisip closed this issue 4 years ago.
Thanks @omnisip this is interesting!
If you are willing to contribute I think this would be an interesting option to add but before spending too much time on it could you quantify your statement about reducing false positives also spend some time comparing the performance of the diff process (is it faster or slower to diff with SSIM?)
Also looking at the readme for https://github.com/obartra/ssim I have concerns over requiring node-gyp. If that is really the case then I don't think I'd want this implemented.
If perf is comparable, false positives are drastically reduced, and node-gyp is not required then feel free to contribute! The reason I am being so picky about this is I want to make sure there is real benefit to users before adding an option that could overcomplicate the code and add a larger surface area to what was supposed to be a very simple jest utility.
I expect it to be significantly faster and definitely more accurate, but we won't know until we try. Short answer to your node-gyp question: it's probably required, since it uses 'canvas' as a dependency.
But that's also the reason it's much faster. Pixelmatch has a serious disadvantage of being in JS when lots of vector math is required. It's actually a double whammy: even if it's using uint32 arrays under the hood, JavaScript only has floating-point numbers.
That said, if there's a way to stub out the implementation for pixelmatch it's still worth a try. The accuracy difference will be night and day better and it'll be a lot faster.
Dan
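To make the comparison concrete, here is a minimal sketch of the SSIM index computed once over two grayscale patches. This is a simplification: real implementations (including obartra/ssim) evaluate the formula over sliding windows and average the results, and the names here are illustrative.

```javascript
// Minimal single-window SSIM between two equal-length grayscale arrays (0-255).
// Illustrates the formula only; library implementations compute this over
// sliding NxN windows and average the per-window scores.
function ssim(x, y) {
  const n = x.length;
  const mean = (a) => a.reduce((s, v) => s + v, 0) / n;
  const mx = mean(x);
  const my = mean(y);
  let vx = 0, vy = 0, cov = 0;
  for (let i = 0; i < n; i++) {
    vx += (x[i] - mx) ** 2;
    vy += (y[i] - my) ** 2;
    cov += (x[i] - mx) * (y[i] - my);
  }
  vx /= n; vy /= n; cov /= n;
  const C1 = (0.01 * 255) ** 2; // stabilizing constants from the SSIM paper
  const C2 = (0.03 * 255) ** 2;
  return ((2 * mx * my + C1) * (2 * cov + C2)) /
         ((mx * mx + my * my + C1) * (vx + vy + C2));
}

const a = [52, 55, 61, 59, 79, 61, 76, 61];
console.log(ssim(a, a));                       // identical patches score exactly 1
console.log(ssim(a, a.map((v) => v + 3)));     // a tiny uniform brightness shift stays near 1
```

Note how the structure terms (variance and covariance) dominate: a small uniform brightness shift barely moves the score, which is exactly why SSIM ignores the rendering noise that pixel comparison flags.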
Go ahead and try it out @omnisip!
I was incorrect about it requiring separate or OS-specific dependencies. That's only if it needs a special image loader.
Here is an example of how much better it works.
These two files are the same -- except one has been converted to jpg. The SSIM is 0.9999278172146295 with the fastest transformation (bezkrovny).
Using pixelmatch or an equivalent algorithm, I end up with 96K pixels marked as different and roughly 2.57% error. That's high for something that is perceptually impossible to distinguish.
This is great news! How much faster is it?
How do we build a good test? Do you have a set of sample images you want to churn through?
We have not done much in terms of performance testing before, but we do have integration tests defined in https://github.com/americanexpress/jest-image-snapshot/blob/master/__tests__/integration.spec.js and a set of test images used by those tests in https://github.com/americanexpress/jest-image-snapshot/tree/master/__tests__/stubs, so you could use those.
I'll check them. It's going to sound strange, but PNG might not be ideal for the analysis itself. You'd think it'd be best, but its performance is going to die when text and images are merged together, and it's not particularly fast at decoding either.
Jpeg might be a lot better for this application since it will always be first pass (single generation loss) and the files will be a lot smaller. I need to check Huffman coding again, but I'm pretty sure it'll dedupe better too because of the way the format is organized into macroblocks, meaning less git repository bloat.
This option isn't possible with a pure pixel-by-pixel matching solution, but it is possible with an SSIM solution.
I'll do some research and see what I come back with.
Dan
Preliminary analysis shows pixelmatch being faster by about 5x in your best cases (where there are lots of identical adjacent pixels, e.g. areas of the same color) and about 2x slower in your worst cases, based on the samples in your repository.
Analysis-wise, though, there's no contest: SSIM is significantly better.
E.g. TestImage vs. TestImageFailure returns a 21.6% pixel difference per pixelmatch, whereas SSIM returns similarity scores of 11% (with the bezkrovny model) and 21%.
In your oversize case (LargeTestImage*), pixelmatch shows only a 1.2% difference in pixels. However, SSIM shows a 96.7% similarity with both the bezkrovny and standard models.
I'm thinking one could reasonably set a threshold with SSIM at 99% and never think twice about it.
Dan
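As a sketch of what that 99% idea would look like in practice (option names are the ones jest-image-snapshot later shipped and that appear further down this thread; the threshold values are illustrative, not a recommendation from the maintainers):

```javascript
// Illustrative jest-image-snapshot options: fail a snapshot only when SSIM
// reports more than 1% structural dissimilarity (i.e. similarity < 99%).
const ssimOptions = {
  comparisonMethod: 'ssim',        // use SSIM instead of the default pixelmatch
  failureThreshold: 0.01,          // allow up to 1% dissimilarity
  failureThresholdType: 'percent', // interpret the threshold as a ratio, not a pixel count
};

// In a test this would be passed straight through, e.g.:
//   expect(screenshot).toMatchImageSnapshot(ssimOptions);
console.log(JSON.stringify(ssimOptions));
```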
Want to open a PR to add this as an option? The reason I am saying it should be an option is that I don't think we should have another breaking change anytime soon.
this sounds really convincing. Can't wait to see further progress here
En route guys, en route.
Want to open a PR to add this as an option? The reason I am saying it should be an option is that I don't think we should have another breaking change anytime soon.
The PR should work perfectly without any breaking changes to existing users. SSIM is implemented as a new comparisonMethod with the default being pixelmatch. I'm really looking forward to seeing and hearing your feedback! :-)
Thank you @omnisip for the excellent work.
I tried comparisonMethod: 'ssim' this morning and so far I am pleased with the results. Like others, I was running into issues comparing screenshots across operating system and browser combinations (specifically, using Jest and Playwright to test Chromium, Firefox, and Webkit on macOS, Ubuntu, and Windows). I know that the maintainers' recommendation is to test inside a docker container to avoid false positives, but that would involve significantly more work than just having jest-image-snapshot perform smarter comparisons.
Here is an example of an image snapshot (from Chromium) that we're testing against:
The challenge (as has been covered in other issues) is how to handle the visual differences between screenshots generated using the same browser (which should match) on different operating systems. Previously, I was using the following settings for pixel-based comparisons:
customDiffConfig: {
threshold: 0.3,
},
failureThreshold: 0.04
These more-or-less worked, but it felt like significant accuracy was being lost by setting threshold to 0.3. After setting comparisonMethod to ssim, I am now using the following settings:
failureThreshold: 0.15
The recommended default setting of failureThreshold: 0.01 resulted in an ~12% difference between the above screenshot rendered in headless Chromium on both macOS and Ubuntu, hence the 0.15 setting used (I wanted to give myself a little wiggle room for future tests).
It's hard to tell which combination of comparison method and threshold settings allows for greater accuracy, but my gut tells me ssim will be less picky in the long run. Then again, maybe I'm just favoring The Shiny New Thing. Time will tell.
Can you send me the samples to compare? Also the diff image you did with them? I'd like to see these at 0.01 with both ssim: 'bezkrovny' and ssim: 'fast'. [Note 'fast' is not faster than bezkrovny, but it is more accurate.]
@omnisip --
Of course. For what it's worth, these are just example tests I'm using while I get our e2e configuration in place (switching from Cypress.io). I was surprised to learn just how much text rendering differences alone can complicate screenshot comparisons.
These snapshots were generated on macOS using playwright's page.screenshot() feature.
macos-chromium
macos-firefox
macos-webkit
Our CI test matrix renders the same content seen in the snapshots above using Chromium, Firefox, and Webkit on macOS, Ubuntu, and Windows. Screenshots are taken for each os+browser combination, which are then compared to the reference snapshots of the matching browser type. For example, ubuntu-chromium.png will be compared to macos-chromium.png but not to macos-firefox.png or macos-webkit.png. This is done in hopes of making screenshot comparisons more accurate.
Here are the diff images and statistics generated by jest-image-snapshot:
ubuntu-chromium 11.116726154016055% different from snapshot (102451.74823541196 differing pixels)
ubuntu-firefox 0.4187165861719522% different from snapshot (3858.8920581607113 differing pixels)
ubuntu-webkit 12.17770280391608% different from snapshot (112229.7090408906 differing pixels)
windows-chromium 1.4968674877126276% different from snapshot (13795.130766759576 differing pixels)
windows-firefox 2.087652134020601% different from snapshot (19239.80206713386 differing pixels)
windows-webkit 26.405261426690828% different from snapshot (243350.8893083827 differing pixels) (NOTE: a font rendering issue causes windows+webkit to have an unusually high diff percentage)
Thanks for taking a look at these, btw. Very much appreciated!
Okay cool.
1) Are you using the default ssim choice or 'fast'?
2) Are there any size mismatches and/or do you have allowSizeMismatch turned on?
3) How are you controlling for different versions of each browser on each platform?
1) Are you using the default ssim choice or 'fast'?
Default.
2) Are there any size mismatches and/or do you have allowSizeMismatch turned on?
Yes, allowSizeMismatch is set to true. I believe the width of the Windows screenshots was 1px less than that of the macOS screenshots they were being compared to, which led me to enable this option.
3) How are you controlling for different versions of each browser on each platform?
I'm not. I've been operating under the assumption that Playwright uses the same browser version across operating systems, based on the browsers/platform matrix shown in the repo's README.md:
First, please try ssim: 'fast' for comparison. It's defined with an example in the readme. When you're done, send me the new diffs with it.
If that doesn't work, we probably need to fix the size mismatches. They can really mess up comparisons across the board, even if it's only one pixel.
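One way to neutralize a 1px mismatch before diffing is to crop both screenshots to their shared dimensions. This is a hypothetical pre-processing step, not something jest-image-snapshot does for you; the function name and approach are illustrative:

```javascript
// Crop a raw RGBA buffer (4 bytes per pixel, row-major) down to
// targetWidth x targetHeight, keeping the top-left corner. A sketch of a
// pre-processing step one could run on both images before any pixel or
// SSIM comparison when screenshots differ by a pixel or two in size.
function cropToCommon(data, width, height, targetWidth, targetHeight) {
  const out = new Uint8Array(targetWidth * targetHeight * 4);
  for (let y = 0; y < targetHeight; y++) {
    const srcStart = y * width * 4; // byte offset of row y in the source
    out.set(
      data.subarray(srcStart, srcStart + targetWidth * 4),
      y * targetWidth * 4
    );
  }
  return out;
}

// Example: shrink a 3x2 image to the 2x2 region both screenshots share.
const img = new Uint8Array(3 * 2 * 4).map((_, i) => i % 256);
const cropped = cropToCommon(img, 3, 2, 2, 2);
console.log(cropped.length); // 2 * 2 * 4 = 16 bytes
```

Whether top-left cropping is the right alignment depends on where the extra pixels come from; if the platforms pad on different edges, cropping can shift content and make the diff worse rather than better.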
Here are the snapshot diff statistics and images.
TL;DR: using ssim: 'fast' slightly increases the diff percentage, which would be my expectation given that fast is slower-but-more-accurate than the default bezkrovny setting.
{
allowSizeMismatch: true, // Windows CI fix
comparisonMethod: 'ssim',
customDiffConfig: {
ssim: 'fast',
},
customSnapshotIdentifier(data) {
return `${data.defaultIdentifier}-${browserName}`;
},
diffDirection: 'vertical',
failureThreshold: 0.01,
failureThresholdType: 'percent',
noColors: true,
runInProcess: true, // macOS CI fix
}
Regarding allowSizeMismatch, this is required only for our Windows screenshots, which for some reason have a width 1px less than the reference screenshots. I'm less worried about this at the moment because the Windows diff statistics show either a low percentage difference (chromium-diff = 1.7%, firefox-diff = 2.7%) or diff percentages in line with ubuntu (ubuntu-webkit-diff = 12.9%, windows-webkit-diff = 13.9%), indicating that the 1px size mismatch isn't a huge issue.
Same as the ones posted in https://github.com/americanexpress/jest-image-snapshot/issues/201#issuecomment-663325205.
ubuntu-chromium-diff
12.125042040334522% different from snapshot (111744.38744372295 differing pixels)
vs. 11.116726154016055% using bezkrovny
ubuntu-firefox-diff
Successful match using SSIM w/ failureThreshold: 0.01 and ssim: 'fast'! 🥳
vs. 0.4187165861719522% using bezkrovny
ubuntu-webkit-diff
12.9088979174103% different from snapshot (118968.40320685333 differing pixels)
vs. 12.17770280391608% using bezkrovny
windows-chromium-diff
1.6544634425014748% different from snapshot (15247.535086093592 differing pixels)
vs 1.4968674877126276% using bezkrovny
windows-firefox-diff
2.378168241687295% different from snapshot (21917.19851539011 differing pixels)
vs. 2.087652134020601% using bezkrovny
windows-webkit-diff
NOTE: The font-rendering issue that caused the unusually high diff percentage (26.4%) in the windows-webkit-diff screenshot in the previous post has been fixed by using playwright@next. Therefore, the diff statistics below should not be compared to those numbers (i.e., switching to ssim: 'fast' did not reduce the diff percentage from 26.4% to 13.89%).
13.891127611529564% different from snapshot (128020.63206785645 differing pixels)
vs. 26.405261426690828% using bezkrovny (with the font-rendering bug)
@omnisip --
Adding one more diff that may be useful. Specifically, these are small snapshots taken using the same browser (Webkit) on different platforms (macOS & Windows) that result in a high diff percentage. This is easily addressed by increasing the failureThreshold value to anything greater than the diff percentage (e.g. 0.16), but then test accuracy is lost.
macOS Webkit Snapshot
Windows Webkit Snapshot
Diff 15.603094816705564% different (1446.250858560439 differing pixels)
Perhaps my hopes are unrealistic, but I was hoping ssim would allow me to handle relatively small structural image differences like these.
If what I saw in the larger images is also happening to these small ones, the tests are properly failing.
Wait until I get back to my desk and I'll show you what I mean.
Sounds good. Thanks, @omnisip.
FWIW, switching to the following pixel-based comparison settings allows the smaller "Docsify Test" image comparison to pass:
customDiffConfig: {
threshold: 0.3,
},
failureThreshold: 0.04,
I can twiddle the knobs on both pixel- and ssim-based comparisons to get tests to pass (knowing that doing so is less ideal than testing on a single OS using a docker container). The challenge for me is understanding which comparison method is the better choice once the "good enough to pass tests" threshold(s) are set. The best I can do is review the pixelmatch demo and the SSIM playground and try to judge for myself. As stated earlier, my hope (which was perhaps unrealistic) was that SSIM would provide a clear and significant advantage over pixel-based comparisons for scenarios like mine (same content, slight differences in text rendering). It appears that this isn't the case, which could lead others to be equally confused about which comparison method they should pick. My assumption is that many users will latch on to the "reduced false positives" claim and opt for ssim without any real understanding of whether or how it is a better option. Just my $0.02.
See below:
If you look closely, those aren't exactly minor structural differences. Look how far off the 's's are in both words and the offset in the T.
However, what you're doing isn't wrong. You're trying to determine how close one platform's version of a screen is to another -- so the algorithm may not be tuned properly for this comparison.
If you're willing to experiment, you can adjust the search window size for the SSIM library. This is the bounding box of pixels (NxN) that it uses to calculate the changes in each block of the screen. The default is 11px.
An example configuration then would be -- { ssim: 'fast', windowSize: 24} -- to try the regular ssim algorithm with a 24x24 pixel window.
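Plugged into the settings John posted earlier, that experiment might look like the following. This is a sketch based on Dan's { ssim: 'fast', windowSize: 24 } example; the windowSize value is illustrative, not a tuned recommendation:

```javascript
// Illustrative customDiffConfig for the experiment described above: the
// 'fast' SSIM variant with a 24x24 search window instead of the 11px default.
const experimentalDiffConfig = {
  ssim: 'fast',   // more accurate (but slower) than the default 'bezkrovny'
  windowSize: 24, // NxN block size used when scanning each region for changes
};

// This would slot into the jest-image-snapshot options, e.g.:
//   expect(screenshot).toMatchImageSnapshot({
//     comparisonMethod: 'ssim',
//     customDiffConfig: experimentalDiffConfig,
//   });
console.log(JSON.stringify(experimentalDiffConfig));
```

A larger window averages structure over a bigger area, so per-glyph rendering differences weigh less, at the cost of potentially missing small localized changes.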
Picture reupload for viewers / commenters --
Thank you for the excellent feedback, @omnisip. SSIM's window option looks interesting, so I'll continue to experiment.
Definitely learned a few new things along the way, so thank you for your time and effort. Very much appreciated.
I saw a significant increase in failures with SSIM. Images that previously beat a threshold of 0.08 are now in the 0.15-0.25 range. I'll try increasing the window size before I increase the threshold to 3%. I'd be pretty worried about false negatives at that level.
Yeah that's pretty high. Want to join https://one-amex.slack.com/ to discuss more? There is a #jest-image-snapshot channel there.
Can you send us before, after, and diff?
Pixel-to-pixel comparison can fail frequently, and more often than not it's because of minor variations in rendering and the compression algorithm. This issue was really common when I worked on designing live video streaming systems, so we switched away from PSNR models to something called SSIM (Structural Similarity). I believe that if this library implements it, it will have significantly fewer false positives.
A well-maintained JavaScript library exists for this today -- https://github.com/obartra/ssim . I would be happy to implement it in this library if I have the time.