attempt number is added to screenshot name, causing retries of failures to inappropriately succeed

froodian commented 1 year ago

Describe the bug Similar to https://github.com/meinaart/cypress-plugin-snapshots/issues/217 (and the open pull that fixes it), cy.state('runnable')._currentRetry (as documented here) should be incorporated into the nameCacheCounter logic so that a retried test is not seen as a new test.

To Reproduce Steps to reproduce the behavior:

Have a test, with an accepted and committed baseline image
Have retries enabled in the cypress config (docs)
Change the code under test so that the test will go red
Run the test

Expected behavior Test should go red. Instead test goes red, then on retry it goes green because the plugin is generating an actual with #1.png in the name instead of #0.png and therefore treating it as a new test.

Screenshots

Please complete the following information:

Mac OS 12.6
Chrome 107
Cypress 10.8.0

Additional context Add any other context about the problem here.

github-actions[bot] commented 1 year ago

:tada: This issue has been resolved in version 3.1.0 :tada:

The release is available on:

Your semantic-release bot :package::rocket:

FRSgit commented 1 year ago

Hey @froodian,

First of all - thank you for such detailed issue! ❤️ I never have used Cypress retries option, so you description made it much, much easier for me to understand what's the issue and how it can be covered. I've just merged a PR that should cover your case (see #171). The implementation is slightly different than the one from cypress-plugin-snapshots (I do not use separated map for retries tracking), but I think it will do.

Please try the new release 3.1.0 and come back with a feedback (simple "it works!" is enough 😄 )

froodian commented 1 year ago

Hi @FRSgit - thanks for the fast fix!

It looks like with 3.1.0, it mostly works - but, if the first attempt fails for unrelated reasons (before the screenshot), the retry attempt creates Rendered IAM HTML Built Via Grapes DnD HTML IAM Composer Rendered IAM #-1.png (instead of #0 as would be expected), and so still goes green when the screenshot comparison should fail.

I think this could probably be fixed by making the code a little more like

  // it's a retry of the same image that was run previously, so let's decrease the counter
  if (currentRetryNumber > 0 && nameCacheCounter[screenshotPath] > -1) {
    --nameCacheCounter[screenshotPath];
  }

to make sure nameCacheCounter[screenshotPath] actually saw the image previously, so it doesn't go below -1?

FRSgit commented 1 year ago

Hey @froodian,

You're right - this condition is not enough. But after giving it a though solution proposed by you (and the implementation for cypress-plugin-snapshots) also does not cover all of the cases. For example this won't work when there were more than 2 screenshots done before test fail & retry, because we're now decreasing nameCacheCounter only by one. Also - what if names of screenshots done before retry were different?

I need to think about it for a moment, if you have any ideas - don't hesitate to post them here

github-actions[bot] commented 1 year ago

:tada: This issue has been resolved in version 3.1.1 :tada:

The release is available on:

Your semantic-release bot :package::rocket:

FRSgit commented 1 year ago

Just released 3.1.1 which should handle all of the cases above correctly. Please give it a go 🙏

froodian commented 1 year ago

Yes, I see what you're saying - thanks for the robust fix! It works for me!

FRSgit commented 1 year ago

Cool, thanks for retesting it! ✌️

mduivcap commented 1 year ago

If it's always #0, why do we even need a number? I just started to use this plugin but I noticed that suddenly during one of the runs #1 image was added and I don't know why (pretty new to Cypress and this plugin). I'm not aware of any retry that I configured. And I was confused since this is the base image (which should only exist once). I'm using:

node: v19.3.0
cypress: 12.17.1
cypress-plugin-visual-regression-diff: 3.3.10

Not sure how to reproduce this, since it only happened once. I was just playing around with the settings (threshold) and running the base against different browser. If I have a reproduce path I'll let you know, but I think this issue is not resolved completely yet.

gigaSproule commented 11 months ago

I'm seeing this issue still, but only with cypress open, not with cypress run. It seems that when a file diff is large enough, it creates a new screenshot, which seems to trigger a new run of the tests, which then pass.

It's odd, because if I run the test in a different browser, I get an image diff of 0.12, which fails as my max is set to 0.01 and that's ok. When I change something big (such as the background colour), the diff is 0.99, and that's when I'm seeing the multiple images. Weirdly, it gets up to #7 and then stops.

I've got retries set to 0 and watchForFileChanges set to false.

This is the same whether I run it directly on the mac or via docker.

node: v18.18.0 cypress: 13.2.0 cypress-plugin-visual-regression-diff: 3.3.10

Edit: It doesn't seem to be due to the diff size, it just seems to be more reproducible if the diff is larger. Even with the 0.12 diff failures, it will re-run occasionally and "pass".

Edit 2: I've also noticed that the test run counter is combining the number of tests run. So even though there are only 7 tests, when it re-runs twice, it shows 14 tests were run. If it re-runs three times, it shows 21, etc.

Edit 3: It seems to not have a limit, it's just run 14 times (of a spec with 7 tests) and then ended up with a "No tests found" message, but it's passed 53 tests and failed 33, which isn't even the full number of runs for 14 runs ((53+33) / 14 =~ 12).

Edit 4: I forgot to mention here, but I did a bit more digging and I think my issue is not related to this. Either my understanding is wrong, or there is a bug in the name counter cache (https://github.com/FRSOURCE/cypress-plugin-visual-regression-diff/discussions/300).

FRSOURCE / cypress-plugin-visual-regression-diff

attempt number is added to screenshot name, causing retries of failures to inappropriately succeed #168