happo / happo-plugin-storybook

A happo.io plugin for Storybook
MIT License
11 stars 5 forks source link

Unpredictable diffs in CI #82

Closed BCerki closed 11 months ago

BCerki commented 1 year ago

Good afternoon, thanks again for this useful plugin!

When I run happo in CI, I'm having some trouble with unpredictable diffs. For example, this PR only includes a readme change. There are no changes to any of the components, but I'm getting diffs. Some appear spurious (or at least, I don't see any visual difference). In other screenshots, whole components appear or disappear. I also see diffs, even though the PR doesn't touch the affected component.

It's not always the same components or browsers that create these three types of diffs.

Help much appreciated, thanks so much!

trotzig commented 1 year ago

Hi @BCerki! I believe there are a few root causes of flakiness here: A) Errors when rendering B) Dates relative to "now" C) Browser rendering differences D) Randomness in the story

Let me spend some time to discuss these issues one by one in subsequent comments.

trotzig commented 1 year ago

Errors when rendering

Here's a render error diff from one of the links you provided.

Screenshot 2022-11-04 at 11 37 45

When a component/story fails to render, it will output it's stack trace in the UI and Happo will take a screenshot of it. An issue with this is that the local server port is part of the stack trace, and that port is somewhat random across different Happo workers. We can look into cleaning out these errors to make them more consistent but the root cause here is that the component throws an error on render. Are you able to reproduce this locally?

trotzig commented 1 year ago

Dates relative to "now"

Here's an example of this type of diff:

Screenshot 2022-11-04 at 12 06 03

This is quite a common issue with screenshot testing. When you have dates/times shown in the UI, the before and after screenshots can look different just because they were taken at different times. The best way to resolve this is to allow the component to have the date or time displayed passed in via props. In some cases, that means adding a "now" prop that by default is evaluated to new Date() but can be overridden in Storybook stories. Is that an option you can pursue here?

trotzig commented 1 year ago

Browser rendering differences

Here's an example of a browser rendering difference:

Screenshot 2022-11-04 at 12 07 36

These are most common in Chrome and usually only happen when you are using background images. The differences are impossible to spot manually but Happo will find even tiny shading differences and display them as a diff. The best thing here is to enable diff thresholds so that subtle differences are automatically ignored by Happo. I went ahead and set the threshold to 0.05 on each of your projects:

Screenshot 2022-11-04 at 12 10 13

A value of 0.05 is aggressive enough to catch and auto-ignore the most common rendering differences, while still not allowing small changes to slip through. As an example, the max color difference of the diff shown at the top in this comment was 0.00042 (shown at the bottom right).

From now on, you shouldn't see these types of diffs anymore. 🙂

trotzig commented 1 year ago

Randomness in the story

I had a look at the Grid stories and noticed that there is a seemingly random height applied to the columns on render.

Screenshot 2022-11-04 at 12 14 48

Is there something in here that uses a random height applied to columns? If so, the fix might be to make that "randomness" deterministic. It could mean using a seed when computing the random number or hard-coding a list of pre-randomized numbers to use.

trotzig commented 1 year ago

It seems I forgot to address one last type of diff:

Screenshot 2022-11-04 at 12 22 55

This looks like a font-rendering issue. In this particular case the text wrapped differently on the before and after screenshot. My best theory here is that the before report was generated on a different worker than the after report, and that they have two versions of a font (I believe this is Lato) installed. I'm going to look into this a little bit closer, we have done some infrastructure changes on our iOS workers and it's possible that one or a few instances have a slightly different setup than others.

The good thing is that most of the font rendering issues will go away with the diff threshold change I've applied. The only one that won't get auto-ignored from that is the one where the text wraps differently.

trotzig commented 1 year ago

By the way, this is great input! Let's keep this thread open for a while. If you can continue reporting flaky/spurious diffs here I'll look into them one by one. Together I think we can get to flakiness zero! 🎉

BCerki commented 1 year ago

Wow, thanks so much for such a detailed response! I'll address your comments in separate threads too:

Errors when rendering

Here's a render error diff from one of the links you provided. Screenshot 2022-11-04 at 11 37 45 When a component/story fails to render, it will output it's stack trace in the UI and Happo will take a screenshot of it. An issue with this is that the local server port is part of the stack trace, and that port is somewhat random across different Happo workers. We can look into cleaning out these errors to make them more consistent but the root cause here is that the component throws an error on render. Are you able to reproduce this locally?

Yes, I can reproduce this one locally and it's the list to address

BCerki commented 1 year ago

Browser rendering differences

Here's an example of a browser rendering difference: Screenshot 2022-11-04 at 12 07 36

These are most common in Chrome and usually only happen when you are using background images. The differences are impossible to spot manually but Happo will find even tiny shading differences and display them as a diff. The best thing here is to enable diff thresholds so that subtle differences are automatically ignored by Happo. I went ahead and set the threshold to 0.05 on each of your projects: Screenshot 2022-11-04 at 12 10 13

A value of 0.05 is aggressive enough to catch and auto-ignore the most common rendering differences, while still not allowing small changes to slip through. As an example, the max color difference of the diff shown at the top in this comment was 0.00042 (shown at the bottom right).

From now on, you shouldn't see these types of diffs anymore. slightly_smiling_face

Amazing, thank you!

BCerki commented 1 year ago

Dates relative to "now"

Here's an example of this type of diff: Screenshot 2022-11-04 at 12 06 03

This is quite a common issue with screenshot testing. When you have dates/times shown in the UI, the before and after screenshots can look different just because they were taken at different times. The best way to resolve this is to allow the component to have the date or time displayed passed in via props. In some cases, that means adding a "now" prop that by default is evaluated to new Date() but can be overridden in Storybook stories. Is that an option you can pursue here?

Yes, can pursue this option