fix: tests fail on CI while using the default config

pedromassango commented 2 months ago

Is there an existing issue for this?

[X] I have searched the existing issues.

Version

0.7.0

Description

Tests fail in CI with the default configuration, but they pass locally with flutter test.

Config:

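import 'dart:async';

import 'package:alchemist/alchemist.dart';
import 'package:flutter/material.dart';
import 'package:flutter_test/flutter_test.dart';

Future<void> testExecutable(FutureOr<void> Function() testMain) async {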
  TestWidgetsFlutterBinding.ensureInitialized();
  const isRunningInCi = bool.fromEnvironment('CI');

  return AlchemistConfig.runWithConfig(
    config: AlchemistConfig.current().copyWith(
      // Using a dark background to better visualize the hover state of all components
      theme: ThemeData.dark(),
      platformGoldensConfig: const PlatformGoldensConfig(
        // ignore: avoid_redundant_argument_values
        enabled: !isRunningInCi,
      ),
    ),
    run: testMain,
  );
}
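
One thing that may be worth checking with a config like this: `bool.fromEnvironment('CI')` reads a compile-time declaration rather than the shell's environment variables, so it is only true when the run is started with something like `flutter test --dart-define=CI=true`. GitHub Actions does export a `CI` environment variable, which can be read at runtime instead. The following is a minimal sketch and not part of the original report; the helper name `isRunningInCi` and the combined check are assumptions for illustration.

```dart
import 'dart:io' show Platform;

/// Hypothetical helper: treats the run as CI when either the compile-time
/// declaration or the OS environment variable `CI` is set.
bool isRunningInCi() {
  // Only true when the run is started with `--dart-define=CI=true`.
  const fromDefine = bool.fromEnvironment('CI');
  // Read at runtime from the process environment.
  final fromEnv = Platform.environment['CI'] == 'true';
  return fromDefine || fromEnv;
}

void main() {
  print('Running in CI: ${isRunningInCi()}');
}
```

If a runtime check like this is passed to `PlatformGoldensConfig`, the `const` keyword on that constructor call has to be dropped, because the value is no longer a compile-time constant.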

Steps to reproduce

Use the setup code above, generate any golden test, and run it on CI in a Linux environment. CI fails with:

══╡ EXCEPTION CAUGHT BY FLUTTER TEST FRAMEWORK ╞════════════════════════════════════════════════════
The following StateError was thrown running a test:
Bad state: No element

When the exception was thrown, this was the stack:
#0      ListBase.singleWhere (dart:collection/list.dart:167:5)
#1      _TestOptimizationAwareGoldenFileComparator.getTestUri (file:///home/runner/work/library-xyz/library-xyz/test/.test_optimizer.dart:65:10)
#2      MatchesGoldenFile.matchAsync (package:flutter_test/src/_matchers_io.dart:63:50)
#3      _expect (package:matcher/src/expect/expect.dart:109:26)
#4      expectLater (package:matcher/src/expect/expect.dart:73:5)
#5      expectLater (package:flutter_test/src/widget_tester.dart:517:25)
#6      defaultGoldenFileExpectation.<anonymous closure>.<anonymous closure> (package:alchemist/src/golden_test_adapter.dart:41:35)
#7      FlutterGoldenTestAdapter.withForceUpdateGoldenFiles (package:alchemist/src/golden_test_adapter.dart:200:28)
#8      FlutterGoldenTestRunner.run (package:alchemist/src/golden_test_runner.dart:106:33)
00:14 +63 -1: XyzButton.primaryLarge default state (variant: Linux)
<asynchronous suspension>
#9      goldenTest.<anonymous closure> (package:alchemist/src/golden_test.dart:169:7)
<asynchronous suspension>
#10     testWidgets.<anonymous closure>.<anonymous closure> (package:flutter_test/src/widget_tester.dart:189:15)
<asynchronous suspension>
#11     TestWidgetsFlutterBinding._runTestBody (package:flutter_test/src/binding.dart:1032:5)
<asynchronous suspension>
<asynchronous suspension>
(elided one frame from package:stack_trace)

The test description was:
  disabled state (variant: Linux)
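
The top frame of the trace is `singleWhere`, which throws a `StateError` with the message "No element" when nothing in the list matches the predicate and no `orElse` is supplied. The sketch below reproduces only that behaviour in isolation; the list contents and the lookup are made up for illustration and are not taken from `.test_optimizer.dart`.

```dart
void main() {
  // Hypothetical list of known test files; illustrative only.
  final testFiles = ['button_test.dart', 'card_test.dart'];

  try {
    // Nothing matches and no `orElse` is given, so this throws.
    testFiles.singleWhere((file) => file == 'missing_test.dart');
  } on StateError catch (error) {
    print(error); // prints "Bad state: No element"
  }

  // Supplying `orElse` returns a fallback instead of throwing.
  final fallback = testFiles.singleWhere(
    (file) => file == 'missing_test.dart',
    orElse: () => 'unknown',
  );
  print(fallback); // prints "unknown"
}
```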

Expected behavior

The test should pass.

Screenshots

No response

Additional context and comments

No response

btrautmann commented 1 month ago

@pedromassango could you provide a sample (failing) test? I assume that it's passing locally? If so, what platform are you running locally?

pedromassango commented 1 month ago

> @pedromassango could you provide a sample (failing) test? I assume that it's passing locally? If so, what platform are you running locally?

It is a private project, but you can reproduce it with a simple golden test using alchemist.

btrautmann commented 1 month ago

> > @pedromassango could you provide a sample (failing) test? I assume that it's passing locally? If so, what platform are you running locally?
>
> It is a private project, but you can reproduce it with a simple golden test using alchemist.

Could you answer the other questions I had? Thank you!

pedromassango commented 1 month ago

macOS

Sorry, I missed this thread

btrautmann commented 1 month ago

I looked into this today and was not able to reproduce. I have a sample repo here.

The test is here.

A successful CI run was here. In case you're not able to see that, I've attached a screenshot of the run below.

[Screenshot of the CI run, 2024-09-23 at 12:05:29]

Of note, the CI goldens were produced on CI itself via this workflow. This has proven to be very reliable for us at Betterment to avoid any oddities in the way Flutter UI renders on different platforms, even when text is accounted for via the Ahem font. As noted in other tickets, half-pixels sometimes render differently depending on the platform, so it's recommended to generate your CI goldens on CI.

I'm going to close this as I don't think there's an action item for us, but please do feel free to open again if you have a reproducing sample or any questions. Thank you!

pedromassango commented 1 month ago

@btrautmann thank you, that seems to be the only way to make the CI tests work.

How do you make sure a new PR won't break a Widget? Will CI fail if the image differs from the existing one?

btrautmann commented 1 month ago

> How do you make sure a new PR won't break a Widget? Will CI fail if the image differs from the existing one?

Correct. Testing a Widget for something like a PR is business as usual: the test runs on CI (so the test image is generated by CI) and that image is compared against the master image. That's why it makes sense (and is most reliable) to generate master images on CI itself.
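
For illustration, a golden test of the kind discussed here might look like the sketch below. It is not taken from the project in question: the widget, the scenario names, and the file name `elevated_button` are made up, and it assumes alchemist's `goldenTest`, `GoldenTestGroup`, and `GoldenTestScenario` as described in the package's README.

```dart
import 'package:alchemist/alchemist.dart';
import 'package:flutter/material.dart';
import 'package:flutter_test/flutter_test.dart';

void main() {
  group('ElevatedButton golden tests', () {
    goldenTest(
      'renders correctly',
      // Hypothetical file name for the generated golden image.
      fileName: 'elevated_button',
      builder: () => GoldenTestGroup(
        children: [
          GoldenTestScenario(
            name: 'enabled',
            child: ElevatedButton(
              onPressed: () {},
              child: const Text('Press me'),
            ),
          ),
          GoldenTestScenario(
            name: 'disabled',
            // A null callback renders the button in its disabled state.
            child: ElevatedButton(
              onPressed: null,
              child: const Text('Press me'),
            ),
          ),
        ],
      ),
    );
  });
}
```

Running `flutter test --update-goldens` writes the master image for this test, and a plain `flutter test` afterwards compares the freshly rendered image against it and fails on a mismatch.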

pedromassango commented 1 month ago

That is amazing, I couldn't really find this information anywhere else, so just to make sure I got this right:

The flow is:

Dev adds a new feature/fix with golden tests
- No golden tests provided
  - do nothing
- Golden tests added
  - PR reviewer verifies that implementation is correct
  - PR gets approved & merged
  - CI generates golden files and commits them
- Next time a PR is opened...
  - GitHub Actions generates goldens and compares them with the existing ones
    - Test fails if image differs, dev makes necessary changes... cycle repeats
    - Test passes, new images are committed & stored

Is that the flow I should aim for, and the one you are using internally? I am just trying to get an understanding of this, and thanks for the insights.

btrautmann commented 1 month ago

@pedromassango yeah, you are correct, that's pretty much the workflow. The bit that isn't done for you out of the box is the generation of goldens via CI. There is a bit of discussion around the matter here, but nothing is currently planned to make this automatic.

At Betterment, our general flow is:

- Goldens are generated via a manually invoked workflow (kicked off via the GHA workflow dispatch GUI, pointed at the feature branch, e.g. bt/my-new-widget). This workflow runs flutter test --update-goldens and commits the changes to that branch.
- On every PR we run our golden tests normally via flutter test.
- Our most common use case for re-generating goldens is during Flutter upgrades, if we are impacted by any upstream Flutter changes. We use the same workflow from above to generate new goldens after inspecting the changes caused by the upgrade to ensure that they are minimal/trivial and not worth addressing.

From our standpoint, we don't really even use platform goldens except for local validations.

PS: Much of this is mentioned in the setup guide, worth a read!

pedromassango commented 1 month ago

That is nice.

> Goldens are generated via a manually invoked workflow (kicked off via the GHA workflow dispatch GUI, pointed at the feature branch, e.g. bt/my-new-widget). This workflow runs flutter test --update-goldens and commits the changes to that branch.

Would be nice to have an automation for this too.

Anyway, thanks for sharing. I believe it is now clear what workflow to have with golden tests.

pedromassango commented 1 month ago

> Goldens are generated via a manually invoked workflow (kicked off via the GHA workflow dispatch GUI, pointed at the feature branch, e.g. bt/my-new-widget). This workflow runs flutter test --update-goldens and commits the changes to that branch.

Hi @btrautmann, I have another question: does that mean that, the first time, I should open a PR with the golden tests, run flutter test --update-goldens, and then run the tests without the update flag?

My current setup is failing because the first images were generated on my laptop (macOS), and the comparison fails during CI.
