Flakiness with MSW and Testing Library on CircleCI

SimonGodefroid commented 1 year ago

Hello everyone, I wasn't sure where to post this; it didn't fit any of the other issue types.

We have a code base with a relatively large test suite. We use Jest, and most tests are written with Testing Library for React, consuming either Jest mocks (which we are progressively removing in favor of MSW) or MSW handlers.

I'm writing this support question to find out whether there are any recommendations on how to run these tests on CI. We use CircleCI, and lately the more RTL + MSW tests we introduce, the more flakiness we see, even though we try to follow the best practices around await findBy*, waitForElementToBeRemoved and so on. We do have some cruft, but we're progressively fixing those issues.

I have been trying for the past couple of days to isolate msw + rtl tests (which are mostly located under the same folder) in order to try several pipeline setups.

Avoid parallelization of RTL + MSW: This seems to have reduced the "random" flakiness, meaning that more often than not it's the same tests that fail.
Increase resources on CI: This didn't change anything. The pipeline ran slightly faster, but with parallelism held constant at 5 we got the same amount of random flakiness.
Increase/reduce parallelism: Increasing parallelism to 10 and 12 makes for faster execution but the same behavior; reducing it to 1 means much slower execution but more successes.
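
For readers comparing notes on the experiments above: CircleCI's parallelism setting controls how many containers share the suite, while Jest's own maxWorkers option controls how many worker processes run inside each container, so the two can be tuned separately. Below is a minimal sketch of a CI-only Jest config that pins the latter. The file name jest.config.ci.ts and both numbers are assumptions for illustration, not values taken from this thread.

```typescript
// jest.config.ci.ts (hypothetical file name), selected with:
//   jest --config jest.config.ci.ts
const ciConfig = {
  // Cap the number of Jest worker processes per container instead of
  // letting Jest derive it from the CPU count it detects.
  // The value 2 is an illustrative assumption.
  maxWorkers: 2,
  // Per-test timeout in milliseconds (Jest's default is 5000).
  // The value 15000 is an illustrative assumption.
  testTimeout: 15000,
};

export default ciConfig;
```

Jest needs ts-node installed to read a TypeScript config file; the same object works unchanged in a plain JavaScript config.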

At the moment (experiments aside), about 50% of our pipeline runs fail purely because of flakiness. Locally everything works fine and tests seldom flake, but on the pipeline it happens very frequently.

We followed the CircleCI recommendations on --runInBand and we split tests by timing, but it seems we have reached the limits of this approach.
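
To make "split by timing" concrete, here is a rough sketch of the idea: distribute test files across parallel nodes so that each node gets a similar total duration. This is a simple greedy illustration, not the actual algorithm behind `circleci tests split --split-by=timings`; the `TimedTest` shape, the `splitByTiming` name and the sample durations are all hypothetical.

```typescript
// Hypothetical shape: a test file and its duration from a previous run.
interface TimedTest {
  file: string;
  seconds: number;
}

// Greedy "longest first" split: sort by duration (descending), then always
// assign the next file to the bucket with the smallest running total.
function splitByTiming(tests: TimedTest[], bucketCount: number): string[][] {
  if (!Number.isInteger(bucketCount) || bucketCount < 1) {
    throw new RangeError("bucketCount must be a positive integer");
  }
  const buckets = Array.from({ length: bucketCount }, () => ({
    total: 0,
    files: [] as string[],
  }));
  const longestFirst = [...tests].sort((a, b) => b.seconds - a.seconds);
  for (const test of longestFirst) {
    // On ties, the earliest bucket wins because the comparison is strict.
    const lightest = buckets.reduce((min, b) => (b.total < min.total ? b : min));
    lightest.files.push(test.file);
    lightest.total += test.seconds;
  }
  return buckets.map((b) => b.files);
}

// Made-up timings for illustration only.
const sampleSplit = splitByTiming(
  [
    { file: "a.test.tsx", seconds: 10 },
    { file: "b.test.tsx", seconds: 7 },
    { file: "c.test.tsx", seconds: 5 },
    { file: "d.test.tsx", seconds: 3 },
    { file: "e.test.tsx", seconds: 1 },
  ],
  2,
);
```

Each node would then run only its own bucket, for example by passing those file paths to jest together with --runInBand. Balanced timings only even out wall-clock time per node; they do not remove contention inside a node.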

I wish I could share more data; at the moment I'm experimenting by running pipelines with config tweaks to identify a pattern. Off the top of my head, I would say that parallelism unfortunately does not seem to work well with RTL + MSW, but I'd like to know whether anyone else has experienced this.

nathanhannig commented 1 year ago

I have had to increase my timeout limits for RTL. I think Jest and Node have memory leaks after version 14 that are degrading performance. I have also seen slowness in more recent versions of RTL that I haven't been able to pinpoint.
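
Some background on why raising the timeout can help: RTL's async utilities (findBy*, waitFor) retry until a deadline, 1000 ms by default, which can be raised globally with configure({ asyncUtilTimeout: ... }). The sketch below is a simplified stand-in for that polling behavior, not RTL's implementation; `pollUntil`, `slowResponse` and all the millisecond values are hypothetical. It shows the same delayed response passing under a generous deadline and failing under a tight one, which is roughly what happens when a CI container is slower than a local machine.

```typescript
// Retry `check` until it stops throwing or the deadline passes.
// Simplified illustration only; not how RTL implements waitFor.
async function pollUntil<T>(
  check: () => T,
  timeoutMs = 1000,
  intervalMs = 50,
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  let lastError: unknown;
  do {
    try {
      return check();
    } catch (error) {
      lastError = error;
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  } while (Date.now() < deadline);
  throw lastError;
}

// Simulates a mocked response that only becomes visible after `delayMs`.
function slowResponse(delayMs: number): () => string {
  let text: string | undefined;
  setTimeout(() => {
    text = "loaded";
  }, delayMs);
  return () => {
    if (text === undefined) throw new Error("not rendered yet");
    return text;
  };
}

// Same 300 ms latency, two different deadlines.
async function demo(): Promise<[string, string]> {
  const generous = await pollUntil(slowResponse(300), 2000).catch(() => "timed out");
  const tight = await pollUntil(slowResponse(300), 100).catch(() => "timed out");
  return [generous, tight];
}
```

If the async utility timeout is raised above Jest's per-test timeout (5000 ms by default), the Jest timeout has to be raised as well, otherwise Jest aborts the test first.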

eps1lon commented 1 year ago

It's unclear where the issue is and it probably only emerges in the full setup, making it impossible to tackle here.

I recommend you check out our community Discord and see if someone there is able to help you.

testing-library / dom-testing-library

Flakiness with MSW and Testing Library on CircleCI #1167