bUnit-dev / bUnit

bUnit is a testing library for Blazor components that makes tests look, feel, and run like regular unit tests. bUnit makes it easy to render a component under test and control its life cycle, pass parameters and inject services into it, trigger event handlers, and verify the rendered markup from the component using a built-in semantic HTML comparer.
https://bunit.dev
MIT License

1.24.10 unit tests hang randomly when run in parallel #1244

Closed groogiam closed 11 months ago

groogiam commented 1 year ago

Describe the bug

After updating to 1.24.10, unit tests randomly hang on both my local machine and CI when run in parallel. The behavior only seems to manifest when my CPU cores are fully saturated from running multiple test classes in parallel. I have not been able to figure out a rhyme or reason why, and I am at a bit of a loss on how to begin troubleshooting this. This was not happening with 1.22.19.

This may be related to https://github.com/bUnit-dev/bUnit/issues/1064.

Expected behavior:

Unit tests should not hang when run in parallel.

Version info:

egil commented 1 year ago

I'm quite sure it is not related to the previous issue. However, one change we did make between 1.22 and 1.24 that could affect you is that we now explicitly dispose the renderer first, before disposing any other services.

Try running your tests with --blame-hang-timeout 60s --blame-hang-dump-type full --blame-crash-dump-type full to get a dump when the hang happens. That dump can then be opened in Visual Studio and you can see where in the code there is a deadlock.

groogiam commented 1 year ago

I reverted to 1.23.9 and that seems to resolve the issue. It looks like this is the commit that disposes the renderer first:

https://github.com/bUnit-dev/bUnit/commit/d4d53d29c52598f5c93ed1806f71911a58dcd4ee

It was released after 1.23.9, so I'm guessing that is the culprit.

I'll run those diagnostics to see if it is something that can be handled in my code. If not, I'll post more details. Thanks for the quick reply.

egil commented 1 year ago

If that is indeed the culprit, it's an interesting timing issue, I think. The renderer was also disposed previously, but not necessarily first. So the only change now is that the renderer is disposed before other services. When the renderer is disposed, I am pretty sure it will dispose of the components it has rendered. So perhaps you have code in some of your components that is blocking that?

groogiam commented 1 year ago

Please see the image below. It seems like this is hanging on framework code in the RenderTreeRender, unless I'm misunderstanding how to navigate the crash dump file. If it helps, I can share the crash dump file privately over OneDrive. Thanks.

image

linkdotnet commented 1 year ago

@groogiam Does your CUT implement IAsyncDisposable and furthermore do you have any async disposable registered in the DI container?

groogiam commented 1 year ago

@linkdotnet I did a quick survey of my components and from what I can tell the only component or service in the Blazor stack that implements IAsyncDisposable is Toolbelt.Blazor.LoadingBar. I'll try disabling that service in my test to see if that helps with the issue.

egil commented 1 year ago

Cool. Keep us posted.

groogiam commented 1 year ago

Disabling the loading bar service did not seem to have any effect. It still hangs randomly, and more frequently in the CI environment. I may have stumbled upon another clue, though. The tests that seem to be failing do not use WaitForAssertion. They tend to do the following:

  1. Setup Service
  2. Setup Http Mocks
  3. RenderComponent
  4. WaitForState
  5. Assert Some Stuff without using WaitForAssertion

Adding WaitForAssertion seems to help, at least locally. I'm trying to verify that in my CI environment now.
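
For readers following along, the five steps above might look roughly like the sketch below, with the final assertion wrapped in WaitForAssertion. This is only an illustration under assumptions: it assumes xUnit and bUnit 1.x, and `IGreetingService`, `FakeGreetingService`, `Greeting`, and `GreetingTests` are hypothetical names that stand in for the real services, HTTP mocks, and components, none of which are shown in this thread.

```csharp
using System.Threading.Tasks;
using Bunit;
using Microsoft.AspNetCore.Components;
using Microsoft.AspNetCore.Components.Rendering;
using Microsoft.Extensions.DependencyInjection;
using Xunit;

// Hypothetical service standing in for an HTTP-backed dependency.
public interface IGreetingService
{
    Task<string> GetGreetingAsync();
}

// Hypothetical fake that completes asynchronously, like a mocked HTTP call.
public sealed class FakeGreetingService : IGreetingService
{
    public async Task<string> GetGreetingAsync()
    {
        await Task.Delay(50); // simulate latency so the first render shows the placeholder
        return "Hello";
    }
}

// Hypothetical component under test: loads data in OnInitializedAsync, then re-renders.
public sealed class Greeting : ComponentBase
{
    [Inject] public IGreetingService Service { get; set; } = default!;

    private string text = "Loading...";

    protected override async Task OnInitializedAsync()
    {
        text = await Service.GetGreetingAsync();
    }

    protected override void BuildRenderTree(RenderTreeBuilder builder)
    {
        builder.OpenElement(0, "p");
        builder.AddContent(1, text);
        builder.CloseElement();
    }
}

public class GreetingTests : TestContext
{
    [Fact]
    public void ShowsGreetingAfterLoad()
    {
        // Steps 1-2: register services (and any HTTP mocks) before rendering.
        Services.AddSingleton<IGreetingService, FakeGreetingService>();

        // Step 3: render the component under test.
        var cut = RenderComponent<Greeting>();

        // Step 4: block until the asynchronous load is reflected in the markup.
        cut.WaitForState(() => cut.Find("p").TextContent == "Hello");

        // Step 5: run the assertion through WaitForAssertion, which re-runs it
        // when the component renders until it passes or the timeout elapses.
        cut.WaitForAssertion(() => cut.MarkupMatches("<p>Hello</p>"));
    }
}
```

Both wait helpers accept an optional timeout argument, which may be worth raising on a heavily loaded CI agent. Whether the extra WaitForAssertion should be necessary after WaitForState is exactly the open question in this thread, so treat it as a workaround rather than a recommended pattern.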

groogiam commented 11 months ago

@egil @linkdotnet Do you guys have any other suggestions here? It does not seem like a WaitForAssertion should be required on tests that do a single render and then assert. I'm not sure if I'm reading the crash dumps correctly, but they do not seem to tell me what is actually causing the deadlock.

linkdotnet commented 11 months ago

There is a new pre-release - this one disposes components before the renderer itself (so a bit like 1.23 but not totally).

Could you try the latest pre-release?

groogiam commented 11 months ago

@linkdotnet The issue appears to be resolved in 1.26.4-preview. Thanks.

linkdotnet commented 11 months ago

Glad to hear. Please let us know if that behavior is now stable. If so, I would close the ticket.

groogiam commented 11 months ago

It appears to be stable. I ran about 15 successful test runs through our CI server yesterday.