Closed RussKie closed 3 years ago
/cc: @hughbe
Looks like the test deadlocks on itself, there are no other user-code executed:
When it looks like its "deadlocking on itself" it instead is usually sending a window message to a control without message loop.
SendMessageOnUserPreferenceChange seems to send a message to the desktop, which probably tries to notify all forms, so if there is any that has not been properly disposed and still hanging around without a message loop you'll probably see this hang.
You could try running the test under RemoteExecutor like the other one @hughbe recently moved, forgot which it was, something with visual styles I think. Though considering that its sending to Desktop it may actually broadcast across processes, so not entirely sure if RemoteExecutor helps here.
Alternatively figure out which control is not properly disposed, but that probably needs some tracking infrastructure.
[edit] might have misunderstood the comments, "desktop" might mean "Desktop Framework" and not the "Desktop HWND", going to check the implementation of SystemEvents
The SystemEvents class will "take over" the STA thread from which it is invoked (by creating a HWND on it that it never disposes) and expects it to live until the process terminates.
That means SystemEvents are incompatible with multiple UI threads UI threads that terminate early, you'll have to test SystemEvents in a RemoteExecutor.
Alternatively, if you could force SystemEvents to be first accessed on an MTA thread, then it would create and maintain its own STA thread and not cause failures. But I don't see how you can easily enforce that the first access to SystemEvents happens on MTA.
Thank you.
Alternatively figure out which control is not properly disposed,
That's probably not too difficult - throw an exception in ~Control
:) Though I think we could drown in those.
You could try running the test under RemoteExecutor
The trouble with the RE - https://github.com/dotnet/arcade/issues/5325 So while it is runnable locally, it doesn't appear to be runnable on the build agents...
The SystemEvents class will "take over" the STA thread from which it is invoked and expects it to live until the process terminates.
That means SystemEvents are incompatible with multiple UI threads, you'll have to test SystemEvents in a RemoteExecutor.
Alternatively, if you could force SystemEvents to be first accessed on an MTA thread, then it would create and maintain its own STA thread and not cause failures. But I don't see how you can easily enforce that the first access to SystemEvents happens on MTA.
Wow, thank you!
That's probably not too difficult - throw an exception in ~Control :) Though I think we could drown in those.
Yeah and it doesn't help here either way, the HWND which deadlocks doesn't inherit from Control, SystemEvents has its own implementation.
If it deadlocks on a undisposed control, wouldn't we just need to catch those via the finalizer calls? Or is there something else that you have in mind?
Its a native control (HWND) which is created for message dispatching purposes (not visible) - it doesn't inherit from WinForms Control (SystemEvents doesn't seem to be part of WinForms, had to inspect the assembly in ILSpy to see what it was doing, the source doesn't seem to be indexed on source.dot.net either). The HWND is created the first time someone accesses SystemEvents, it checks the apartment. If its STA it creates the HWND on the current thread, otherwise it creates a separate STA thread.
So if your first invocation of SystemEvents is on a WinFormsFact then SystemEvents will create its broadcast HWND on that thread, but the thread is not going to live until process exit, so any message sent to SystemEvents by anything else than the very first WinFormsFact will hang.
FYI it is implemented here: https://github.com/dotnet/runtime/blob/master/src/libraries/Microsoft.Win32.SystemEvents/src/Microsoft/Win32/SystemEvents.cs
Would a fix be to change the WinFormsFact
to be Fact
?
FYI it is implemented here
Thanks, the line I'm referring to is here. Uninitialization only happens on AppDomain (on Desktop) or Process shutdown, obviously neither is going to happen when your UI threads finish their test.
Would a fix be to change the WinFormsFact to be Fact?
Probably, but you need to enforce that on every usage of SystemEvents (including indirect usages). I don't know if WinForms itself depends on SystemEvents, but if the first invocation happens indirectly through Application.Run or a Form constructor or something similar you still are flaky since you depend on the "right" test to be executed first.
Running it in RemoteExecutor seems to be the safer thing to do to make this particular test reliable.
We had seen this bug in the .NET framework, in Watson crash dumps and from customer service. Here is a small repro in the .Net Framework. Would be good to have it fixed.
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Windows.Forms;
namespace TwoStaThreads
{
static class Program
{
[STAThread]
static void Main()
{
Thread staThread = new Thread(NonAlwaysPumpingThreadProc)
{
IsBackground = true
};
staThread.SetApartmentState(ApartmentState.STA);
staThread.Start();
Application.Run(new Form
{
Text = "Main UI Thread"
});
}
static void NonAlwaysPumpingThreadProc()
{
Form f1 = new Form
{
Text = "non-pumping thread - close me",
Size = new System.Drawing.Size(600, 100)
};
f1.ShowDialog();
// at this point we are no longer pumping messages,
// but the thread and it's message only marshaling window is alive
while (true)
{
Thread.Sleep(1000);
}
}
}
}
Just for the record, your snippet has a bug, a form that is displayed with ShowDialog
needs explicit disposal, you only get automatic disposal when calling Show
. That is because after ShowDialog
the user usually wants to read out dialog results and if the handles are already gone it may not be possible to read all entered data anymore.
I don't think you can do anything if a STA thread decides to no longer pump messages, this is basically a bug in user code the user has to fix. Any COM call on an STA object created on this thread would hang as well, so this problem is not .NET specific. Its a general problem and the users fault, marking a thread STA means it needs to pump messages or be shut down.
The design issue with SystemEvents is that there is no infrastructure to uninitialize itself when a UI thread shuts down (which is different from stopping to pump messages). It is designed assuming there is only one UI thread and it will live to the end of the process/app domain lifetime.
This is something we've gotten user feedback on (CSS & Watson).
This could be fixed in this test, but it would be even better to re-design how SystemEvents are initialized.
would have to open issue in dotnet/runtime. @weltkante you're an infinitely busy guy but if could open an issue, would be awesome since you know everything!
Created an issue, not sure I can do much more, basically just analyzed the stacktrace and did some guesswork. The cleanest solution would be if thread shutdowns were observable, then it could automatically unregister the HWND before the thread is gone.
~[edit] apparently they moved my issue back to winforms, waiting for this to be resolved~
This has now been fixed in the runtime for .NET 6
Thank you ;)
Problem description:
After #3226 was merged,
ProfessionalColorTable_ChangeUserPreferences_GetColor_ReturnsExpected
tests started deadlocking in x86 mode that caused CI builds to fail again. I managed to reproduce the deadlock locally, though it took few attempts to do so.Looks like the test deadlocks on itself, there are no other user-code executed:
Expected behavior:
The tests work as expected.
Minimal repro:
A repro is bit convoluted. Unfortunately
build.cmd -test -platform x86
command doesn't appear to work unlessWinforms.sln
is configured for x86 platform (which breaks other modes).