dotnet / wpf

WPF is a .NET Core UI framework for building Windows desktop applications.
MIT License

WPF Builds fail intermittently due to timeouts #952

Closed vatsan-madhavan closed 5 years ago

vatsan-madhavan commented 5 years ago

WPF builds fail intermittently when the "Run Developer Regression Tests on Helix Machine" stage fails to start within its timeout period (60 minutes).

Example of a failed build: https://dev.azure.com/dnceng/internal/_build/results?buildId=224164


_TestHelixAgentPool in eng\pipeline.yml is currently set to Windows.10.Amd64.Open%3bWindows.7.Amd64.Open%3bWindows.10.Amd64.ServerRS5.Open

It should be changed to Windows.10.Amd64.Open%3bWindows.7.Amd64.Open%3bWindows.10.Amd64.Client19H1.Open.

In short, stop running tests on ServerRS5. It's a "server core" machine and seems to have problems running tests right now. Change it to Client19H1.
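To make the proposed change concrete, here is a minimal sketch of the substitution in Python. It is not the actual pipeline change and does not reflect how eng\pipeline.yml is structured. It assumes that `%3b` is the URL-encoded `;` separating the queue names in `_TestHelixAgentPool`; the helper name `swap_queue` is hypothetical.

```python
from urllib.parse import unquote

# Value quoted in this issue for _TestHelixAgentPool (before the change).
CURRENT = (
    "Windows.10.Amd64.Open%3bWindows.7.Amd64.Open%3b"
    "Windows.10.Amd64.ServerRS5.Open"
)


def swap_queue(encoded: str, old: str, new: str) -> str:
    """Replace one queue name in a %3b-separated list, preserving order.

    Hypothetical helper for illustration; assumes %3b encodes ';'.
    """
    queues = unquote(encoded).split(";")
    if old not in queues:
        raise ValueError(f"{old!r} is not in the queue list")
    updated = [new if q == old else q for q in queues]
    # Re-join with the same encoded separator the original value uses.
    return "%3b".join(updated)


PROPOSED = swap_queue(
    CURRENT,
    old="Windows.10.Amd64.ServerRS5.Open",
    new="Windows.10.Amd64.Client19H1.Open",
)

# Show the decoded queue names, one per line.
for queue in unquote(PROPOSED).split(";"):
    print(queue)
```

The other two queues are left untouched; only the ServerRS5 entry is swapped for Client19H1, which matches the value proposed above.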

vatsan-madhavan commented 5 years ago

/cc @MattGal fyi. If you have an issue in core-eng to track fixup of the ServerRS5 pool, please link it or let us know. We may want to use it again in the near future.

vatsan-madhavan commented 5 years ago

@merriemcgaw, @zsd4yr, are you testing on Windows Server Core SKUs?

MattGal commented 5 years ago

@vatsan-madhavan I think the machines are actually OK; they just take an insane amount of time to provision (likely related to having IIS, SQL Server, Git, JDK, and Docker on them?). I filed https://github.com/dotnet/core-eng/issues/6710 but closed it.

My comments before remain; there's literally no reason (though it might work) to run WPF apps on a Server RS5 machine.

ryalanms commented 5 years ago

For reference, here is the MUX configuration file:

https://github.com/microsoft/microsoft-ui-xaml/blob/master/build/AzurePipelinesTemplates/MUX-RunHelixTests-Job.yml

vatsan-madhavan commented 5 years ago

> My comments before remain; there's literally no reason (though it might work) to run WPF apps on a Server RS5 machine.

It's a long discussion, but there may be good reasons to run tests on Server Core editions. Perhaps not just yet though.

Our customers have been asking for official support on server core for a while, and most of WPF "works" on Server Core (and even WinPE!) reasonably well - and has done so for a long time now. We don't offer official (or any) support on those OS editions because we ourselves don't have a clear picture of all the failure modes that are possible, and haven't done any work to fail gracefully, harden the product, etc. That said, running tests on server core is certainly not a "literally no reason" type of scenario :-) There is even a Server Core App Compatibility Feature On Demand package, BTW.

vatsan-madhavan commented 5 years ago

The PR build to fix this also failed for the same reason :|

https://dev.azure.com/dnceng/public/_build/results?buildId=224573

vatsan-madhavan commented 5 years ago

@MattGal, I tried again and the build finished after around 46 mins. Our tests should really only need a few minutes to run (<5, IIRC). Is there anything we can do to make this quicker - esp. not run up so close to the timeout?

https://dev.azure.com/dnceng/public/_build/results?buildId=224573

https://mc.dot.net/#/user/dotnet~2Fwpf/dotnet~2Fwpf~2Frefs~2Fpull~2F953~2Fmerge/tests~2Fdrt/20190613.18

merriemcgaw commented 5 years ago

@AdamYoblick what are we running our tests on? We're also running into intermittent timeout issues IIRC, right?

AdamYoblick commented 5 years ago

Our timeouts are something different that I’m looking at as we speak. They have to do with some calls to CreateHandle in our unit tests that open exception windows that never close.



vatsan-madhavan commented 5 years ago

Reopening this as the timeouts didn't get fixed.

vatsan-madhavan commented 5 years ago

Related: https://github.com/dotnet/core-eng/issues/6608