Open AArnott opened 4 years ago
Pipes is part of IO. Pipelines has its own label.
@AArnott if you can set up a repro, that would make it more likely we can investigate this soon.
These folks have the repro: @genlu @RyanToth3 @tmat Can one of you share it?
@RyanToth3 I think you have a simpler repro with ServiceHub tests? Could you please share that? Otherwise, following these instructions should repro it (basically, this runs Roslyn OOP on .NET Core ServiceHub):
Go here https://devdiv.visualstudio.com/DevDiv/_packaging?_a=feed&feed=VSIDE-RealSigned-Release%40Local, and search for “microsoft.servicehub.host.clr”.
Download the host nuget package. The newest version of the package should be fine as long as you’re using a recent-ish build of VS.
Unzip the package and under the tools directory, copy the coreClr directory to your VS install under Common7/ServiceHub/Hosts.
Open Common7\ServiceHub\Hosts\coreClr\ServiceHub.Host.CLR.runtimeconfig.json, and change it to target 3.1:
{
  "runtimeOptions": {
    "tfm": "netcoreapp3.1",
    "framework": {
      "name": "Microsoft.NETCore.App",
      "version": "3.1.0"
    }
  }
}
Install this Roslyn vsix https://microsoft-my.sharepoint.com/:u:/p/gel/Ea411yRcX31Pq3P331UR2L0B28X3TIbHKxt1WW7Ywc3VNw?e=oDXjqW
Set these environment variables to control the behavior of Roslyn remote service:
Launch VS, open a C# project and wait a bit for servicehub to spin-up.
Here's an app that repros the issue: https://microsoft-my.sharepoint.com/:u:/p/rytoth/EfzfFjfQd8FLu2_dUrdoCpUBxc0LQUV8Raij9OsIgJeyuw
Just run "dotnet ServiceHub.Sample.NugetClientApp.dll" and you should see the error printed to the console.
Perhaps a race condition between completion of the async I/O and its cancellation?
That doesn't agree with the frequency you are seeing this.
I see the code you pointed out only checks whether the handle IsInvalid, not whether it is closed. I see that the ReleaseHandle method doesn't actually clear the handle:
https://github.com/dotnet/runtime/blob/6072e4d3a7a2a1493f514cdf4be75a3d56580e84/src/libraries/System.IO.Pipes/src/Microsoft/Win32/SafeHandles/SafePipeHandle.Windows.cs#L16
If I had to guess, this isn't a race with completion; it's just that you are calling cancel on a pipe that was disposed. I haven't dug into the dumps or repro yet, but that's my hunch.
@AArnott did you get a chance to check if the pipe was being disposed before calling cancel as Eric suggested?
@carlossanlop I didn't interpret @ericstj's comment as a suggestion toward me as much as speculation as to the conditions leading to the failure for consideration in a fix. It's quite possible that we close a pipe and then cancel a token that was used in I/O that was pending on that pipe. It's likely that races exist where the two steps happen concurrently, even if in this rather deterministic repro it's consistently ordered as dispose-then-cancel. Investigating the order in which this happens on our side would take a few hours at least. So if either of you confirms it would help in your investigation and fix, we can pay that cost. But even if a race isn't responsible for this crash, the code does seem to be vulnerable to the race condition as well as to a simply sequential dispose-then-cancel. I hope both can be fixed, if I'm right.
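The dispose-then-cancel ordering discussed above can be illustrated with a small simulation. This is only a sketch under stated assumptions: FakeHandle, CancelIo and DisposeThenCancelDemo are hypothetical names invented for illustration and are not the real SafePipeHandle or PipeCompletionSource code. It shows that an exception thrown by a cancellation callback surfaces from CancellationTokenSource.Cancel() wrapped in an AggregateException, and that a callback which tolerates an already-disposed handle lets Cancel() return normally.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a pipe handle; NOT the real SafePipeHandle.
sealed class FakeHandle : IDisposable
{
    private bool _disposed;

    public void Dispose() => _disposed = true;

    // Simulates a native cancel call that rejects a disposed handle.
    public void CancelIo()
    {
        if (_disposed)
            throw new ObjectDisposedException(nameof(FakeHandle));
    }
}

static class DisposeThenCancelDemo
{
    // Disposes the handle first, then cancels the token.
    // Returns the exception that escaped the callback, or null if none did.
    public static Exception Run(bool guardCallback)
    {
        using var cts = new CancellationTokenSource();
        var handle = new FakeHandle();

        cts.Token.Register(() =>
        {
            if (!guardCallback)
            {
                handle.CancelIo(); // throws once the handle is disposed
                return;
            }

            try
            {
                handle.CancelIo();
            }
            catch (ObjectDisposedException)
            {
                // Handle already closed: there is no pending I/O left to cancel.
            }
        });

        handle.Dispose(); // dispose first...

        try
        {
            cts.Cancel(); // ...then cancel
            return null;
        }
        catch (AggregateException ex)
        {
            // Cancel() aggregates exceptions thrown by registered callbacks.
            return ex.InnerException;
        }
    }

    static void Main()
    {
        // Prints "ObjectDisposedException", then "no exception".
        Console.WriteLine(Run(guardCallback: false)?.GetType().Name ?? "no exception");
        Console.WriteLine(Run(guardCallback: true)?.GetType().Name ?? "no exception");
    }
}
```

In the unguarded case, a caller that invokes Cancel() on a threadpool thread without its own try/catch would see the exception reach the threadpool, which matches the crash described in this issue. Whether the actual fix should catch the exception or check the handle state differently is left to the runtime maintainers.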
Description
In switching our process from .NET Framework to .NET Core, it (nearly?) always crashes on shutdown. The crashing exception is shown below, which reveals what appears to be an attempt to cancel async I/O. The CancellationToken registration callback calls CancelIoEx after checking that the handle is valid. CancelIoEx then throws because the handle is disposed. Perhaps a race condition between completion of the async I/O and its cancellation?
https://github.com/dotnet/runtime/blob/5d1af65dc66d289d64e54814a4d5e91412b75fe4/src/libraries/System.IO.Pipes/src/System/IO/Pipes/PipeCompletionSource.cs#L145
Instead of a crash, I expect canceling async I/O to not throw ObjectDisposedException and instead quietly cancel or complete.
The exception thrown becomes a crash because I'm calling CancellationTokenSource.Cancel() on a threadpool thread, such that when one of the cancellation callbacks throws, it propagates through my frame and to the threadpool itself.
Configuration
The dump came from .NET Core runtime 5.0.0-preview.7.20364.11 @Commit: 53976d38b1bd6917b8fa4d1dd4f009728ece3adb
x64 process.
Regression?
Yes. The .NET Framework version of this process never crashed in this way as far as we've noticed.
Other information
dump file