Closed andersekdahl closed 3 years ago
Hi @andersekdahl thanks for filing the issue. The callstack is helpful. Please give me some time to review the code to see if that could be the cause of the crash.
A local repro of the issue would help us track it down. Does the crash happen every time? Do you know the steps to reproduce it? What does memory usage look like before the crash?
In the meantime, given some bug fixes around the call stack you posted, would you mind trying a newer profiler version, 2.2.0-beta4? The relevant fixes are:
I'll let you know once the related code is checked.
It doesn't crash all the time; there's usually a couple of days between occurrences. I have never had it happen locally, only sporadically on ACI. Unfortunately I can't see any direct pattern in when it occurs; memory and CPU vary but are normal when it has happened.
I'll be sure to try the new beta as soon as I can, probably during the next couple of days.
Hi @andersekdahl Thanks for the quick turnaround. I finished reviewing the code. As I understand it, things happen the other way around:
DisableAsync();
There must be something else that is causing the crash of the container, either in other parts of the profiler or outside the profiler. We need to find out what that is.
Is there anything on the container instance side that we can review? Logs? Signals indicating the crash? Maybe try filing a support ticket with the container instance team to diagnose what is causing the crash?
Let us know. Thanks.
@andersekdahl Just a casual check, did you find anything new?
I haven't been able to deploy the update, for various reasons. We're still seeing the error every now and then, but it always happens at the same time as a crash. So I don't think this error is what caused the crash I reported.
@xiaomi7732 Just FYI, I just had the same error twice today with version v2.2.0.
It appears to be when the web server is shutting down.
2021-06-10T19:54:49.368262+00:00 Stopping all processes with SIGTERM
...
2021-06-10T19:54:55.491523+00:00 Unhandled exception. Microsoft.Diagnostics.NETCore.Client.ServerNotAvailableException: Process 3 not running compatible .NET Core runtime.
at Microsoft.Diagnostics.NETCore.Client.IpcClient.GetTransport(Int32 processId)
at Microsoft.Diagnostics.NETCore.Client.IpcClient.SendMessage(Int32 processId, IpcMessage message)
at Microsoft.Diagnostics.NETCore.Client.EventPipeSession.Stop()
at Microsoft.ApplicationInsights.Profiler.Core.TraceControls.DiagnosticsClientTraceControl.StopProfilerSession(Boolean disposeEventSessionImmediately)
at Microsoft.ApplicationInsights.Profiler.Core.TraceControls.DiagnosticsClientTraceControl.Disable()
at Microsoft.ApplicationInsights.Profiler.Core.TraceControls.DiagnosticsClientTraceControl.Dispose()
at Microsoft.Extensions.DependencyInjection.ServiceLookup.ServiceProviderEngineScope.DisposeAsync()
--- End of stack trace from previous location ---
at Microsoft.Extensions.Hosting.Internal.Host.DisposeAsync()
at Microsoft.Extensions.Hosting.HostingAbstractionsHostExtensions.RunAsync(IHost host, CancellationToken token)
at Pym.Apps.WebServer.Program.Main(String[] args) in /src/Pym.Apps.WebServer/Program.cs:line 15
at Pym.Apps.WebServer.Program.<Main>(String[] args)
@dstj Thanks for the report and the callstack. I'll take a quick look.
This looks like a case where the application is shutting down. I'll look into issuing a fix that swallows ServerNotAvailableException during application shutdown.
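For readers wondering what swallowing the exception on shutdown might look like, here is a minimal sketch. It is not the profiler's actual code: `TraceControlSketch`, `stopSession`, and `isShuttingDown` are hypothetical names, and `ServerNotAvailableException` is declared locally as a stand-in for the type in Microsoft.Diagnostics.NETCore.Client so that the snippet compiles without the package.

```csharp
using System;

// Stand-in for Microsoft.Diagnostics.NETCore.Client.ServerNotAvailableException,
// declared locally (assumption) so the sketch compiles without the NuGet package.
public sealed class ServerNotAvailableException : Exception
{
    public ServerNotAvailableException(string message) : base(message) { }
}

// Hypothetical wrapper; NOT the profiler's real DiagnosticsClientTraceControl.
public sealed class TraceControlSketch : IDisposable
{
    private readonly Action _stopSession;        // stops the EventPipe session
    private readonly Func<bool> _isShuttingDown; // true once the host is stopping

    public TraceControlSketch(Action stopSession, Func<bool> isShuttingDown)
    {
        _stopSession = stopSession ?? throw new ArgumentNullException(nameof(stopSession));
        _isShuttingDown = isShuttingDown ?? throw new ArgumentNullException(nameof(isShuttingDown));
    }

    public void Dispose()
    {
        try
        {
            _stopSession();
        }
        catch (ServerNotAvailableException) when (_isShuttingDown())
        {
            // The diagnostics IPC endpoint is already gone because the process
            // is exiting, so there is no session left to stop. Ignore it.
        }
        // Outside of shutdown the filter is false and the exception propagates,
        // so a genuinely unavailable runtime is still reported.
    }
}
```

In a hosted application, `isShuttingDown` could be something like `() => lifetime.ApplicationStopping.IsCancellationRequested`, where `lifetime` is an injected `IHostApplicationLifetime`. The exception filter keeps the catch narrow, so the same exception thrown during normal operation is not hidden.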
The fix is released in 2.3.0-beta2.
We're using the profiler (2.2.0-beta2) on Azure Container Instances with Linux (mcr.microsoft.com/dotnet/core/aspnet:3.1-buster-slim) and we receive a couple of errors in our logs from the profiler. Since it's a beta, we assumed that was expected, but yesterday we got the following error in our logs:
We're definitely running .NET Core and only running .NET Core. This error happened at the same time as the container crashed. Did this error cause our crash?