Closed piotrzbyszynski closed 4 years ago
@jkotalik a possible issue with the in-proc console output?
@pzbyszynski 2.2 is end-of-life and no longer supported. Does the issue reproduce on 3.0 or 3.1 (the releases which are still in support)
This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 4 days. It will be closed if no further activity occurs within 3 days of this comment. If it is closed, feel free to comment when you are able to provide the additional information and we will re-investigate.
See our Issue Management Policies for more information.
@anurse Here is what is happening:
preprod
slotpreprod
slot (.net core 3.0 app) app worked without major problemsI have completely no idea what is happening. I see no exceptions and virtually no traffic in application insights. At the same time, I can not access my app. It seems like no traffic is directed to the .net core app after a swap
I have also set up completely new environment with my .net core app behind a traffic manager. I can access this app via traffic manager and directly. The traffic manager reports endpoint as online
. But as soon as I redirect my main domain to traffic manager the same story happens. I can not access my app (directly and via traffic manager) I see no traffic & exceptions in application insights, and traffic manager reports my site as degraded.
This is very frustrating, it makes our app inaccessible for our users during swaps, prevents us from releasing new versions and hurts our business. With 90K active users, it is a bit of a problem as you can imagine. So any help would be highly appreciated
Also, I have found the following messages in an application event log (maybe it will be helpful)
Error:
CInstrumentationMethod::Initialize - failed to initialize instrumentation method PID: 9180 hr: 80131375 name: Production Breakpoint Instrumentation Method
Followed by info:
Incoming-calls pipe shutdown.
@pzbyszynski Do you have the stdout log feature enabled in ASP.NET Core Module? Does the issue reproduce if you disable it? Or if you try the out-of-proc mode in ANCM? The stdout log isn't really designed as a logging system for use under high load in production systems (though I don't see why it shouldn't work).
- I have swapped slots, so all the traffic was routed to .NET Core 3.0 app and the app died
So, to be clear, the issue only occurs when under load?
Also, I have found the following messages in an application event log (maybe it will be helpful)
What are these being logged by? What's the event source, etc.?
It seems like something is corrupting the Stdout File Handle, which is breaking System.Console. We do mess with that in the in-proc ANCM handler in order to capture the stdout log so there might be something going on (that's why I suggested disabling that). To move forward with debugging this, we'll need to get something more complicated like a process dump. Be aware that process dumps contain the entire memory content of your application so you shouldn't share production process dumps in public forums like this. Azure/Microsoft Support is better equipped to handle interactive dump debugging.
We can't repro this in our environments
- I have upgraded my app to .NET Core 3.0
Also, note that 3.0 support is ending soon, I'd suggest migrating to 3.1 (which is a long-term support release). The migration path is very simple and should only require minor (if any) changes to your app.
@anurse
@pzbyszynski Do you have the stdout log feature enabled in ASP.NET Core Module? Does the issue reproduce if you disable it? Or if you try the out-of-proc mode in ANCM? The stdout log isn't really designed as a logging system for use under high load in production systems (though I don't see why it shouldn't work).
I have to investigate, but I doubt we use any of these. I will get back to you with this info
So, to be clear, the issue only occurs when under load?
I can not be sure, but this is exactly how it looks like
What are these being logged by? What's the event source, etc.?
I can access this logs via App Service -> Diagnose and solve problems -> Diagnostic tools -> Application event log
on Azure Portal
CInstrumentationMethod::Initialize - failed to initialize instrumentation method PID: 9180 hr: 80131375 name: Production Breakpoint Instrumentation Method
is logged by Instrumentation Engine
Incoming-calls pipe shutdown.
is logged by Production Brakepoints
P.S. We will upgrade to .net core 3.1 in the nearest feature
@anurse I can confirm that we do not use stdout log
Here is web.config
fragment:
<aspNetCore processPath="dotnet" arguments=".\PerfectGym.GoApi.dll" stdoutLogEnabled="false" stdoutLogFile=".\logs\stdout" hostingModel="InProcess" />
CInstrumentationMethod::Initialize - failed to initialize instrumentation method PID: 9180 hr: 80131375 name: Production Breakpoint Instrumentation Method
is logged by Instrumentation Engine
Incoming-calls pipe shutdown.
is logged by Production Brakepoints
Ok, I think these aren't related then.
@pzbyszynski let us know if upgrading to 3.1 fixes this issue!
This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 4 days. It will be closed if no further activity occurs within 3 days of this comment. If it is closed, feel free to comment when you are able to provide the additional information and we will re-investigate.
See our Issue Management Policies for more information.
@jkotalik I have upgraded to 3.1. I do not see the original error but I still can not get my app up and running on .NET Core & Azure. The app works under very light load (when tested on slot or when excluded from traffic manager), but as soon it starts receiving production traffic it dies.
I have turned on and off all the logging I know about (stdout, diagnostic logging, app service logs, application insights) and I do not see any meaningful errors. I did several memory dumps and profiler traces - nothing. I even set up a new app service plan in a different region, without any luck. I have engaged MS support, also without luck. After several seconds under production load app just dies, it just stops responding (like it stoped receiving traffic). It is super frustrating, and to be honest I have run out of ideas
@pzbyszynski Without seeing more of your application, we can't really help diagnose an issue like this (and in general Microsoft Product Support is the best way to get more in-depth help diagnosing performance issues in your app; feel free to let me know any issues you had when you contacted them)
Just chiming in - we're experiencing a similar error with an API application on .NET Core 3.1. First, when an app gets some traffic, we get an error along the lines of:
CInstrumentationMethod::Initialize - failed to initialize instrumentation method PID: 11280 hr: 80131375 name: Production Breakpoint Instrumentation Method
Source is "Instrumentation Engine"
After this, the app freezes, finally clients experience 502 Bad Gateway errors, and the app recycles and is back up again.
Not sure if it's the same issue exactly, but definitely sounds similar?
This has been reported to Microsoft Azure Support, and I'm waiting to hear back from them.
I believe those indicate an issue with the Azure monitoring components, so Azure Support is probably the right place to ask for assistance.
@koskila @anurse I made some progress on this issue. 2 things that helped for me:
ServicePointManager.DefaultConnectionLimit = 512;
System.Threading.ThreadPool.GetMaxThreads(out int _, out int completionThreads);
System.Threading.ThreadPool.SetMinThreads(512, completionThreads);
Based on this article (https://docs.microsoft.com/en-us/archive/blogs/vancem/diagnosing-net-core-threadpool-starvation-with-perfview-why-my-service-is-not-saturating-all-cores-or-seems-to-stall) I assumed that the problem is thread pool starvation.
I have checked my code, and in fact, I had few calls to async methods inside synchronous code, like so:
Task.Run(() => someMethodAsync()).Result;
In my mind, it should not cause thread blocking, but I have changed all my code to use proper async / await
all the way down. It didn't help much.
Bottom line is, that my app that serves 90K users before migration run on .NET Classic on 3 x S2 Azure App Service plan with plenty of headroom, and average response time < 200ms.
Now, the very same app runs on .NET Core on 6 x S2 App Service plan with average response time ~400ms. So now, I am trying to figure out how to use PerfView
to track down performance issues.
@anurse @jkotalik if you have any suggestions or ideas just drop me a line, please
One more thing, from Diagnose and solve problems -> Availability and performance
report, I can see lots of HTTP 400 (604) - Client Gone (The client that originated the request seems to have closed the connection before the response was received!)
errors (reported by load balancer). What is interesting, if I scale App Service plan down, I get more of these and app availability drops, but processor load is basically always around 50-60% - that is also why I suspect thread pool starvation
I managed to solve my problem. As I suspected it was caused by thread pool exhaustion.
The solution is using await / async
all the way down. In .NET classic, it was often not possible, but in .NET Core it is simply a must, so refactor your code when porting to .NET Core.
If you don't you probably won't even notice under moderate load, but under heavy load, it will bite your head off...
The solution is using
await / async
all the way down. In .NET classic, it was often not possible, but in .NET Core it is simply a must, so refactor your code when porting to .NET Core.
Yep, this is something we cover in our Performance Guide which I'd suggest taking a look at. Blocking calls in an async path are definitely problematic.
Thank you for contacting us. Due to a lack of activity on this discussion issue we're closing it in an effort to keep our backlog clean. If you believe there is a concern related to the ASP.NET Core framework, which hasn't been addressed yet, please file a new issue.
This issue will be locked after 30 more days of inactivity. If you still wish to discuss this subject after then, please create a new issue!
Hi,
after migrating my app from web api classic to web api core 2.2 my app fails to start on Azure App Service (locally it works just fine)
The exception message:
An error occurred while writing to logger(s). (The handle is invalid) The handle is invalid
Failed method:
System.ConsolePal+WindowsConsoleStream.Write
Stack Trace:
Also, in Azure application event logs I can see following errors:
Any help would by highly appreciated