microsoft / service-fabric

Service Fabric is a distributed systems platform for packaging, deploying, and managing stateless and stateful distributed applications and containers at large scale.
https://docs.microsoft.com/en-us/azure/service-fabric/
MIT License

Troubleshooting High CPU usage for Actors #1263

Closed davidelettieri closed 3 years ago

davidelettieri commented 3 years ago

We are seeing frequent 100% CPU usage from our actors. We run a five-machine Service Fabric cluster on Azure and observe a recurring pattern where, on one or more machines, CPU usage gradually climbs to 100% over a few hours.

So far we have been restarting the actors from Service Fabric Explorer, but this is becoming more and more frequent. Is there any guidance on how to troubleshoot such issues?

Environment:

davidelettieri commented 3 years ago

It looks like the process is spawning a lot of threads with very similar stack traces, such as:

1, ntoskrnl.exe!KeWaitForMultipleObjects+0x1284
2, ntoskrnl.exe!KeWaitForMultipleObjects+0xb3f
3, ntoskrnl.exe!KeWaitForMultipleObjects+0x4fe
4, ntoskrnl.exe!ObWaitForMultipleObjects+0x2c7
5, ntoskrnl.exe!IoFreeMiniCompletionPacket+0xe25
6, ntoskrnl.exe!setjmpex+0x6933
7, ntdll.dll!NtWaitForMultipleObjects+0x14
8, KernelBase.dll!WaitForMultipleObjectsEx+0xef
9, coreclr.dll!coreclr_initialize+0x7f6f6
10, coreclr.dll!coreclr_initialize+0x7f475
11, coreclr.dll!coreclr_initialize+0x7ef2b
12, coreclr.dll!coreclr_initialize+0x7ecdb
13, 0x7ffee2347e2a

craftyhouse commented 3 years ago

Please post questions and discussions to Microsoft Q&A or Stack Overflow, as suggested here: https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-support#post-a-question-to-microsoft-qa