microsoft / azure-container-apps

Roadmap and issues for Azure Container Apps

Feature Request: Application Memory Dump #617

Open aarondandy opened 1 year ago

aarondandy commented 1 year ago

Is your feature request related to a problem? Please describe.
Sometimes software has memory leaks and other issues that impact application stability and performance. Being able to easily get a process memory dump greatly improves the chances of finding and correcting the issue. Here is an example of an issue we recently encountered: https://github.com/dotnet/SqlClient/issues/1810. It was extremely difficult to get a memory dump from the container, even after getting a terminal session within it, because taking a dump while the memory leak was already a problem caused the replica to restart before we could complete and extract the dump. I had to migrate the background worker off of ACA and back onto Azure Web Apps to successfully and reliably get memory dumps in production and find the issue.

Describe the solution you'd like.
I would like an easy way within the Azure Portal to generate a memory dump for a process running within a container and download it to my local workstation, while still being able to use the normal dotnet/aspnet container base images for our .NET applications. Example: https://azure.github.io/AppService/2021/11/01/Diagnostic-Tools-for-ASP-NET-Core-Linux-apps-are-now-publicly-available.html#collection-in-kudu

Describe alternatives you've considered.
We have migrated off of ACA and back to Azure Web Apps in this situation and used their /newui tool to download a dotnet process memory dump: https://techcommunity.microsoft.com/t5/apps-on-azure-blog/new-kudu-ui-for-app-service-on-linux-preview/ba-p/3212270

Additional context.

TrayanZapryanov commented 1 year ago

Any news here? We have a similar problem and need dump functionality to understand it. We have 10 replicas; all but one work well and behave similarly, while that one keeps doubling its CPU and memory usage. There is something specific to that replica, but without several dumps it is really hard to solve the problem with metrics and logs alone. Please suggest at least some workaround for taking a dump, even if it takes 10 or more steps.

pmsousa commented 9 months ago

@SophCarp can you provide some insight into this? It would be valuable for dev teams debugging issues with container apps.

AdeZwart commented 5 months ago

We also need this in our team! Please add something for it.