Azure / AKS

Azure Kubernetes Service
https://azure.github.io/AKS/

High memory consumption with v1.25.2 #3443

Open smartaquarius10 opened 1 year ago

smartaquarius10 commented 1 year ago

Team,

Since the day I updated AKS to v1.25.2, I have been seeing huge spikes and node memory pressure issues.

Pods are being evicted and the nodes are consistently at 135 to 140% memory usage. Everything was working fine while I was on 1.24.9.

Just now, I saw that portal.azure.com has removed v1.25.2 from the Create new --> Azure Kubernetes cluster section. Does this version of AKS have a known problem? Should we switch to v1.25.4 immediately to resolve the memory issue?

I have also observed that AKS 1.24.x used Ubuntu 18 node images, while AKS 1.25.x uses Ubuntu 22. Is this the reason behind the high memory consumption?

Kindly suggest.

Regards, Tanul


My AKS configuration: 8 nodes of Standard B2s size, as it's a non-prod environment. Pod structure: below are the pods running inside the cluster and their memory consumption, excluding the default Microsoft pods (which take 4705 Mi of memory in total):

  • Daemon set of AAD pod identity: 191 Mi of memory in total
  • 2 pods of Kong: 914 Mi of memory in total
  • Daemon set of Twistlock vulnerability scanner: 1276 Mi of memory in total
  • 10 pods of our .NET microservices: 820 Mi of memory in total

xuanra commented 1 year ago

Hello, we have the same problem with version 1.25.4 in our company AKS.

We are trying to upgrade an app to OpenJDK 17 to check whether this new LTS Java version mitigates the problem.

Edit: In our case, the .NET apps needed to change the NuGet package for Application Insights.

Regards,

smartaquarius10 commented 1 year ago

@xuanra, my major pain point is these 2 pods (out of the 9):

  • ama-logs
  • ama-logs-rs

They always take more than 400 Mi of memory; it's very difficult to accommodate them on B2s nodes.

My other pain point is these 16 pods (8 of each):

  • csi-azuredisk-node
  • csi-azurefile-node

They take 910 Mi of memory. I even raised a support ticket, but customer support was unable to figure out whether we are using them or not, and could not advise whether or why we should keep them.

Still looking for a better solution to handle the non-prod environment...

lsavini-orienteed commented 1 year ago

Hello, we are facing the same problem with memory spikes after moving from v1.23.5 to v1.25.4. We had to increase the memory limit of most of our containers.

smartaquarius10 commented 1 year ago

@miwithro @ritazh @Karishma-Tiwari-MSFT @CocoWang-wql @jackfrancis @mainred

Hello,

Extremely sorry for tagging you, but our whole non-prod environment is not working. We haven't upgraded our prod environment yet; however, engineers are unable to work on their applications.

A few days back, we approached customer support about the node performance issues but did not get a useful response.

I would be really grateful for help and support on this, as it seems to be a widespread problem.

smartaquarius10 commented 1 year ago

I need to share one finding. I have just created 2 different AKS clusters, one on v1.24.9 and one on v1.25.4, each with 1 node of Standard B2s.

These are the metrics. In the case of v1.25.4 there is a huge spike after enabling monitoring.

(screenshot: node memory metrics for the two clusters)

cedricfortin commented 1 year ago

We've got the same problem with memory after upgrading AKS from version 1.24.6 to 1.25.4.

In the memory monitoring for the last month of one of our deployments, we can clearly see the memory usage increase after the update (01/23): (screenshot)

xuanra commented 1 year ago

Hello, our cluster has D4s_v3 machines. Across all our Java and .NET pods, we still haven't found any pattern distinguishing the apps whose memory demand increased from those where it didn't. Besides upgrading Java from 8 to 17, one alternative that one of our providers suggested is to move our VMs from D4s_v3 to D4s_v5, and we are studying the impact of this change.

Regards,

smartaquarius10 commented 1 year ago

@xuanra, I think in that case B2s nodes are completely out of the picture for this upgrade. The latest AKS version they can support is 1.24.x.

ganga1980 commented 1 year ago

> @xuanra, my major pain point is these 2 pods (out of the 9):
>
>   • ama-logs
>   • ama-logs-rs
>
> They always take more than 400 Mi of memory; it's very difficult to accommodate them on B2s nodes.
>
> My other pain point is these 16 pods (8 of each):
>
>   • csi-azuredisk-node
>   • csi-azurefile-node
>
> They take 910 Mi of memory. I even raised a support ticket, but customer support was unable to figure out whether we are using them or not, and could not advise whether or why we should keep them.
>
> Still looking for a better solution to handle the non-prod environment...

Hi @smartaquarius10, thanks for the feedback. We have work planned to reduce the ama-logs agent memory footprint, and we will share the exact timelines and additional details of the improvements in early March. cc: @pfrcks

smartaquarius10 commented 1 year ago

@ganga1980 @pfrcks

Thank you so much, Ganga. We are heavily impacted by this. Up to AKS 1.24.x we were running 3 environments within our cluster, but after upgrading to 1.25.x we are unable to manage even 1 environment.

Each environment has 11 pods.

I would be grateful for your support on this. I have already disabled the CSI pods, as we are not using any storage. For now, should we disable the ama monitoring pods as well?

If yes, then once your team resolves these issues, should we upgrade our AKS again to some specific version, or will Microsoft roll out the fix from the backend to every version of the AKS infrastructure?

Thank you

Kind Regards, Tanul

smartaquarius10 commented 1 year ago

Hello @ganga1980 @pfrcks ,

Hope you are doing well. By any chance, is it possible to speed up the process a little? Two of our environments (22 microservices in total) are down because of this.

Appreciate your help and support in this matter. Thank you. Have a great day.

Hello @xuanra @cedricfortin @lsavini-orienteed, did you find any workaround for this? Thanks :)

Kind Regards, Tanul

gonpinho commented 1 year ago

Hi @smartaquarius10, we updated the k8s version of AKS to 1.25.5 this week and started suffering from the same issue.

In our case, we identified a problem with the JRE version when dealing with cgroups v2. Here I share my findings:

Kubernetes cgroup v2 support reached GA in 1.25, and with this change AKS moved the node OS from Ubuntu 18.04 to Ubuntu 22.04, which uses cgroup v2 by default.

The problem with our containerized apps was related to a bug in JRE 11.0.14: it didn't have support for cgroup v2 container awareness, which means the JVM was not able to respect the memory quotas defined in the deployment descriptor.

Oracle and OpenJDK addressed this by supporting cgroup v2 natively in JRE 17 and backporting the fix to JRE 15 and JRE 11.0.16+.

I've updated the base image to use a fixed JRE version (11.0.18) and the memory exhaustion was solved.

Regarding the AMA pods, I've compared the pods running on k8s 1.25.x with the pods running on 1.24.x and they look stable to me; the memory footprint is essentially the same.

Hope this helps!
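
If you want to check whether your own JRE is container-aware under cgroup v2, a minimal sketch (assuming a deployment named my-java-app with a memory limit set and a JRE 11+ image; the name is a placeholder) is to print the OS metrics the JVM sees. A container-aware JRE should report the pod's memory limit rather than the node's total RAM:

kubectl exec deploy/my-java-app -- java -XshowSettings:system -version   # look for "Memory Limit" under "Operating System Metrics"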

smartaquarius10 commented 1 year ago

@gonpinho, thanks a lot for sharing the details. But the problem is that our containerized apps are not taking extra memory; they still occupy the same amount as they did on 1.24.x.

What I realized is that when I create fresh 1.24.x and 1.25.x clusters, the default memory occupancy is approximately 30% higher on 1.25.x.

One of my environments, consisting of 11 pods, takes only 1 GB of memory. With AKS 1.24.x I was running 3 environments in total. The moment I shifted to 1.25.x I had to disable 2 environments, along with the Microsoft CSI add-ons, just to accommodate those 11 custom pods, because the node memory consumption is already high.

smartaquarius10 commented 1 year ago

@gonpinho, if I could downgrade the node OS back to Ubuntu 18.04, that would be my first preference. I know the Ubuntu upgrade is what is killing these machines; I have no idea how to handle this.

pintmil commented 1 year ago

Hi, we are facing the same problem after upgrading our dev AKS cluster from 1.23.12 to 1.25.5. Our company develops C/C++ and C# services, so we don't suffer from the JRE cgroup v2 issues. We see that memory usage increases over time, even though nothing but the kube-system pods are running on the cluster. The symptom is that kubectl top node shows much more memory consumption than free does on the host OS (Ubuntu 22.04). If we force the host OS to drop cached memory with sudo sh -c 'echo 1 > /proc/sys/vm/drop_caches', the used memory doesn't change, but some of the buff/cache memory moves to free, and after that kubectl top node shows a memory usage drop on that node. We came to the conclusion that k8s counts buff/cache memory as used memory, which is misleading, because Linux will use free memory to buffer IO and other things, and that is completely normal operation.

kubectl top node before the cache drop: (screenshot)

free before / after the cache drop: (screenshot)

kubectl top node after the cache drop: (screenshot)
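
A rough way to reproduce this observation yourself (a sketch; the node name is a placeholder, and a shell on the node can be reached e.g. via kubectl debug node/<node-name> -it --image=ubuntu):

kubectl top node <node-name>                      # kubelet-reported usage before the drop
free -m                                           # on the node: note the buff/cache column
sudo sh -c 'echo 1 > /proc/sys/vm/drop_caches'    # on the node: drop the page cache
free -m                                           # "used" barely changes, buff/cache shrinks
kubectl top node <node-name>                      # reported usage drops after the cache drop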

shiva-appani-hash commented 1 year ago

Team, we are seeing the same behaviour after upgrading the cluster from 1.23.12 to 1.25.5. All the microservices running in the cluster are .NET 3.1. After raising a support request, we learned that the cgroup version has been changed to v2. Does anyone have a similar scenario? How do we identify whether cgroup v1 is being used by .NET 3.1, and can it be the cause of the high memory consumption?
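
One quick way to see which cgroup version a running container is actually on (a sketch; the deployment name is a placeholder and the image must contain the stat utility):

kubectl exec deploy/my-dotnet-app -- stat -fc %T /sys/fs/cgroup/   # "cgroup2fs" means cgroup v2, "tmpfs" means cgroup v1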

smartaquarius10 commented 1 year ago

Hello @ganga1980, any update on this please? Thank you.

ganga1980 commented 1 year ago

> Hello @ganga1980, any update on this please? Thank you.

@smartaquarius10, we are working on rolling out our March agent release, which will bring down the memory usage of the ama-logs daemonset (Linux) by 80 to 100 MB. I don't have your cluster name or cluster resource ID to investigate, and we can't repro the issue you have reported. Please create a support ticket with the clusterResourceId details so that we can investigate. As a workaround, you can try applying the default configmap:

kubectl apply -f https://raw.githubusercontent.com/microsoft/Docker-Provider/ci_prod/kubernetes/container-azm-ms-agentconfig.yaml

smartaquarius10 commented 1 year ago

@ganga1980, thank you for the reply. Just a quick question: after raising the support ticket, should I email you the ticket details at your Microsoft address? Otherwise it will be assigned to L1 support, which will take a long time to reach a resolution.

Or, if you allow, I can send you my cluster details on MS Teams.

Whichever way you prefer 😃

Currently, the ama pods are taking approx. 326 Mi of memory per node.

smartaquarius10 commented 1 year ago

@ganga1980, we already have this configmap applied.

andyzhangx commented 1 year ago

@ganga1980 Regarding the CSI driver resource usage: if you don't need the CSI drivers, you can disable them by following https://learn.microsoft.com/en-us/azure/aks/csi-storage-drivers#disable-csi-storage-drivers-on-a-new-or-existing-cluster
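
For reference, a sketch of what that looks like with the Azure CLI, based on the linked doc (the cluster and resource group names are placeholders; check the doc for the flags supported by your CLI version):

az aks update --name myAKSCluster --resource-group myResourceGroup \
  --disable-disk-driver --disable-file-driver --disable-snapshot-controller   # turns off the disk/file CSI drivers and the snapshot controller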

Marchelune commented 1 year ago

Hi! It seems we are facing the same issue on 1.25.5. We upgraded a few weeks ago (24.02) and the memory usage (container working set memory) jumped from the moment of the upgrade, according to the metrics tab:

(screenshot: container working set memory before and after the upgrade)

We are using Standard_B2s VMs, as this is an internal development cluster; the CSI drivers are not enabled. Has the issue been identified, or is it still under investigation?

codigoespagueti commented 1 year ago

Same issue here after upgrading to 1.25.5. We are using FS2_v2 and were not able to get the working set memory below 100%, no matter how many nodes we added to the cluster.

Very disappointing that all the memory on the node is used and reserved by Azure pods.

We had to disable Azure Insights in the cluster.

(screenshot)

ghost commented 1 year ago

@vishiy, @saaror would you be able to assist?

Issue Details
Author: smartaquarius10
Assignees: -
Labels: `bug`, `azure/oms`, `addon/container-insights`
Milestone: -
smartaquarius10 commented 1 year ago

@codigoespagueti @Marchelune Yes, we are also planning to disable Azure Insights (the ama agent pods). However, we are first taking a few steps to bring at least one more environment online; not having at least 2 environments was seriously hurting my team's productivity. For now, 2 of our 3 environments are working.

Now we are waiting until the end of March, as @ganga1980's team is working on the ama agent pods. If that works, great; otherwise we will disable the monitoring pods as well.

Kind Regards, Tanul

JonasJes commented 1 year ago

Same problem here. This is a single pod before and after the update, with the same codebase: (screenshot)

unluckypixie commented 1 year ago

This might help some of you: Kubernetes 1.25 included an update to use the cgroup v2 API (cgroups are basically how Kubernetes passes resource settings to containers).

When this happened on docker-desktop for me, the memory limits on containers simply stopped having any effect - if you asked the container about its memory, it would basically report the amount of system memory on the host.

My solution was to re-enable the deprecated cgroup v1 API and it all magically worked again...

As long as you are using a new enough Linux kernel, I believe cgroup v2 should work, but it didn't work for me and I have yet to work out exactly why. I strongly suspect all these issues relate to the cgroup change - it DOESN'T only affect Java, as some people seem to believe; it's a Linux kernel thing.

Here is a link about the change: https://kubernetes.io/blog/2022/08/31/cgroupv2-ga-1-25/
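
A quick way to see which memory limit a container is actually being given under each cgroup version (a sketch; the pod name is a placeholder). A value of "max" or an absurdly large number means no effective limit reached the container:

kubectl exec <pod> -- cat /sys/fs/cgroup/memory.max                      # cgroup v2 path (Ubuntu 22.04 nodes)
kubectl exec <pod> -- cat /sys/fs/cgroup/memory/memory.limit_in_bytes    # cgroup v1 path (Ubuntu 18.04 nodes)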

smartaquarius10 commented 1 year ago

@unluckypixie, thanks for sharing. How do we enable that in AKS? Could you please share the details? Thank you.

NattyPradeep commented 1 year ago

Hi team, we are also seeing high memory consumption after the AKS upgrade. Do we have any resolution yet?

smartaquarius10 commented 1 year ago

@unluckypixie, could you please share the process for re-enabling cgroups v1?

smartaquarius10 commented 1 year ago

Hey @ganga1980, Hope you are doing well.

Did you get any update on the ama pods memory usage issue? Can we expect the memory footprint fix by the end of March?

Thank you

Kind Regards, Tanul

smartaquarius10 commented 1 year ago

Can anyone confirm how much memory the ama-logs pods are taking on your AKS nodes? In my case, it's 2911 Mi across 8 nodes, even after excluding the logs of the ingress-controller namespace via the ama-logs configmap:

kubectl top pods -A | grep ama- | awk '{ print $4 }' | sed 's/Mi//g' | awk '{ sum+=$1 } END { print sum }'

PeterThomasAwen commented 1 year ago

@unluckypixie If possible, can you please describe the process to re-enable the cgroups v1 API (tmpfs)?

maxkt commented 1 year ago

@unluckypixie Please share how you managed to re-enable cgroups v1.

martindisch commented 1 year ago

For those who are here because your Node.js application suffers from not being able to set its heap limit correctly under cgroup v2: you can work around it with --max-old-space-size.

It will take a while before cgroup v2 support makes it into LTS, because it depends on a libuv release. For details see https://github.com/nodejs/node/issues/47259.
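
One way to apply that workaround without rebuilding the image is to inject it through the environment, a sketch under the assumption of a deployment named my-node-app (name and heap size are placeholders; pick a value comfortably below the pod's memory limit):

kubectl set env deployment/my-node-app NODE_OPTIONS="--max-old-space-size=384"   # caps the V8 old-space heap at ~384 MB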

unluckypixie commented 1 year ago

@smartaquarius10 @PeterThomasAwen @maxkt

I might not have been clear: I've only managed to fix it in docker-desktop so far. If that is what you were asking about, you simply edit settings.json and change:

"deprecatedCgroupv1": true

We are assessing an upgrade of our Kubernetes to v1.25.5 and will let you know if we figure out the fix!

smartaquarius10 commented 1 year ago

@unluckypixie Oh, OK. Thank you so much for sharing. Yes, that could be possible in self-hosted Kubernetes; not sure though.

Even if it works there, I don't think it's possible in AKS, whose master node is managed by Microsoft. I'm just guessing, but it would be awesome if it could be done with AKS.

EvertonSA commented 1 year ago

I'm also highly impacted by this issue, and I wish I had seen this thread before.

smartaquarius10 commented 1 year ago

@ganga1980, any updates on this? So many people are impacted by this Ubuntu upgrade. I would be grateful if you could expedite the ama-logs work; at least it would compensate to a certain extent. Thank you.

saparicio commented 1 year ago

For Java, I've mitigated this by upgrading images from JRE 11.0.13 to JRE 11.0.18. I'd say it is definitely related to https://bugs.openjdk.org/browse/JDK-8230305, which was backported to 11.0.16.

With JRE 11.0.13 (after 10 minutes running): (screenshot)

With JRE 11.0.18 (after 10 minutes running): (screenshot)

Only the workload with the arrow was upgraded.

smartaquarius10 commented 1 year ago

@ganga1980 @pfrcks, any updates on this ama pod memory footprint issue, please? Thank you.

jevinjixu commented 1 year ago

Just FYI, as a remedy we added an extra JVM parameter, -Xmx, when starting the application (updated the Dockerfile and rebuilt the image). This has solved our issue for the moment.
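
If rebuilding images is inconvenient, a similar effect can be sketched by injecting the flag through JAVA_TOOL_OPTIONS, which HotSpot JVMs pick up automatically (the deployment name and heap size below are placeholders):

kubectl set env deployment/my-java-app JAVA_TOOL_OPTIONS="-Xmx512m"   # equivalent to passing -Xmx512m on the java command line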

brunoborges commented 1 year ago

I'm curious why some users are deploying Java to Kubernetes without setting a heap size.

Would anyone mind sharing their thinking?

saparicio commented 1 year ago

> I'm curious why some users are deploying Java to Kubernetes without setting a heap size.
>
> Would anyone mind sharing their thinking?

We are setting both the Java heap and the resource limits in k8s. But after the upgrade to 1.25, the workloads started to use 2 or 3 times more memory. In our case, upgrading the JRE to version 11.0.18 (probably 11.0.16 would work as well) made our workloads use the same memory they used before the k8s upgrade.

brunoborges commented 1 year ago

> We are setting both the Java heap and the resource limits in k8s. But after the upgrade to 1.25, the workloads started to use 2 or 3 times more memory.

Can you share how you were setting the heap size before you upgraded to 11.0.18?

saparicio commented 1 year ago

> We are setting both the Java heap and the resource limits in k8s. But after the upgrade to 1.25, the workloads started to use 2 or 3 times more memory.
>
> Can you share how you were setting the heap size before you upgraded to 11.0.18?

In the JAVA_OPTS environment variable, with "-Xmx".

brunoborges commented 1 year ago

So, what you are saying is that before 1.25, you were already setting heap size with -Xmx, and then you saw the memory consumption increase?

saparicio commented 1 year ago

> So, what you are saying is that before 1.25, you were already setting heap size with -Xmx, and then you saw the memory consumption increase?

We always had the -Xmx (before and with 1.25). When we upgraded k8s, a lot of workloads started to fail and restart because they wanted to use much more memory (2 or 3 times more). I saw the high memory consumption because we had to drastically increase the -Xmx and the resource limits in k8s to make the workloads work.

After modifying their images to upgrade the JRE to 11.0.18, all of them are back to normal memory usage and working fine again (with the same -Xmx and resource limits we had before the upgrade to 1.25).

smartaquarius10 commented 1 year ago

@ganga1980 Any updates on the ama-logs pod memory footprint?

javiermarasco commented 1 year ago

@shiva-appani-gep Maybe a bit late, but it seems .NET 3.1 uses cgroup v1, while any .NET > 5 uses cgroup v2 (.NET 6.0 being the current LTS).

I did a test with an app that was using .NET 3.1, changed it to .NET 6.0, and you can see the results.

The screenshot shows the memory consumption of the app on .NET 3.1 (above) and of the same app on .NET 6.0 (below): (screenshot)