We were previously on version 1.28.5 and upgraded to 1.30. Since the upgrade, nodes have repeatedly flipped to Unknown/NotReady status; this never happened before. We also checked the official AKS documentation and found that memory reservations were adjusted starting with AKS 1.29. The problem has only appeared on clusters upgraded from 1.28 to 1.30; clusters that were not upgraded behave normally. The reservation information we observed:
On 1.30:
```
Allocatable:
  cpu:                15740m
  ephemeral-storage:  479347519924
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             59329368Ki
  pods:               60
```
On 1.28:
```
Allocatable:
  cpu:                15740m
  ephemeral-storage:  119703055367
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             64460556Ki
  pods:               60
```
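For reference, the gap between the two allocatable figures works out to roughly 4.9 GiB, about 8% of the 1.28 allocatable value; a quick calculation:

```python
# Allocatable memory reported by the two node versions (values from above).
alloc_128_ki = 64_460_556  # Ki on 1.28
alloc_130_ki = 59_329_368  # Ki on 1.30

delta_ki = alloc_128_ki - alloc_130_ki        # 5131188 Ki
delta_gib = delta_ki / (1024 * 1024)          # Ki -> GiB, ~4.89 GiB
pct = 100 * delta_ki / alloc_128_ki           # ~8.0% of the 1.28 allocatable

print(f"extra reservation on 1.30: {delta_ki} Ki "
      f"(~{delta_gib:.2f} GiB, {pct:.1f}% of 1.28 allocatable)")
```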
We suspect there is a problem with the current AKS reservation algorithm: it leaves insufficient memory reserved for the node itself, the node status becomes Unknown, and every pod on the node is forcibly evicted. This logic seems wrong. On 1.28 the node itself was never affected; even when memory usage exceeded the threshold, only the pods consuming the most memory were evicted. Now the entire node becomes unusable, which is a serious problem.
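For context, the 1.29 change described in the AKS resource-reservation docs, as I read them (the exact brackets and constants below are my understanding, not verified against the node above), moved kube-reserved memory from a regressive percentage of total memory to the lesser of 20 MB per max pod + 50 MB or 25% of total memory, and lowered the eviction threshold from 750 Mi to 100 Mi. A rough sketch of both formulas:

```python
def reserved_pre_129(total_gb: float) -> float:
    """Pre-1.29 regressive kube-reserved memory (GB), per my reading of the AKS docs."""
    # (bracket size in GB, reservation rate)
    brackets = [(4, 0.25), (4, 0.20), (8, 0.10), (112, 0.06), (float("inf"), 0.02)]
    reserved, remaining = 0.0, total_gb
    for size, rate in brackets:
        chunk = min(remaining, size)
        reserved += chunk * rate
        remaining -= chunk
        if remaining <= 0:
            break
    return reserved

def reserved_129_plus(total_gb: float, max_pods: int) -> float:
    """1.29+ kube-reserved memory (GB): min(20 MB * max pods + 50 MB, 25% of total)."""
    return min((20 * max_pods + 50) / 1000, 0.25 * total_gb)

# For a hypothetical 64 GB node with maxPods=60:
old = reserved_pre_129(64)        # ~5.48 GB (1.0 + 0.8 + 0.8 + 2.88)
new = reserved_129_plus(64, 60)   # 1.25 GB
```

Note that by these formulas 1.29+ should reserve *less* memory than before, not more, so the ~5 GiB drop in allocatable observed above may come from something else entirely (different VM SKU or OS image between the two nodes, for instance); worth confirming per node with `kubectl describe node`.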