kubewarden / kubewarden-controller

Manage admission policies in your Kubernetes cluster with ease
https://kubewarden.io
Apache License 2.0

Policy to enforce pod resource limits #590

Closed jvanz closed 9 months ago

jvanz commented 10 months ago

We have a user who is requesting a policy to validate pod resource limits. We cannot use the Gatekeeper policy for that because it uses regex built-ins, which are not implemented in the Kubewarden stack yet. Therefore, we need to write a policy that lets the user perform these validations. There are two options:

  1. Change the Gatekeeper policy removing the code that uses regex and build it as a Kubewarden policy
  2. Write a policy from scratch to validate the resource limit

flavio commented 10 months ago

We could also implement the missing built-ins inside of burrego.

However, there's also this built-in Kubernetes admission controller that could achieve the same result: https://kubernetes.io/docs/concepts/policy/limit-range/

The user could rely on that while we work on one of the solutions from above

elenagarciam commented 10 months ago

The limit range is too general for us: if we configure the same reservations for all namespaces, we end up with a large amount of reserved resources that are probably not used. The point is to enforce the minimum possible request on applications that do not have one configured, in order to decrease the amount of requested but unused resources.

jvanz commented 10 months ago

I'm working on the policy. It will first check whether the container has any resource limits defined. After that, the policy verifies that the resource limit specified in the container is within the permissible range set by the maximum limit specified in the policy settings. If the container's resource limit exceeds the predefined maximum, the policy intervenes by allowing the user to mutate the container's settings, adjusting the limit to comply with the maximum value established in the policy configuration.

elenagarciam commented 10 months ago

Great! This is exactly what we need. Thanks!

jvanz commented 10 months ago

Let me share a more formal policy description:

Description

This policy is designed to enforce constraints on the CPU and memory resource limits of Kubernetes containers. It follows a two-step verification process: initially checking whether the container has defined resource limits, and subsequently ensuring that these limits fall within the permissible range set by the maximum limits configured in the policy settings.

Configuration

Users can configure the policy using the following parameters in YAML format:

# optional - the maximum permissible CPU limit
maxCPULimit: "2"
# optional - the maximum permissible memory limit
maxMemoryLimit: "4G"

Both CPU and memory limits should be expressed using the quantity definitions of Kubernetes.

Constraints

The policy assumes that users have defined both CPU and memory resource limits for their containers. If the specified CPU or memory limit exceeds the maximum permissible limit configured in the policy, the policy will intervene to mutate the container's settings. It adjusts the limit to comply with the configured maximum values.
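To make the comparison concrete, here is a minimal Python sketch of how a limit expressed as a Kubernetes quantity could be compared against the configured maximum and clamped to it. The helper names `parse_quantity` and `clamp_limit` are hypothetical and are not taken from the policy's source; the parser only covers a subset of the quantity suffixes.

```python
from fractions import Fraction

# Multipliers for a subset of the Kubernetes quantity suffixes.
_SUFFIXES = {
    "m": Fraction(1, 1000),
    "k": Fraction(10**3), "M": Fraction(10**6), "G": Fraction(10**9),
    "T": Fraction(10**12), "P": Fraction(10**15), "E": Fraction(10**18),
    "Ki": Fraction(2**10), "Mi": Fraction(2**20), "Gi": Fraction(2**30),
    "Ti": Fraction(2**40), "Pi": Fraction(2**50), "Ei": Fraction(2**60),
}


def parse_quantity(text: str) -> Fraction:
    """Convert a quantity such as "500m", "2" or "4G" to an exact number."""
    text = text.strip()
    # Try two-character suffixes ("Gi") before one-character ones ("G").
    for length in (2, 1):
        suffix = text[-length:]
        if len(text) > length and suffix in _SUFFIXES:
            return Fraction(text[:-length]) * _SUFFIXES[suffix]
    return Fraction(text)


def clamp_limit(limit: str, max_limit: str) -> str:
    """Return the limit unchanged if it is within range, else the maximum."""
    if parse_quantity(limit) <= parse_quantity(max_limit):
        return limit
    return max_limit
```

With `maxMemoryLimit: "4G"`, a container asking for `"8G"` would be adjusted to `"4G"`, while a CPU limit of `"500m"` stays untouched under `maxCPULimit: "2"`. Note that `"4Gi"` is larger than `"4G"`, so it would be clamped as well.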

Possible Scenarios

This policy operates as follows:

| Scenario | Configuration | Outcome |
|---|---|---|
| Limits Within Range | maxCPULimit: "2", maxMemoryLimit: "4G" | Container settings accepted without modification if both CPU and memory limits are defined within permissible ranges. |
| Exceeds Permissible Max | maxCPULimit: "2", maxMemoryLimit: "4G" | Policy intervenes if either CPU or memory limit exceeds predefined maximums. Mutates container settings to comply. |
| Undefined Limit | maxCPULimit: "2", maxMemoryLimit: "4G" | Policy may take action if either CPU or memory limit is undefined (specific action not defined in provided description). |

jvanz commented 10 months ago

@elenagarciam do you want to perform these validations on init and ephemeral containers as well?

elenagarciam commented 10 months ago

Hi, actually we are more worried about enforcing requests than limits. The real problem is that if users don't configure requests, they are set to the limit values, which results in a huge amount of resources that are reserved but not used. For init and ephemeral containers we have not detected any issues to resolve.

flavio commented 9 months ago

Update: I thought more about this issue and had a chat with @fabriziosestito. This would be my updated proposal

Configuration

Users can configure the policy using the following parameters in YAML format:

# optional
memory:
  defaultRequest: "100M"
  defaultLimit: "500M"
  maxLimit: "4G"
# optional
cpu:
  defaultRequest: 100m
  defaultLimit: 200m
  maxLimit: 500m
# optional
ignoreImages: ["ghcr.io/foo/bar:1.23"]

Both CPU and memory limits should be expressed using the quantity definitions of Kubernetes.

The memory and cpu sections are optional, but once defined they must provide all their attributes.

The policy verifies the consistency of the values provided:

Behavior

The policy skips all the containers that are using an image that is part of the ignoreImages list. These containers are always considered valid and are never mutated.

When the CPU/Memory request is specified: no action or check is done against it. If the requested memory is higher than the limit the Pod will not be scheduled. This is the same approach taken by the LimitRange admission controller bundled with Kubernetes.

When the CPU/Memory request is not specified: the policy mutates the container definition to use the defaultRequest value. The policy does not check the consistency of the applied value.

When the CPU/Memory limit is specified: the request is accepted if the limit defined by the container is less than or equal to the maxLimit. Otherwise the request is rejected. In this way the end user becomes aware of the issue and can ask the Kubernetes administrator to add the container image to the ignoreImages list.

When the CPU/Memory limit is not specified: the container is mutated to use the defaultLimit.
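The rules above can be sketched in Python as follows. This is only an illustration of the described behavior, not the policy's actual code: the function name `review_container`, its return shape, and the plain-dictionary form of the settings (one mapping per resource) are assumptions, and the quantity parser covers only a few suffixes.

```python
import copy
from fractions import Fraction

# Multipliers for a few Kubernetes quantity suffixes (not exhaustive).
UNITS = {"m": Fraction(1, 1000), "k": 10**3, "M": 10**6, "G": 10**9,
         "T": 10**12, "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_quantity(text):
    """Convert a quantity such as "200m" or "4G" to an exact number."""
    for size in (2, 1):
        if len(text) > size and text[-size:] in UNITS:
            return Fraction(text[:-size]) * UNITS[text[-size:]]
    return Fraction(text)


def review_container(container, settings):
    """Return (accepted, container to use, message) for one container."""
    # Ignored images are always valid and never mutated.
    if container.get("image") in settings.get("ignoreImages", []):
        return True, container, "image is ignored"

    mutated = copy.deepcopy(container)
    resources = mutated.setdefault("resources", {})
    requests = resources.setdefault("requests", {})
    limits = resources.setdefault("limits", {})

    for name in ("cpu", "memory"):
        rule = settings.get(name)
        if rule is None:  # the cpu and memory sections are optional
            continue
        # A missing request gets the default; an existing one is not checked.
        if name not in requests:
            requests[name] = rule["defaultRequest"]
        # A missing limit gets the default; an existing one must not exceed maxLimit.
        if name not in limits:
            limits[name] = rule["defaultLimit"]
        elif parse_quantity(limits[name]) > parse_quantity(rule["maxLimit"]):
            message = f"{name} limit {limits[name]} exceeds maxLimit {rule['maxLimit']}"
            return False, container, message
    return True, mutated, "ok"
```

For example, a container with no `resources` section would come back with the default requests and limits filled in, while one with a CPU limit of `"1"` would be rejected when `maxLimit` is `500m`.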

kravciak commented 9 months ago

ignoreImages: ["ghcr.io/foo/bar:1.23"]

What do you think about making the tag optional? Otherwise the end user needs to request an ignoreImages change for each new tag...
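As a purely hypothetical illustration of this suggestion, an ignoreImages entry without a tag could be treated as matching every tag of that image. The helpers `split_image` and `is_ignored` below are made up for this sketch and say nothing about how the policy actually matches images; digest references (`name@sha256:...`) are not handled.

```python
def split_image(ref):
    """Split "registry:5000/repo/name:tag" into (name, tag or None)."""
    # Only a colon in the last path segment separates a tag;
    # a colon earlier in the reference belongs to the registry port.
    last_segment = ref.rsplit("/", 1)[-1]
    if ":" in last_segment:
        name, tag = ref.rsplit(":", 1)
        return name, tag
    return ref, None


def is_ignored(image, ignore_images):
    """Return True if the image matches an entry; untagged entries match any tag."""
    name, tag = split_image(image)
    for entry in ignore_images:
        entry_name, entry_tag = split_image(entry)
        if name == entry_name and (entry_tag is None or entry_tag == tag):
            return True
    return False
```

With this approach, `ghcr.io/foo/bar` in the list would cover `ghcr.io/foo/bar:1.24` too, while `ghcr.io/foo/bar:1.23` would keep matching only that exact tag.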

jvanz commented 9 months ago

@elenagarciam what do you think about the proposed changes in the policy behaviour here?

jvanz commented 9 months ago

I've updated the PR implementing the suggested changes in the previous comment from @flavio.

elenagarciam commented 9 months ago

Yes, that configuration can work. When will the policy be available?

jvanz commented 9 months ago

@elenagarciam The policy is released! https://artifacthub.io/packages/kubewarden/container-resources/container-resources

Let us know if anything is missing. If no changes are necessary, we'll bump it to version 0.1.0 (it's 0.1.0-rc1 now).

elenagarciam commented 9 months ago

Actually, I was testing it and I have one question: do I need to create another policy server, or can I use the default one?

jvanz commented 9 months ago

You can deploy the policy in any policy server.

jvanz commented 8 months ago

@elenagarciam We've just released another version of the policy with a quick fix for the Rancher UI and a fix for mutating policies. Let us know how it works for you. Thanks!