karmada-io / karmada

Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration
https://karmada.io
Apache License 2.0

FederatedResourceQuota should be failover friendly #5179

Open mszacillo opened 1 month ago

mszacillo commented 1 month ago

What would you like to be added: A way for the FederatedResourceQuota to monitor existing ResourceQuotas (without managing those ResourceQuotas) and impose resource limits on the user based on the sum of all currently used quota.

Why is this needed: The existing FederatedResourceQuota mirrors the behavior of a typical Kubernetes ResourceQuota by imposing total resource limits in a multi-cluster setup. It does this by creating statically distributed ResourceQuotas across the specified member clusters, whose limits sum to the limits defined in the FederatedResourceQuota. This works if the user does not need to worry about DR events, which require back-up resources dedicated to failover in the event of a disaster.
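For reference, the static-assignment model described above looks roughly like the following (a sketch only; the namespace, cluster names, and values are illustrative):

    # Illustrative FederatedResourceQuota: the overall limit is split statically
    # across the member clusters, so each cluster only gets a fraction of it.
    apiVersion: policy.karmada.io/v1alpha1
    kind: FederatedResourceQuota
    metadata:
      name: tenant-quota
      namespace: tenant-a
    spec:
      overall:
        cpu: "40"
        memory: 50Gi
      staticAssignments:
        - clusterName: member1
          hard:
            cpu: "20"
            memory: 25Gi
        - clusterName: member2
          hard:
            cpu: "20"
            memory: 25Gi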

In our case, since we are using Karmada for its failover feature, we would like clusters to have additional available quota for each namespace so that in the event of a DR event, all applications can be rescheduled:

[Screenshot: diagram showing the FederatedResourceQuota's total limits (40 CPU / 50GB memory) mirrored on each member cluster]

In the diagram above, we can see that the total limits of the FederatedResourceQuota are 40/40 CPU and 50GB/50GB memory. Individual clusters will have the same limits, so that in the case of a DR event, all workloads can be scheduled on one cluster.

[Screenshot: diagram showing workloads from Cluster A migrating to Cluster B during failover]

Above, we see that during a failover all workloads from Cluster A will be migrated to Cluster B, where there will be enough available resources to schedule all required pods. With the existing statically defined ResourceQuotas, we cannot support this type of failover.

We've created this ticket to start a discussion on how best to address this limitation, and whether this use case is valid.

RainbowMango commented 1 month ago

First of all, thanks for bringing this up. I'm glad to enhance it with real-world use cases.

Generally, the FederatedResourceQuota is designed to enforce quota restrictions on the Karmada control plane. But currently, it just provides a capability for administrators to manage ResourceQuotas across clusters via StaticAssignments. I guess, due to a lack of feedback, it mostly just saves the effort of propagating Kubernetes ResourceQuotas with a PropagationPolicy. The general ideas and possible approaches are listed in the comments.

RainbowMango commented 1 month ago

I guess your idea is to let the user declare a total quota by FederatedResourceQuota for a specific namespace, and the quota can be shared across clusters. In your first diagram, the total quota is 40 CPU, and member1 and member2 used 20 each. At this point, no more applications (that require CPU) can be scheduled to either cluster in that namespace, as the total quota is used up. Since Karmada handles the failover process, it knows that after the failover the quota will be released from member1, so it can still schedule applications to member2 temporarily. Please correct me if I'm wrong.

mszacillo commented 1 month ago

Thanks for taking a look!

> In your first diagram, the total quota is 40 CPU, and member1 and member2 used 20 each. At this point, no more applications (that require CPU) can be scheduled to either cluster in that namespace, as the total quota is used up. Since Karmada handles the failover process, it knows that after the failover the quota will be released from member1, so it can still schedule applications to member2 temporarily. Please correct me if I'm wrong.

Pretty much, yes. In our use case, we have a controller that syncs a tenant's ResourceQuota on each member cluster to be equal to the tenant's limits (let's say 40 CPU and 50GB). Each cluster will have an identical static ResourceQuota (so that one cluster can accommodate all of the tenant's workloads if necessary in the case of DR).
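Concretely, each member cluster ends up with a ResourceQuota along these lines (a sketch; the name, namespace, and values are illustrative):

    # Illustrative per-cluster ResourceQuota carrying the tenant's full limits,
    # so any single cluster can absorb every workload during a DR event.
    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: tenant-quota
      namespace: tenant-a
    spec:
      hard:
        limits.cpu: "40"
        limits.memory: 50Gi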

But we want the FederatedResourceQuota to monitor the existing quota usage across all clusters and set a limit on the amount of resources that can be applied to the Karmada control plane. Of the comments you linked, these two were most relevant:

    //   - The rule about how to prevent workload from scheduling to cluster without quota.
    //   - The rule about how to prevent workload from creating to Karmada control plane.

Perhaps this would require some sort of admission webhook that would prevent resources from being applied if their total resource usage would go above the limits defined in the FederatedResourceQuota. This would mirror the way ResourceQuotas are enforced in Kubernetes. The more difficult part would be determining when to replenish the quota (perhaps when a Work is deleted?).
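As a rough sketch of that idea (this is not an existing Karmada API; the function and parameter names are hypothetical, and the per-cluster usage would come from whatever component watches the member clusters' ResourceQuota status), the core check inside such a webhook could look like:

    package quota

    import (
        corev1 "k8s.io/api/core/v1"
    )

    // exceedsOverall reports whether admitting a new workload with the given
    // resource request would push the total usage observed across all member
    // clusters above the overall limits declared in the FederatedResourceQuota.
    func exceedsOverall(overall corev1.ResourceList, usedPerCluster []corev1.ResourceList, request corev1.ResourceList) bool {
        total := corev1.ResourceList{}
        for _, used := range usedPerCluster {
            addInto(total, used)
        }
        addInto(total, request)

        for name, limit := range overall {
            if sum, ok := total[name]; ok && sum.Cmp(limit) > 0 {
                return true
            }
        }
        return false
    }

    // addInto adds every quantity in src to the matching entry in dst, in place.
    func addInto(dst, src corev1.ResourceList) {
        for name, qty := range src {
            cur := dst[name]
            cur.Add(qty)
            dst[name] = cur
        }
    }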

RainbowMango commented 1 month ago

> Perhaps this would require some sort of admission webhook that would prevent resources from being applied if their total resource usage would go above the limits defined in the FederatedResourceQuota.

Yes, exactly. In addition, the scheduler should also take the resource quota into account and prevent workloads from being scheduled to clusters that would exceed the limit.
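A minimal sketch of that scheduler-side check, under the same caveats (hypothetical names, not an existing Karmada plugin; it assumes some component tracks the remaining namespace quota per candidate cluster):

    package quota

    import (
        corev1 "k8s.io/api/core/v1"
    )

    // filterClustersByQuota drops candidate clusters whose remaining namespace
    // quota cannot fit the workload's resource request.
    func filterClustersByQuota(candidates []string, remaining map[string]corev1.ResourceList, request corev1.ResourceList) []string {
        feasible := make([]string, 0, len(candidates))
        for _, cluster := range candidates {
            if fits(request, remaining[cluster]) {
                feasible = append(feasible, cluster)
            }
        }
        return feasible
    }

    // fits reports whether every requested quantity is covered by the remaining quota.
    func fits(request, remaining corev1.ResourceList) bool {
        for name, qty := range request {
            left, ok := remaining[name]
            if !ok || qty.Cmp(left) > 0 {
                return false
            }
        }
        return true
    }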

By the way, I might be slow to respond on this topic, as I need to pay more attention to #5116 and #5085 and the others we planned for the current release. But I'm interested and glad to have this discussion, and I hope to keep this open and welcome other people to join.

mszacillo commented 1 month ago

> By the way, I might be slow to respond on this topic, as I need to pay more attention to https://github.com/karmada-io/karmada/pull/5116 and https://github.com/karmada-io/karmada/pull/5085 and the others we planned for the current release. But I'm interested and glad to have this discussion, and I hope to keep this open and welcome other people to join.

That's alright! Apologies for all the issues that have been filed recently - one at a time. :)