Open lukasfrank opened 1 month ago
+1 for the proposal. This addresses a lot of current pain points in the stack:
Machine resource ended up on a given MachinePool
EPC Memory
Going into more detail: the currently proposed approach is a decentralized solution, whereas the centralized one (in the alternatives section) follows the network stack solution. A clustered solution has some disadvantages, such as the lack of guaranteed availability. For networking this is acceptable, because if a critical infrastructure component is down, networking is affected anyway. Scheduling should not have such a big impact on compute; the decentralized approach would isolate the impact at the pool level.
Another aspect of a centralized solution is that you have to deal with a lot of "boilerplate" challenges such as "eventual consistency", which leaves room for race conditions. The Reservations solution solves this pretty elegantly: you can introduce a time period during which the scheduler waits before making its decision, and it then only takes into consideration the pools it finds in the status slice. Therefore, my vote goes to the decentralized approach here.
On the topic of the "scheduling decision": the reservation system is meant to determine who can provide the requested resources. Another controller can then use this to decide which of those pools to actually use for the Machine resource. This enables behavior similar to Node <-> Pod scheduling in vanilla Kubernetes. Another point to consider is that you can enable "system" reservations to accommodate resources that are exclusively reserved for system applications.
Some aspects that are unclear to me:
- who decides the rating on a given status entry, is this similar to a priority? How does this influence the scheduling decision?
- how will arbitrary resources be announced, such as the already mentioned EPC Memory? Or e.g. dedicated graphics cards?
- who decides the rating on a given status entry, is this similar to a priority? How does this influence the scheduling decision?
Only the pool provider can calculate the rating (since it is the component that checks whether the reservation can be fulfilled), and it is a metric of "how well the Reservation fits onto the related pool". It should be understood as a hint for the scheduler when making its decision.
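As a rough illustration of "rating as a hint", the sketch below picks the pool with the best rating among the status entries. The `PoolRating` type, the integer scale and the "higher is better" convention are assumptions made for this example; the proposal may define the rating differently.

```go
package main

import "fmt"

// PoolRating is a hypothetical status entry: the pool provider reports how
// well the reservation fits onto its pool. Higher is assumed to be better.
type PoolRating struct {
	Pool   string
	Rating int
}

// pickPool returns the pool with the highest rating. Ties are broken by pool
// name so the result is deterministic. ok is false if there are no entries.
func pickPool(entries []PoolRating) (pool string, ok bool) {
	if len(entries) == 0 {
		return "", false
	}
	best := entries[0]
	for _, e := range entries[1:] {
		if e.Rating > best.Rating || (e.Rating == best.Rating && e.Pool < best.Pool) {
			best = e
		}
	}
	return best.Pool, true
}

func main() {
	entries := []PoolRating{
		{Pool: "pool-a", Rating: 10},
		{Pool: "pool-b", Rating: 70},
		{Pool: "pool-c", Rating: 70},
	}
	pool, ok := pickPool(entries)
	fmt.Println(pool, ok)
}
```

Since the rating is only a hint, a real scheduler would be free to combine it with other criteria instead of taking the maximum directly.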
- how will arbitrary resources be announced, such as the already mentioned EPC Memory? Or e.g. dedicated graphics cards?
In the distributed approach: There is no need anymore for announcing resources. The resource "owner" (the pool provider) is in charge of taking or rejecting the reservation and needs to keep track of all the resources. If arbitrary resources aren't available on a specific host, the reservation will be declined.
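A minimal sketch of that bookkeeping, assuming the pool provider tracks free quantities per resource name: the `ResourceList` and `PoolProvider` types and the resource names (`cpu`, `epc-memory`, `gpu`) are hypothetical and only serve to show the accept/decline logic.

```go
package main

import "fmt"

// ResourceList maps an arbitrary resource name to a quantity (hypothetical type).
type ResourceList map[string]int64

// PoolProvider is a hypothetical resource owner that tracks what is still free.
type PoolProvider struct {
	free ResourceList
}

// NewPoolProvider copies the capacity so later changes by the caller do not
// affect the provider's own bookkeeping.
func NewPoolProvider(capacity ResourceList) *PoolProvider {
	free := make(ResourceList, len(capacity))
	for name, qty := range capacity {
		free[name] = qty
	}
	return &PoolProvider{free: free}
}

// Reserve accepts the request only if every requested resource is available
// in sufficient quantity. Unknown resources count as zero, so a request for
// them is declined. On decline, the bookkeeping is left unchanged.
func (p *PoolProvider) Reserve(req ResourceList) bool {
	for name, qty := range req {
		if p.free[name] < qty {
			return false
		}
	}
	for name, qty := range req {
		p.free[name] -= qty
	}
	return true
}

func main() {
	p := NewPoolProvider(ResourceList{"cpu": 8, "epc-memory": 4})

	fmt.Println(p.Reserve(ResourceList{"cpu": 4, "epc-memory": 4})) // fits
	fmt.Println(p.Reserve(ResourceList{"cpu": 2, "gpu": 1}))        // no gpu on this pool
	fmt.Println(p.Reserve(ResourceList{"cpu": 4}))                  // remaining cpu
}
```

Because nothing is announced up front, adding a new kind of resource only affects the provider that owns it; the "system" reservations mentioned earlier could be modeled in the same way by subtracting them from the capacity before accepting any requests.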
@balpert89 Does that make sense to you?
The rating part is clear to me now, thanks for addressing it.
In the distributed approach: There is no need anymore for announcing resources. The resource "owner" (the pool provider) is in charge of taking or rejecting the reservation and needs to keep track of all the resources. If arbitrary resources aren't available on a specific host, the reservation will be declined.
Does that mean we will deprecate the allocatable / available (https://github.com/ironcore-dev/ironcore/blob/main/api/compute/v1alpha1/machinepool_types.go#L30-L33) fields as they are not required anymore?
Correct, there would be no need for these fields anymore. If they are used to aggregate the resources of the entire infrastructure, we can offer metrics and aggregate them the Kubernetes way.
Proposed Changes