camsas / firmament

The Firmament cluster scheduling platform
Apache License 2.0

Implementing pod anti-affinity #64

Open shivramsrivastava opened 6 years ago

shivramsrivastava commented 6 years ago

We have implemented a new cost model, using the net cost model as a reference. The new model, named the "CPU mem cost model", considers CPU and memory requirements for scheduling instead of network bandwidth. We have also designed and implemented soft constraints on top of this CPU mem cost model.
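To make the idea concrete for readers of this thread, here is a minimal sketch of what a CPU/memory-based arc cost could look like. It is not the code under review: the `Resources` type, the `placement_cost` function, the `max_cost` scale, and the equal weighting of CPU and memory are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Resources:
    """Hypothetical resource vector; the field names are illustrative."""
    cpu_millicores: int
    memory_mb: int


def placement_cost(request: Resources, available: Resources,
                   capacity: Resources, max_cost: int = 1000) -> Optional[int]:
    """Return an integer arc cost for placing `request` on a machine.

    Assumption: the cost is the machine's average CPU/memory utilisation
    after the placement, scaled to [0, max_cost]. Emptier machines are
    therefore cheaper, which tends to spread pods out. Returns None when
    the request does not fit, i.e. no arc should be added.
    """
    if (request.cpu_millicores > available.cpu_millicores
            or request.memory_mb > available.memory_mb):
        return None
    cpu_used = capacity.cpu_millicores - available.cpu_millicores + request.cpu_millicores
    mem_used = capacity.memory_mb - available.memory_mb + request.memory_mb
    cpu_frac = cpu_used / capacity.cpu_millicores
    mem_frac = mem_used / capacity.memory_mb
    return round(max_cost * (cpu_frac + mem_frac) / 2)
```

A min-cost solver would then prefer the arc with the lower value, so under these assumptions an idle machine is chosen over a half-full one for the same request.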

Please refer to the design document and provide your feedback.

The design document describes the problem we ran into when implementing pod anti-affinity. @ms705, @ICGog: please provide your input on this design and suggest how we can efficiently implement pod anti-affinity for the CPU mem cost model.

The CPU mem cost model implementation changes are open for review at the GerritHub link below.

ms705 commented 6 years ago

I will try to take a look at this over the weekend; please ping me next week if you haven't heard back by then.

deepak-vij commented 6 years ago

That would be great, Malte, appreciate it. We tried extending the XOR flow network construct described in Ionel's thesis to solve Pod anti-affinity constraints in conjunction with our new multi-dimensional CPU/memory cost model. The CPU/memory cost model uses multiple ECs in order to distribute the incoming Pods evenly across filtered/relevant machines.
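For readers unfamiliar with the idea, the following is a rough sketch of how an XOR-style construct can express anti-affinity in a flow network. It is not taken from Firmament or from the thesis: the node layout (one shared aggregator with a capacity-1 arc to each machine), the `MinCostFlow` helper, the `schedule_anti_affinity` function, and the unscheduled cost of 100 are all assumptions chosen to keep the example small.

```python
from collections import defaultdict


class MinCostFlow:
    """Tiny successive-shortest-paths solver (Bellman-Ford), for illustration."""

    def __init__(self):
        self.graph = defaultdict(list)  # node -> indices into self.edges
        self.edges = []                 # each entry: [to, residual_cap, cost]

    def add_arc(self, u, v, cap, cost):
        """Add arc u->v plus its residual twin; return the forward index."""
        idx = len(self.edges)
        self.graph[u].append(idx)
        self.edges.append([v, cap, cost])
        self.graph[v].append(idx + 1)
        self.edges.append([u, 0, -cost])
        return idx

    def solve(self, source, sink):
        total_flow = total_cost = 0
        nodes = list(self.graph)
        while True:
            dist, parent = {source: 0}, {}
            for _ in range(len(nodes)):  # Bellman-Ford relaxation rounds
                changed = False
                for u in nodes:
                    if u not in dist:
                        continue
                    for ei in self.graph[u]:
                        v, cap, cost = self.edges[ei]
                        if cap > 0 and dist[u] + cost < dist.get(v, float("inf")):
                            dist[v], parent[v], changed = dist[u] + cost, ei, True
                if not changed:
                    break
            if sink not in dist:
                return total_flow, total_cost
            path, v = [], sink
            while v != source:           # walk back along the shortest path
                path.append(parent[v])
                v = self.edges[parent[v] ^ 1][0]
            push = min(self.edges[ei][1] for ei in path)
            for ei in path:
                self.edges[ei][1] -= push
                self.edges[ei ^ 1][1] += push
            total_flow += push
            total_cost += push * sum(self.edges[ei][2] for ei in path)


def schedule_anti_affinity(pods, machine_costs, unscheduled_cost=100):
    """Place mutually anti-affine pods so that no machine gets more than one.

    All pods route through one aggregator node; the capacity-1 arc from the
    aggregator to each machine is what enforces the constraint.
    """
    net = MinCostFlow()
    src, sink, agg, unsched = "SRC", "SINK", "XOR_AGG", "UNSCHED"
    for p in pods:
        net.add_arc(src, ("pod", p), 1, 0)
        net.add_arc(("pod", p), agg, 1, 0)
        net.add_arc(("pod", p), unsched, 1, unscheduled_cost)
    machine_arcs = {}
    for m, cost in machine_costs.items():
        machine_arcs[m] = net.add_arc(agg, ("machine", m), 1, cost)
        net.add_arc(("machine", m), sink, 1, 0)
    unsched_arc = net.add_arc(unsched, sink, len(pods), 0)
    _, total_cost = net.solve(src, sink)
    # Flow on an arc equals the residual capacity of its reverse twin.
    per_machine = {m: net.edges[i ^ 1][1] for m, i in machine_arcs.items()}
    return per_machine, net.edges[unsched_arc ^ 1][1], total_cost
```

With three conflicting pods and two machines, this sketch places one pod on each machine and leaves the third unscheduled. Note that flow through the shared aggregator does not preserve pod identity, so the result only counts pods per machine.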

It would be really great if we could get guidance from you folks on solving the Pod affinity/anti-affinity constraints within Kubernetes.

Also, we have not yet started looking at the "AND" flow network construct for addressing Pod affinity constraints. Based on the AND approach that Ionel suggested, it seems we would need to extend the min-cost solver to generalised min-cost flow networks.

Our goal is to demonstrate that current Kubernetes scheduling policies can be implemented using the flow network graph approach. That said, we are running into roadblocks implementing Pod-level affinity/anti-affinity constraints. We have already implemented node affinity/selector constraints successfully, and the code is waiting for your review. Thanks.

deepak-vij commented 6 years ago

Hi Malte/Ionel, a friendly ping to find out if you folks had a chance to review the code we pushed a week or so ago. Thanks.

ICGog commented 6 years ago

Can you change the permissions of the design document so that we can comment on it?

shivramsrivastava commented 6 years ago

@ICGog can you please check now?

m1093782566 commented 6 years ago

cool!