CentaurusInfra / alnair

Intelligent platform for AI workloads
Apache License 2.0

Fractional GPU scheduling design review & implementation #35

Open Fizzbb opened 2 years ago

Fizzbb commented 2 years ago

Go over the basics of the Kubernetes scheduling framework, then review the plan and the plugin changes needed for scheduling Alnair vGPU.

Fizzbb commented 2 years ago

Figured out the plan: modify the device plugin to write the assigned GPU ID to the pod's annotations rather than the node's, since we want a used ID to disappear when the pod completes. The scheduler will then scan all running pods to work out which IDs are in use and which are available.
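
The scheduler-side scan described above could look roughly like the sketch below. It is only an illustration, not the Alnair implementation: the annotation key `example.com/assigned-gpu-ids`, the comma-separated integer ID format, and the simplified `Pod` struct (standing in for the real Kubernetes pod object) are all assumptions made for the example.

```go
package main

import (
	"strconv"
	"strings"
)

// gpuIDAnnotation is a hypothetical annotation key chosen for this sketch;
// the key actually used by the device plugin is not stated in this thread.
const gpuIDAnnotation = "example.com/assigned-gpu-ids"

// Pod is a simplified stand-in for a Kubernetes pod object.
type Pod struct {
	Name        string
	NodeName    string
	Phase       string // e.g. "Pending", "Running", "Succeeded", "Failed"
	Annotations map[string]string
}

// usedGPUIDs collects the GPU IDs annotated on pods that are running on the
// given node. Completed pods are skipped, so their IDs become free again
// without any cleanup on the node object.
func usedGPUIDs(pods []Pod, node string) map[int]bool {
	used := make(map[int]bool)
	for _, p := range pods {
		if p.NodeName != node || p.Phase != "Running" {
			continue
		}
		raw, ok := p.Annotations[gpuIDAnnotation]
		if !ok {
			continue
		}
		// Assumed format: comma-separated integer IDs, e.g. "0,2".
		for _, field := range strings.Split(raw, ",") {
			id, err := strconv.Atoi(strings.TrimSpace(field))
			if err != nil {
				continue // skip malformed entries
			}
			used[id] = true
		}
	}
	return used
}

// availableGPUIDs returns, in ascending order, the IDs in [0, total) that
// are not marked as used.
func availableGPUIDs(total int, used map[int]bool) []int {
	free := []int{}
	for id := 0; id < total; id++ {
		if !used[id] {
			free = append(free, id)
		}
	}
	return free
}
```

For a node with four IDs where one running pod is annotated `"0, 2"` and a completed pod is annotated `"1"`, `availableGPUIDs(4, usedGPUIDs(pods, node))` yields IDs 1 and 3, which is the reason for keeping the ID on the pod: it is released as soon as the pod leaves the running state.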