NVIDIA / k8s-dra-driver

Dynamic Resource Allocation (DRA) for NVIDIA GPUs in Kubernetes
Apache License 2.0

feat: add leader election on dra-controller #132

Closed JasonHe-WQ closed 1 month ago

JasonHe-WQ commented 3 months ago

The existing controller has no leader election, so its behavior can become unpredictable or abnormal if the DRA controller Pod crashes. This PR adds a new function, run, that wraps the main logic so that only the elected leader executes StartController. This should improve the controller's reliability.