Open ashrafguitoni opened 1 year ago
I think it's a valid point to discuss; we already have PDBs for the important data-path components. But it might need some design around when to create/update/delete the PDBs (since min-scale is not always set, and scaling itself is dynamic, including to zero). There is also a certain overhead to adjusting the PDBs on scaling changes, so this is up for discussion.
/triage accepted
I would find having PDBs created based on min-scale rather valuable for my workloads in a semi-disruptive environment with cluster upgrades.
I have had rolling nodes during cluster upgrades get stuck due to the pre-set PDBs on Knative's single-point-of-failure components. It would be good to have default values for these that allow at least single-node disruptions during cluster node rolls and upgrades.
In what area(s)?
/area autoscale
Describe the feature
Although Knative autoscaling can maintain a minimum number of replicas per revision, I think this only covers actions that Knative itself controls. If other actors evict Knative service pods, the service may end up with fewer available pods than the configured minimum. One example of such an actor is the high-performance cluster autoscaler Karpenter, whose consolidation feature can evict pods to bin-pack nodes.
The way I'm trying to mitigate this problem is by manually creating a PodDisruptionBudget targeting the pods of my Knative service, with the PDB's minAvailable value set to the KSVC's autoscaling.knative.dev/min-scale value. I was asked by @dprotaso to mention my case in a GitHub issue here, so please let me know what you think.
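For concreteness, here is a sketch of that manual workaround. The service name my-ksvc and the min-scale value of 2 are hypothetical; the selector assumes the serving.knative.dev/service label that Knative Serving applies to a service's pods.

```yaml
# Illustrative PDB mirroring a KSVC whose
# autoscaling.knative.dev/min-scale annotation is set to "2".
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-ksvc-pdb
spec:
  # Keep at least min-scale pods running during voluntary disruptions
  # (node drains, Karpenter consolidation, etc.).
  minAvailable: 2
  selector:
    matchLabels:
      # Matches pods across all revisions of the service; a per-revision
      # budget could use serving.knative.dev/revision instead.
      serving.knative.dev/service: my-ksvc
```

Note that this only guards against voluntary evictions through the eviction API; it does not block involuntary disruptions, and it has to be kept in sync by hand whenever min-scale changes, which is exactly the maintenance burden a built-in feature would remove.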