Ramyak opened this issue 3 years ago (status: Open)
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Mark this issue as rotten with /lifecycle rotten
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale /lifecycle frozen
This is expected behavior. The ScheduleAnyway constraint is processed during the Scoring (priority function) part of the scheduling process in the scheduler; it is not processed during the Filtering (predicate) part.
CA only uses the Filtering part in its simulations (the PreFilter and Filter extension points, to be precise); see the CheckPredicates function and the FitsAnyNodeMatching function.
ScheduleAnyway is part of the scoring phase of the PodTopologySpread plugin, while DoNotSchedule is part of the filter/predicate phase of the PodTopologySpread plugin.
As long as DoNotSchedule is used, CA should respect the constraint. One problem I see with the current implementation in CA is that we do not support a custom default constraint; we use the default one. If you specify a DoNotSchedule custom default constraint, CA might not respect it.
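For illustration, here is a minimal, hypothetical pod spec (the name, labels, and image are placeholders, not from this issue) that expresses the constraint as DoNotSchedule, which is evaluated in the Filter phase that CA simulates; changing whenUnsatisfiable to ScheduleAnyway would move it to Scoring only, where CA does not see it:

apiVersion: v1
kind: Pod
metadata:
  name: web-0              # placeholder name
  labels:
    app: web               # placeholder label
spec:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      # Filter-phase constraint: CA's simulation respects it.
      # ScheduleAnyway here would only influence Scoring and be ignored by CA.
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app: web
  containers:
    - name: app
      image: registry.k8s.io/pause:3.9   # placeholder image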
@vadasambar
You said that CA respects the default cluster constraints, but also that CA doesn't support ScheduleAnyway.
But the default cluster constraints use ScheduleAnyway according to the docs:
defaultConstraints:
  - maxSkew: 3
    topologyKey: "kubernetes.io/hostname"
    whenUnsatisfiable: ScheduleAnyway
  - maxSkew: 5
    topologyKey: "topology.kubernetes.io/zone"
    whenUnsatisfiable: ScheduleAnyway
Can you elaborate on this one, please?
@jdomag I recently wrote a blog post on this (maybe it should be part of the docs) which might answer your question. Quoting the relevant part here:
CA imports the PreFilter and Filter part of the default scheduler code, i.e., it doesn't allow making any changes to the default behavior. Because of this, CA's simulation of the scheduler won't accurately reflect the actual scheduler running in your cluster if your cluster/control-plane scheduler's behavior differs from CA's simulated scheduler. This creates problems because CA's autoscaling won't accurately match the needs of your cluster. ... CA doesn't consider preferredDuringSchedulingIgnoredDuringExecution because it is part of the Scoring phase of the NodeAffinity scheduler plugin (which comes built in). Every scheduler plugin can act on multiple extension points. NodeAffinity acts on extension points in both the Filtering and Scoring phases. The only problem is that it considers preferredDuringSchedulingIgnoredDuringExecution only in the Scoring phase (the PreScore and Score extension points, to be precise) and not in the Filtering phase. ... Similarly, ScheduleAnyway is part of the scoring phase of the PodTopologySpread plugin.
https://vadasambar.com/post/kubernetes/would-ca-consider-my-soft-constraints/
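To make the NodeAffinity point from the quote concrete, here is a hypothetical pod-spec fragment (label keys and values are placeholders): only the requiredDuringSchedulingIgnoredDuringExecution term participates in the Filter phase that CA simulates, while the preferredDuringSchedulingIgnoredDuringExecution term is evaluated only during Scoring:

affinity:
  nodeAffinity:
    # Filter phase: taken into account by CA's simulation.
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values: ["zone-a"]
    # Scoring phase only (PreScore/Score): not seen by CA's simulation.
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
            - key: node.kubernetes.io/instance-type
              operator: In
              values: ["m5.large"]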
@vadasambar thanks, this is a great article, I wish it was part of the official docs :)
I will try to propose adding it to the docs at the upcoming SIG meeting (and thank you :))
This took us by surprise as well. In my opinion, there should be a big red warning in the Kubernetes docs: https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/#cluster-level-default-constraints. Currently, this looks like a stable feature, but it can cripple your application if you are unlucky. We added a PR to warn future users.
Which component are you using?:
component: cluster-autoscaler
Is your feature request designed to solve a problem? If so describe the problem this feature should solve.:
The scheduler supports PodTopologySpread cluster-level default constraints since Kubernetes release 1.18 (commit).
1. PodTopologySpread defaultConstraints at the cluster level: cluster-autoscaler does not consider PodTopologySpread defaultConstraints set at the cluster level. Pods remain unscheduled and you get the error. Note: pod specs do not have topologySpreadConstraints in this case.
2. PodTopologySpread set at the deployment level: works, since the pod spec then contains topologySpreadConstraints (see the sketch under "Additional context" below).
Describe the solution you'd like.:
Cluster-autoscaler should consider PodTopologySpread cluster-level default constraints when attempting to schedule pods.
Describe any alternative solutions you've considered.:
Additional context.:
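For reference, a cluster-level default constraint of the kind described in scenario 1 is configured in the scheduler configuration roughly as sketched below. This is not taken from the original report; the API version, profile name, and constraint values are illustrative and should be adjusted to your cluster:

apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: default-scheduler
    pluginConfig:
      - name: PodTopologySpread
        args:
          # Cluster-level defaults apply only to pods that do not set
          # topologySpreadConstraints themselves, which is exactly the
          # case CA's simulation does not pick up (scenario 1 above).
          defaultConstraints:
            - maxSkew: 1
              topologyKey: topology.kubernetes.io/zone
              whenUnsatisfiable: DoNotSchedule
          defaultingType: List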