dmitsh opened this issue 4 months ago (status: Open)
Yes, this bug was fixed in #1108, but no release contains it yet. You can work around it by deleting delay.durationFrom and delay.jitterDurationFrom.
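For illustration, after deleting those two fields the delay section of pod-complete.yaml would look roughly like this (a sketch: the field layout follows the upstream kustomize/stage/pod/general/pod-complete.yaml, and the millisecond values here are placeholders, not the upstream defaults):

apiVersion: kwok.x-k8s.io/v1alpha1
kind: Stage
metadata:
  name: pod-complete
spec:
  # resourceRef, selector, and next stay exactly as in the upstream file
  delay:
    durationMilliseconds: 1000        # placeholder; fixed delay before the stage applies
    jitterDurationMilliseconds: 1000  # placeholder
    # the durationFrom and jitterDurationFrom blocks (which read the pod
    # annotations) are the ones deleted as the workaround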
I removed delay.durationFrom and delay.jitterDurationFrom from pod-complete.yaml, but it didn't help. The pod gets completed right away.
I guess I should wait for the next KWOK release and test then.
I tested this scenario with the latest v0.6.0 release and got the same outcome.
I tested it and it worked as expected.
kind create cluster
helm repo add kwok https://kwok.sigs.k8s.io/charts/
helm upgrade --namespace kube-system --install kwok kwok/kwok
helm upgrade --install kwok kwok/stage-fast
wget https://github.com/kubernetes-sigs/kwok/raw/main/kustomize/stage/pod/general/pod-complete.yaml
# Set delay.durationMilliseconds to 10000 in pod-complete.yaml
# Set delay.jitterDurationMilliseconds to 20000 in pod-complete.yaml
kubectl apply -f pod-complete.yaml
# Create a node and a job, the same as yours (a sketch follows below)
# Observe that the pods complete between 10 and 20 seconds
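Since the node and job manifests are not shown in the thread, a minimal sketch of a fake node and job matching these steps might look like this (the names, label, annotation, and toleration follow common KWOK examples and are assumptions, not the reporter's actual manifests):

apiVersion: v1
kind: Node
metadata:
  name: kwok-node-0
  labels:
    type: kwok                  # lets workloads target the fake node
  annotations:
    kwok.x-k8s.io/node: fake    # marks the node as managed by kwok
spec:
  taints:
    - key: kwok.x-k8s.io/node
      value: fake
      effect: NoSchedule
---
apiVersion: batch/v1
kind: Job
metadata:
  name: demo-job
spec:
  completions: 1
  template:
    metadata: {}                # note: no pod-complete.stage.kwok.x-k8s.io/delay annotation
    spec:
      restartPolicy: Never
      nodeSelector:
        type: kwok
      tolerations:
        - key: kwok.x-k8s.io/node
          operator: Exists
          effect: NoSchedule
      containers:
        - name: busybox
          image: busybox        # never actually runs; kwok fakes the pod lifecycle
          command: ["sleep", "3600"]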
Could you explain the difference between durationMilliseconds and jitterDurationMilliseconds? Is this documented somewhere?
I thought the durationMilliseconds is the time the pod is in the running state before switching to "completed", and the jitter is the random value (between 0 and jitterDurationMilliseconds) added to the durationMilliseconds.
Yes, the initial definition was the same as what you describe, but the need to set a time for forced deletion is the reason it changed.
FYI, .metadata.deletionGracePeriodSeconds is only the graceful deletion time and does not apply in this case.
In fact, the definition of the API leaves a lot to be desired; it is planned to introduce CEL as a supplement to JQ.
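Putting the discussion together, one reading of how the delay fields interact is roughly the following (illustrative values; the 10-20 s window comes from the reproduction above, and the rest is interpretation rather than quoted documentation):

delay:
  durationMilliseconds: 10000        # earliest the stage fires: 10 s after the pod starts matching it
  jitterDurationMilliseconds: 20000  # latest the stage fires: 20 s, per the observed 10-20 s window;
                                     # an absolute upper bound, not an offset added on top
  durationFrom:                      # optional per-pod override, a JQ expression reading the annotation
    expressionFrom: '.metadata.annotations["pod-complete.stage.kwok.x-k8s.io/delay"]'
  # jitterDurationFrom works the same way for the upper bound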
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
How to use it?
What happened?
I'm running KWOK v0.5.2 in a cluster. I deployed a set of stages from stage-fast.yaml. Then I updated pod-complete by copying kustomize/stage/pod/general/pod-complete.yaml, setting delay.durationMilliseconds to 10000, and deploying it in the cluster. Then I deployed a job. The job does not have the annotation pod-complete.stage.kwok.x-k8s.io/delay. When the job starts, the pods are marked Completed right away.
What did you expect to happen?
IIUC, in the absence of the pod-complete.stage.kwok.x-k8s.io/delay annotation, delay.durationMilliseconds specifies how long the pods should run before changing the status to Completed.
How can we reproduce it (as minimally and precisely as possible)?
Set delay.durationMilliseconds to 10000 in pod-complete.yaml
Anything else we need to know?
No response
Kwok version
kwok version 0.5.2
OS version