Closed wabouhamad closed 6 years ago
@sjug PTAL
Hey @wabouhamad, right now we only wait for pods to be in Running state when your step size has been met. In this instance, if you changed your TuningSet step size to 10 rather than 40, it would wait.
If you want to give that a try to verify it works: that's the current implementation, not a bug. If we want to change the behaviour, we should open an RFE instead.
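For reference, a sketch of what such a tuningset stanza looks like in an SVT-style config. The field names follow the openshift_scalability example configs and are assumptions here; verify them against your local schema:

```yaml
# Illustrative only -- not taken from the reporter's actual config.
tuningsets:
  - name: default
    pods:
      stepping:
        stepsize: 10   # wait for Running after every 10 pods instead of 40
        pause: 60s     # pause between steps
```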
Hey @sjug, with step size 10 it actually waited indefinitely, since the pods were stuck in Pending state. When I hit Ctrl-C to kill it, it exited with a FAIL message, but it still shows 1 Passed | 0 Failed | 0 Pending:
Ran 1 of 99 Specs in 647.201 seconds FAIL! -- 1 Passed | 0 Failed | 0 Pending | 98 Skipped
Is this expected behavior ?
Feb 19 20:28:13.265: INFO: 1/1 : Created new namespace: pod-affinity-0
Feb 19 20:28:13.266: INFO: The loaded config file is: [{Name:hello-pod Image:docker.io/ocpqe/hello-pod Command:[] Args:[] WorkingDir: Ports:[{Name: HostPort:0 ContainerPort:8080 Protocol:TCP HostIP:}] EnvFrom:[] Env:[] Resources:{Limits:map[cpu:{i:{value:15 scale:-3} d:{Dec:<nil>} s:15m Format:DecimalSI} memory:{i:{value:52428800 scale:0} d:{Dec:<nil>} s:50Mi Format:BinarySI}] Requests:map[cpu:{i:{value:15 scale:-3} d:{Dec:<nil>} s:15m Format:DecimalSI} memory:{i:{value:52428800 scale:0} d:{Dec:<nil>} s:50Mi Format:BinarySI}]} VolumeMounts:[] VolumeDevices:[] LivenessProbe:nil ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy: ImagePullPolicy:IfNotPresent SecurityContext:&SecurityContext{Capabilities:&Capabilities{Add:[],Drop:[],},Privileged:*false,SELinuxOptions:nil,RunAsUser:nil,RunAsNonRoot:nil,ReadOnlyRootFilesystem:nil,AllowPrivilegeEscalation:nil,} Stdin:false StdinOnce:false TTY:false}]
Feb 19 20:28:13.266: INFO: Pod environment variables will not be modified.
Feb 19 20:28:13.266: INFO: 1/10 : Creating pod
Feb 19 20:28:14.283: INFO: 2/10 : Creating pod
Feb 19 20:28:14.293: INFO: 3/10 : Creating pod
Feb 19 20:28:14.300: INFO: 4/10 : Creating pod
Feb 19 20:28:14.310: INFO: 5/10 : Creating pod
Feb 19 20:28:14.321: INFO: 6/10 : Creating pod
Feb 19 20:28:14.330: INFO: 7/10 : Creating pod
Feb 19 20:28:14.339: INFO: 8/10 : Creating pod
Feb 19 20:28:14.346: INFO: 9/10 : Creating pod
Feb 19 20:28:14.358: INFO: 10/10 : Creating pod
Feb 19 20:28:14.369: INFO: Waiting for pods created this step to be running
^C
---------------------------------------------------------
Received interrupt. Running AfterSuite...
^C again to terminate immediately
Feb 19 20:38:53.947: INFO: Running AfterSuite actions on all node
Feb 19 20:38:53.947: INFO: Waiting up to 3m0s for all (but 0) nodes to be ready
STEP: Destroying namespace "extended-test-cl-bnkh8-4nwnz" for this suite.
Feb 19 20:38:59.958: INFO: Waiting up to 30s for server preferred namespaced resources to be successfully discovered
Feb 19 20:39:00.005: INFO: namespace: extended-test-cl-bnkh8-4nwnz, resource: bindings, ignored listing per whitelist
Feb 19 20:39:00.069: INFO: namespace extended-test-cl-bnkh8-4nwnz deletion completed in 6.11985045s
Feb 19 20:39:00.069: INFO: Running AfterSuite actions on node 1
Ran 1 of 99 Specs in 647.201 seconds
FAIL! -- 1 Passed | 0 Failed | 0 Pending | 98 Skipped
Are you asking whether the waiting indefinitely is expected, or the response to SIGINT?
The timeout for the tuningset is user-configurable, so just set tuning.Pods.Stepping.Timeout in the config and it will not wait indefinitely.
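A sketch of where that timeout would sit in the config. Field names are assumptions based on the SVT examples, not copied from a verified schema:

```yaml
# Illustrative only; bounds the per-step wait so pods stuck in
# Pending produce a timeout instead of an indefinite hang.
tuningsets:
  - name: default
    pods:
      stepping:
        stepsize: 10
        timeout: 300s  # tuning.Pods.Stepping.Timeout
```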
As for the SIGINT handling: at the point where it is received during the wait, there's nothing after it that would fail. So it is expected behaviour for CL.
@sjug, I was referring to the counts in the last FAIL message. Shouldn't the Failed counter be 1 and Passed 0, since none of the pods made it to Running state? Instead we see:
Ran 1 of 99 Specs in 647.201 seconds FAIL! -- 1 Passed | 0 Failed | 0 Pending | 98 Skipped
> As for the SIGINT handling: at the point where it is received during the wait, there's nothing after it that would fail. So it is expected behaviour for CL.
How would it know whether the pods made it into Running state? You sent SIGINT during execution. As I said in my last message (quoted above), there's nothing else that would fail after that check, so it passed all checks.
That's right, since you are not trapping the SIGINT. Closing this issue since it is expected behavior. Thx.
Cluster Loader (test/extended/cluster/cl.go) does not report the correct number of pods it created when they are in Pending state. It reports success immediately after creating the pods and exits.
Version
openshift v3.9.0-0.45.0 kubernetes v1.9.1+a0ce1bc657 etcd 3.2.8
Steps To Reproduce
See config files in additional information section
Current Result
Pods get created and stay stuck in Pending state. Cluster Loader exits with a success result:
Ran 1 of 444 Specs in 6.740 seconds SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 443 Skipped PASS
Expected Result
Cluster Loader should report the correct number of pending pods, or wait for the pods to reach Running state before exiting.
Additional Information
~/svt/openshift_scalability/config/golang # cat pod-affinity-security-in-s1.yaml
cat ../../content/pod-pod-affinity-in-s1.json
--------------------------------- output:
============