Closed qiliRedHat closed 1 year ago
Test 1 PARAMETERS 1000 2 5 10
01-13 11:06:45.182 ====Results====
01-13 11:06:45.182 Time taken for scaling up to 2 replicas for applications : 422 seconds
01-13 11:06:45.182 Time taken for scaling up to 5 replicas for applications : 433 seconds
01-13 11:06:45.182 Time taken for scaling up to 10 replicas for applications : 525 seconds
01-13 11:06:45.182 ====Test Passed====
Test 2 PARAMETERS 1000 20 ENV_VARS SCALE_ONLY=true
https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/scale-ci/job/e2e-benchmarking-multibranch-pipeline/job/regression-test/230/console
01-13 11:29:18.240 ====Results====
01-13 11:29:18.240 Time taken for scaling up to 20 replicas for applications : 1010 seconds
01-13 11:29:18.240 ====Test Passed====
Some known issues with the test
I read the documentation about DeploymentConfig objects, https://docs.openshift.com/container-platform/4.11/applications/deployments/what-deployments-are.html#deployments-design_what-deployments-are, which says:
For DeploymentConfig objects, if a node running a deployer pod goes down, it will not get replaced. The process waits until the node comes back online or is manually deleted. Manually deleting the node also deletes the corresponding pod. This means that you can not delete the pod to unstick the rollout, as the kubelet is responsible for deleting the associated pod.
In the test I set enable_spot_instance_workers: "no" to avoid worker node recreation, which may cause DeploymentConfig rollout failures.
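As an aside, deployer pods for a DeploymentConfig are conventionally named "dc-name-version-deploy", which gives a quick way to spot leftover or stuck ones in listings like those below. A minimal sketch against canned output (the namespace and pod names are invented for illustration; in a real run the input would come from "oc get po -A --no-headers"):

```shell
# Canned pod listing standing in for: oc get po -A --no-headers
pods='conc-registry-pull-1  cakephp-mysql-persistent-1-deploy  0/1  Completed  0  60m
conc-registry-pull-1  cakephp-mysql-persistent-1-26mxp   1/1  Running    0  25m'

# Deployer pods end in "-deploy"; print namespace, name, and status for each.
printf '%s\n' "$pods" | awk '$2 ~ /-deploy$/ {print $1, $2, $4}'
```

This only keys off the naming convention; matching on the deployer pod labels would be more robust on a live cluster.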
% oc get po -A -l deploymentconfig=cakephp-mysql-persistent | grep -c Running
22774
% oc get po -A -l deploymentconfig=cakephp-mysql-persistent | grep -v Running | head -n 2
NAMESPACE NAME READY STATUS RESTARTS AGE
conc-registry-pull-1000 cakephp-mysql-persistent-1-542sg 0/1 Pending 0 147m
% oc describe po -n conc-registry-pull-1000 cakephp-mysql-persistent-1-542sg
....
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 149m default-scheduler 0/209 nodes are available: 200 Insufficient memory, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }, 9 node(s) didn't match Pod's node affinity/selector. preemption: 0/209 nodes are available: 200 No preemption victims found for incoming pod, 9 Preemption is not helpful for scheduling.
% oc get po -A -l deploymentconfig=cakephp-mysql-persistent --no-headers | wc -l
30000
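The two counts above (22774 Running out of 30000 total) can be combined with plain shell arithmetic to get the number of pods still not Running; a sketch with canned numbers standing in for the live "oc get" queries:

```shell
# In a real run these would come from:
#   total=$(oc get po -A -l deploymentconfig=cakephp-mysql-persistent --no-headers | wc -l)
#   running=$(oc get po -A -l deploymentconfig=cakephp-mysql-persistent --no-headers | grep -c Running)
total=30000
running=22774

echo "pods not yet Running: $((total - running))"
```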
1000 namespaces 40 replicas (note: the label selector in the command below is missing the leading "c" of "cakephp", which is why the query returns no resources):
% oc get po -A -l deploymentconfig=akephp-mysql-persistent -v 9
I0113 13:20:33.150360 56389 loader.go:374] Config loaded from file: /Users/qili/Downloads/kubeconfig
I0113 13:20:33.161010 56389 round_trippers.go:466] curl -v -XGET -H "Accept: application/json;as=Table;v=v1;g=meta.k8s.io,application/json;as=Table;v=v1beta1;g=meta.k8s.io,application/json" -H "User-Agent: oc/4.12.0 (darwin/amd64) kubernetes/854f807" 'https://api.qili-awsovn4xns.qe.devcluster.openshift.com:6443/api/v1/pods?labelSelector=deploymentconfig%3Dakephp-mysql-persistent&limit=500'
I0113 13:20:33.168606 56389 round_trippers.go:495] HTTP Trace: DNS Lookup for api.qili-awsovn4xns.qe.devcluster.openshift.com resolved to [{52.15.154.40 } {3.17.213.176 } {3.139.154.187 }]
I0113 13:20:33.379476 56389 round_trippers.go:510] HTTP Trace: Dial to tcp:52.15.154.40:6443 succeed
I0113 13:20:36.168950 56389 round_trippers.go:553] GET https://api.qili-awsovn4xns.qe.devcluster.openshift.com:6443/api/v1/pods?labelSelector=deploymentconfig%3Dakephp-mysql-persistent&limit=500 200 OK in 3007 milliseconds
I0113 13:20:36.169085 56389 round_trippers.go:570] HTTP Statistics: DNSLookup 7 ms Dial 210 ms TLSHandshake 318 ms ServerProcessing 2469 ms Duration 3007 ms
I0113 13:20:36.169111 56389 round_trippers.go:577] Response Headers:
I0113 13:20:36.169137 56389 round_trippers.go:580] Date: Fri, 13 Jan 2023 05:20:36 GMT
I0113 13:20:36.169159 56389 round_trippers.go:580] Audit-Id: f36129c5-9c8a-4f50-8a13-12c30b7973b4
I0113 13:20:36.169181 56389 round_trippers.go:580] Cache-Control: no-cache, private
I0113 13:20:36.169202 56389 round_trippers.go:580] Content-Type: application/json
I0113 13:20:36.169223 56389 round_trippers.go:580] Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
I0113 13:20:36.169246 56389 round_trippers.go:580] X-Kubernetes-Pf-Flowschema-Uid: d0c9ddee-64a9-490a-97d7-1faa7e57d1c5
I0113 13:20:36.169267 56389 round_trippers.go:580] X-Kubernetes-Pf-Prioritylevel-Uid: b8b794ec-4994-46bd-9c83-0ef879e16eeb
I0113 13:20:36.169289 56389 round_trippers.go:580] Content-Length: 2936
I0113 13:20:36.169550 56389 request.go:1154] Response Body: {"kind":"Table","apiVersion":"meta.k8s.io/v1","metadata":{"resourceVersion":"1340642"},"columnDefinitions":[{"name":"Name","type":"string","format":"name","description":"Name must be unique within a namespace. Is required when creating resources, although some resources may allow a client to request the generation of an appropriate name automatically. Name is primarily intended for creation idempotence and configuration definition. Cannot be updated. More info: http://kubernetes.io/docs/user-guide/identifiers#names","priority":0},{"name":"Ready","type":"string","format":"","description":"The aggregate readiness state of this pod for accepting traffic.","priority":0},{"name":"Status","type":"string","format":"","description":"The aggregate status of the containers in this pod.","priority":0},{"name":"Restarts","type":"string","format":"","description":"The number of times the containers in this pod have been restarted and when the last container in this pod has restarted.","priority":0},{"name":"Age","type":"string","format":"","description":"CreationTimestamp is a timestamp representing the server time when this object was created. It is not guaranteed to be set in happens-before order across separate operations. Clients may not set this value. It is represented in RFC3339 form and is in UTC.\n\nPopulated by the system. Read-only. Null for lists. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata","priority":0},{"name":"IP","type":"string","format":"","description":"IP address allocated to the pod. Routable at least within the cluster. Empty if not yet allocated.","priority":1},{"name":"Node","type":"string","format":"","description":"NodeName is a request to schedule this pod onto a specific node. If it is non-empty, the scheduler simply schedules this pod onto that node, assuming that it fits resource requirements.","priority":1},{"name":"Nominated Node","type":"string","format":"","description":"nominatedNodeName is set only when this pod preempts other pods on the node, but it cannot be scheduled right away as preemption victims receive their graceful termination periods. This field does not guarantee that the pod will be scheduled on this node. Scheduler may decide to place the pod elsewhere if other nodes become available sooner. Scheduler may also decide to give the resources on this node to a higher priority pod that is created after preemption. As a result, this field may be different than PodSpec.nodeName when the pod is scheduled.","priority":1},{"name":"Readiness Gates","type":"string","format":"","description":"If specified, all readiness gates will be evaluated for pod readiness. A pod is ready when all its containers are ready AND all conditions specified in the readiness gates have status equal to \"True\" More info: https://git.k8s.io/enhancements/keps/sig-network/580-pod-readiness-gates","priority":1}],"rows":[]}
No resources found
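As a sanity check on the HTTP statistics line in the trace above, the per-phase timings roughly sum to the reported total duration (the few leftover milliseconds are response-read and client overhead that are not broken out):

```shell
# Timings (ms) taken from the round_trippers HTTP Statistics log line above.
dns=7; dial=210; tls=318; server=2469
total=3007   # reported Duration in ms

sum=$((dns + dial + tls + server))
echo "phase sum: ${sum} ms (reported total: ${total} ms)"
```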
% oc get po -n conc-registry-pull-1
NAME READY STATUS RESTARTS AGE
cakephp-mysql-persistent-1-26mxp 1/1 Running 0 25m
cakephp-mysql-persistent-1-2gjzm 1/1 Running 0 61m
....
@mffiedler and @paigerube14 PTAL
@paigerube14 I found this PR had escaped my attention and was not merged after I created the test cases for 4.12.
I tested with this branch on 4.13 and the test passed. Please help review and merge this PR.
New test case: OCP-9226 - Concurrent pull from the registry https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/scale-ci/job/e2e-benchmarking-multibranch-pipeline/job/regression-test/260/
Updated test case: OCP-26279 - [BZ 1752636] Networkpolicy should be applied for large namespaces https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/scale-ci/job/e2e-benchmarking-multibranch-pipeline/job/regression-test/259
@paigerube14 I did a rebase. Please review this when you have time.
/lgtm
To automate regression test case OCP-9226 - Concurrent pull from the registry: https://polarion.engineering.redhat.com/polarion/redirect/project/OSE/workitem?id=OCP-9226
Jira task: https://issues.redhat.com/browse/OCPQE-13437
Steps