kubernetes / test-infra

Test infrastructure for the Kubernetes project.

K8s Prow is limited to 1k concurrent pods, or podgc will fight with sinker #11594

Closed mm4tt closed 5 years ago

mm4tt commented 5 years ago

Our scalability tests started behaving strangely over the weekend. The Prow jobs that run the tests are being scheduled when they shouldn't be.

Example: name: ci-kubernetes-e2e-gke-large-performance-regional Config:

The test should run once a week, but it was scheduled 4 times over the last weekend.

There are other jobs behaving similarly to this one, i.e. they are scheduled and run when they shouldn't be run.

This is wreaking havoc in our scalability tests. Due to quota issues, the tests share the same GCP projects. Now, because they run when they shouldn't, they have started interfering with each other, causing multiple tests to fail.

mm4tt commented 5 years ago

/priority critical-urgent

mm4tt commented 5 years ago

@fejta, could you take a look or reassign?
/assign @fejta

mm4tt commented 5 years ago

/sig testing

stevekuznetsov commented 5 years ago

@mm4tt FYI the best place to escalate something like this is in #testing-ops on Slack, pinging @test-infra-oncall. I've shot a message over here

spiffxp commented 5 years ago

/milestone v1.14
/unassign @fejta
not available at the moment
/assign @amwat
as 1.14 test-infra lead, and currently on-call per: go.k8s.io/oncall

BenTheElder commented 5 years ago

> There are other jobs behaving similarly to this one, i.e. they are scheduled and run when they shouldn't be run.

could you list them?

Strangely https://prow.k8s.io/?job=ci-kubernetes-e2e-gke-large-performance-regional has one entry at Mar 03 00:01:39

https://prow.k8s.io/rerun?prowjob=931c4290-3d8a-11e9-9c9a-0a580a6c0e78

```yaml
apiVersion: prow.k8s.io/v1
kind: ProwJob
metadata:
  annotations:
    prow.k8s.io/job: ci-kubernetes-e2e-gke-large-performance-regional
  creationTimestamp: null
  labels:
    created-by-prow: "true"
    preset-k8s-ssh: "true"
    preset-service-account: "true"
    prow.k8s.io/id: 931c4290-3d8a-11e9-9c9a-0a580a6c0e78
    prow.k8s.io/job: ci-kubernetes-e2e-gke-large-performance-regional
    prow.k8s.io/type: periodic
  name: e89951f0-3e98-11e9-b844-0a580a6c0923
spec:
  agent: kubernetes
  cluster: default
  job: ci-kubernetes-e2e-gke-large-performance-regional
  namespace: test-pods
  pod_spec:
    containers:
    - args:
      - --timeout=600
      - --repo=k8s.io/kubernetes=master
      - --repo=k8s.io/perf-tests=master
      - --root=/go/src
      - --scenario=kubernetes_e2e
      - --
      - --cluster=gke-regional-cluster
      - --deployment=gke
      - --extract=ci/latest-1.13
      - --gcp-cloud-sdk=gs://cloud-sdk-testing/ci/staging
      - --gcp-node-image=gci
      - --gcp-project=kubernetes-scale
      - --gcp-region=us-east1
      - --gke-create-command=container clusters create --quiet --enable-ip-alias --create-subnetwork name=ip-alias-subnet-regional --cluster-ipv4-cidr=/12 --services-ipv4-cidr=/19
      - --gke-environment=staging
      - --gke-node-locations=us-east1-b
      - --gke-shape={"default":{"Nodes":1999,"MachineType":"n1-standard-1"},"heapster-pool":{"Nodes":1,"MachineType":"n1-standard-8"}}
      - --provider=gke
      - --test=false
      - --test-cmd=$GOPATH/src/k8s.io/perf-tests/run-e2e.sh
      - --test-cmd-args=cluster-loader2
      - --test-cmd-args=--nodes=2000
      - --test-cmd-args=--provider=gke
      - --test-cmd-args=--report-dir=/workspace/_artifacts
      - --test-cmd-args=--testconfig=testing/density/config.yaml
      - --test-cmd-args=--testconfig=testing/load/config.yaml
      - --test-cmd-args=--testoverrides=./testing/density/2000_nodes/override.yaml
      - --test-cmd-name=ClusterLoaderV2
      - --timeout=570m
      - --use-logexporter
      env:
      - name: GOOGLE_APPLICATION_CREDENTIALS
        value: /etc/service-account/service-account.json
      - name: E2E_GOOGLE_APPLICATION_CREDENTIALS
        value: /etc/service-account/service-account.json
      - name: USER
        value: prow
      - name: JENKINS_GCE_SSH_PRIVATE_KEY_FILE
        value: /etc/ssh-key-secret/ssh-private
      - name: JENKINS_GCE_SSH_PUBLIC_KEY_FILE
        value: /etc/ssh-key-secret/ssh-public
      image: gcr.io/k8s-testimages/kubekins-e2e:v20190301-76bc03340-master
      name: ""
      resources:
        requests:
          cpu: "6"
          memory: 16Gi
      volumeMounts:
      - mountPath: /etc/service-account
        name: service
        readOnly: true
      - mountPath: /etc/ssh-key-secret
        name: ssh
        readOnly: true
    volumes:
    - name: service
      secret:
        secretName: service-account
    - name: ssh
      secret:
        defaultMode: 256
        secretName: ssh-key-secret
  type: periodic
status:
  startTime: "2019-03-04T16:16:46Z"
  state: triggered
```

https://testgrid.k8s.io/sig-scalability-gke#gke-large-performance-regional

stevekuznetsov commented 5 years ago

@BenTheElder I recently added more logging to horologium to discern why a job was triggered -- what are the logs saying there?

BenTheElder commented 5 years ago

All of the entries in testgrid are showing the same pod ID on deck, 931c4290-3d8a-11e9-9c9a-0a580a6c0e78 (click "more info" on the "Test started ..." box)

https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gke-large-performance-regional/161
https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gke-large-performance-regional/162
https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gke-large-performance-regional/163
https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gke-large-performance-regional/164

BenTheElder commented 5 years ago

the logs for horologium don't seem to show much so far (stackdriver export for horologium pod, text:ci-kubernetes-e2e-gke-large-performance-regional, back to 3/2/19 2:00:00 AM)

```json
[
  {
    "insertId": "1ulicp5fepvsxu",
    "jsonPayload": {
      "level": "info",
      "should-trigger": true,
      "name": "931c4290-3d8a-11e9-9c9a-0a580a6c0e78",
      "component": "horologium",
      "type": "periodic",
      "job": "ci-kubernetes-e2e-gke-large-performance-regional",
      "msg": "Triggering new run of cron periodic.",
      "previous-found": true
    },
    "resource": {
      "type": "container",
      "labels": {
        "zone": "us-central1-f",
        "pod_id": "horologium-78fb7b98f8-tn8sc",
        "project_id": "k8s-prow",
        "cluster_name": "prow",
        "container_name": "horologium",
        "namespace_id": "default",
        "instance_id": "5347705516640603225"
      }
    },
    "timestamp": "2019-03-03T08:01:39Z",
    "severity": "ERROR",
    "labels": {
      "container.googleapis.com/pod_name": "horologium-78fb7b98f8-tn8sc",
      "container.googleapis.com/stream": "stderr",
      "container.googleapis.com/namespace_name": "default",
      "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-blkqd"
    },
    "logName": "projects/k8s-prow/logs/horologium",
    "receiveTimestamp": "2019-03-03T08:01:45.470497268Z"
  },
  {
    "insertId": "15wtu9nfepwm0g",
    "jsonPayload": {
      "component": "horologium",
      "msg": "Triggering cron job ci-kubernetes-e2e-gke-large-performance-regional.",
      "client": "cron",
      "level": "info"
    },
    "resource": {
      "type": "container",
      "labels": {
        "zone": "us-central1-f",
        "pod_id": "horologium-78fb7b98f8-tn8sc",
        "project_id": "k8s-prow",
        "cluster_name": "prow",
        "container_name": "horologium",
        "namespace_id": "default",
        "instance_id": "5347705516640603225"
      }
    },
    "timestamp": "2019-03-03T08:01:00Z",
    "severity": "ERROR",
    "labels": {
      "container.googleapis.com/pod_name": "horologium-78fb7b98f8-tn8sc",
      "container.googleapis.com/stream": "stderr",
      "container.googleapis.com/namespace_name": "default",
      "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-blkqd"
    },
    "logName": "projects/k8s-prow/logs/horologium",
    "receiveTimestamp": "2019-03-03T08:01:06.402048467Z"
  }
]
```
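
For anyone repeating this kind of search, a rough sketch of how an export like the one above could be condensed into a timeline. The helper name `summarize_entries` and the trimmed `SAMPLE_EXPORT` are hypothetical and only for illustration; the sketch assumes each entry is a dict with `timestamp` and `jsonPayload` keys, as in the export above, and that timestamps are uniform UTC ISO 8601 strings so that string sorting matches time order.

```python
import json
from typing import Any, Dict, List, Tuple

# Trimmed copies of the two horologium entries above (most fields omitted).
SAMPLE_EXPORT: List[Dict[str, Any]] = [
    {
        "jsonPayload": {
            "name": "931c4290-3d8a-11e9-9c9a-0a580a6c0e78",
            "component": "horologium",
            "job": "ci-kubernetes-e2e-gke-large-performance-regional",
            "msg": "Triggering new run of cron periodic.",
        },
        "timestamp": "2019-03-03T08:01:39Z",
    },
    {
        "jsonPayload": {
            "component": "horologium",
            "msg": "Triggering cron job ci-kubernetes-e2e-gke-large-performance-regional.",
        },
        "timestamp": "2019-03-03T08:01:00Z",
    },
]


def summarize_entries(entries: List[Dict[str, Any]], needle: str) -> List[Tuple[str, str, str]]:
    """Return (timestamp, component, msg) for entries mentioning needle, oldest first."""
    rows = []
    for entry in entries:
        payload = entry.get("jsonPayload", {})
        # Search the serialized payload: the ID or job name may sit in
        # "name", in "job", or be embedded inside "msg".
        if needle not in json.dumps(payload):
            continue
        rows.append(
            (
                entry.get("timestamp", ""),
                payload.get("component", ""),
                payload.get("msg", ""),
            )
        )
    # Uniform UTC ISO 8601 strings sort chronologically as plain strings.
    return sorted(rows)


if __name__ == "__main__":
    job = "ci-kubernetes-e2e-gke-large-performance-regional"
    for timestamp, component, msg in summarize_entries(SAMPLE_EXPORT, job):
        print(timestamp, component, msg)
```

Searching by job name matches both entries, while searching by the ProwJob ID matches only the entry that carries it in `name`.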

BenTheElder commented 5 years ago

plank for text:931c4290-3d8a-11e9-9c9a-0a580a6c0e78 🤔

```json [ { "insertId": "a85i6flj274m", "jsonPayload": { "component": "plank", "msg": "ReplaceProwJob(931c4290-3d8a-11e9-9c9a-0a580a6c0e78, {{ProwJob prow.k8s.io/v1} {931c4290-3d8a-11e9-9c9a-0a580a6c0e78 default /apis/prow.k8s.io/v1/namespaces/default/prowjobs/931c4290-3d8a-11e9-9c9a-0a580a6c0e78 933a4706-3d8a-11e9-898b-42010a80003a 189614343 1 2019-03-03 08:01:39 +0000 UTC map[created-by-prow:true preset-k8s-ssh:true preset-service-account:true prow.k8s.io/id:931c4290-3d8a-11e9-9c9a-0a580a6c0e78 prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional prow.k8s.io/type:periodic] map[prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional] [] nil [] } {periodic kubernetes default test-pods ci-kubernetes-e2e-gke-large-performance-regional [] false 0 false &PodSpec{Volumes:[{service {nil nil nil nil nil SecretVolumeSource{SecretName:service-account,Items:[],DefaultMode:nil,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}} {ssh {nil nil nil nil nil &SecretVolumeSource{SecretName:ssh-key-secret,Items:[],DefaultMode:*256,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}}],Containers:[{ gcr.io/k8s-testimages/kubekins-e2e:v20190301-76bc03340-master [] [--timeout=600 --repo=k8s.io/kubernetes=master --repo=k8s.io/perf-tests=master --root=/go/src --scenario=kubernetes_e2e -- --cluster=gke-regional-cluster --deployment=gke --extract=ci/latest-1.13 --gcp-cloud-sdk=gs://cloud-sdk-testing/ci/staging --gcp-node-image=gci --gcp-project=kubernetes-scale --gcp-region=us-east1 --gke-create-command=container clusters create --quiet --enable-ip-alias --create-subnetwork name=ip-alias-subnet-regional --cluster-ipv4-cidr=/12 --services-ipv4-cidr=/19 --gke-environment=staging --gke-node-locations=us-east1-b --gke-shape={\"default\":{\"Nodes\":1999,\"MachineType\":\"n1-standard-1\"},\"heapster-pool\":{\"Nodes\":1,\"MachineType\":\"n1-standard-8\"}} --provider=gke 
--test=false --test-cmd=$GOPATH/src/k8s.io/perf-tests/run-e2e.sh --test-cmd-args=cluster-loader2 --test-cmd-args=--nodes=2000 --test-cmd-args=--provider=gke --test-cmd-args=--report-dir=/workspace/_artifacts --test-cmd-args=--testconfig=testing/density/config.yaml --test-cmd-args=--testconfig=testing/load/config.yaml --test-cmd-args=--testoverrides=./testing/density/2000_nodes/override.yaml --test-cmd-name=ClusterLoaderV2 --timeout=570m --use-logexporter] [] [] [{GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {E2E_GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {USER prow nil} {JENKINS_GCE_SSH_PRIVATE_KEY_FILE /etc/ssh-key-secret/ssh-private nil} {JENKINS_GCE_SSH_PUBLIC_KEY_FILE /etc/ssh-key-secret/ssh-public nil}] {map[] map[cpu:{{6 0} {} 6 DecimalSI} memory:{{17179869184 0} {} BinarySI}]} [{service true /etc/service-account } {ssh true /etc/ssh-key-secret }] [] nil nil nil nil false false false}],RestartPolicy:,TerminationGracePeriodSeconds:nil,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[],Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[],AutomountServiceAccountToken:nil,Tolerations:[],HostAliases:[],PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[],RuntimeClassName:nil,} } {2019-03-03 08:01:39 +0000 UTC pending Job triggered. 
https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gke-large-performance-regional/165/ 931c4290-3d8a-11e9-9c9a-0a580a6c0e78 165 map[github-reporter:pending]}})", "client": "kube", "level": "debug" }, "resource": { "type": "container", "labels": { "zone": "us-central1-f", "pod_id": "plank-9f6cb7fbb-4jdf2", "project_id": "k8s-prow", "cluster_name": "prow", "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921" } }, "timestamp": "2019-03-04T13:08:23Z", "severity": "ERROR", "labels": { "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g", "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-04T13:08:28.783717431Z" }, { "insertId": "a85i6flj274l", "jsonPayload": { "client": "kube", "level": "debug", "component": "plank", "msg": "GetProwJob(931c4290-3d8a-11e9-9c9a-0a580a6c0e78)" }, "resource": { "type": "container", "labels": { "pod_id": "plank-9f6cb7fbb-4jdf2", "zone": "us-central1-f", "project_id": "k8s-prow", "cluster_name": "prow", "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921" } }, "timestamp": "2019-03-04T13:08:23Z", "severity": "ERROR", "labels": { "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default", "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-04T13:08:28.783717431Z" }, { "insertId": "a85i6flj2748", "jsonPayload": { "msg": "ReplaceProwJob(931c4290-3d8a-11e9-9c9a-0a580a6c0e78, {{ProwJob prow.k8s.io/v1} {931c4290-3d8a-11e9-9c9a-0a580a6c0e78 default /apis/prow.k8s.io/v1/namespaces/default/prowjobs/931c4290-3d8a-11e9-9c9a-0a580a6c0e78 933a4706-3d8a-11e9-898b-42010a80003a 
189487735 1 2019-03-03 08:01:39 +0000 UTC map[created-by-prow:true preset-k8s-ssh:true preset-service-account:true prow.k8s.io/id:931c4290-3d8a-11e9-9c9a-0a580a6c0e78 prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional prow.k8s.io/type:periodic] map[prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional] [] nil [] } {periodic kubernetes default test-pods ci-kubernetes-e2e-gke-large-performance-regional [] false 0 false &PodSpec{Volumes:[{service {nil nil nil nil nil SecretVolumeSource{SecretName:service-account,Items:[],DefaultMode:nil,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}} {ssh {nil nil nil nil nil &SecretVolumeSource{SecretName:ssh-key-secret,Items:[],DefaultMode:*256,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}}],Containers:[{ gcr.io/k8s-testimages/kubekins-e2e:v20190301-76bc03340-master [] [--timeout=600 --repo=k8s.io/kubernetes=master --repo=k8s.io/perf-tests=master --root=/go/src --scenario=kubernetes_e2e -- --cluster=gke-regional-cluster --deployment=gke --extract=ci/latest-1.13 --gcp-cloud-sdk=gs://cloud-sdk-testing/ci/staging --gcp-node-image=gci --gcp-project=kubernetes-scale --gcp-region=us-east1 --gke-create-command=container clusters create --quiet --enable-ip-alias --create-subnetwork name=ip-alias-subnet-regional --cluster-ipv4-cidr=/12 --services-ipv4-cidr=/19 --gke-environment=staging --gke-node-locations=us-east1-b --gke-shape={\"default\":{\"Nodes\":1999,\"MachineType\":\"n1-standard-1\"},\"heapster-pool\":{\"Nodes\":1,\"MachineType\":\"n1-standard-8\"}} --provider=gke --test=false --test-cmd=$GOPATH/src/k8s.io/perf-tests/run-e2e.sh --test-cmd-args=cluster-loader2 --test-cmd-args=--nodes=2000 --test-cmd-args=--provider=gke --test-cmd-args=--report-dir=/workspace/_artifacts --test-cmd-args=--testconfig=testing/density/config.yaml --test-cmd-args=--testconfig=testing/load/config.yaml 
--test-cmd-args=--testoverrides=./testing/density/2000_nodes/override.yaml --test-cmd-name=ClusterLoaderV2 --timeout=570m --use-logexporter] [] [] [{GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {E2E_GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {USER prow nil} {JENKINS_GCE_SSH_PRIVATE_KEY_FILE /etc/ssh-key-secret/ssh-private nil} {JENKINS_GCE_SSH_PUBLIC_KEY_FILE /etc/ssh-key-secret/ssh-public nil}] {map[] map[cpu:{{6 0} {} 6 DecimalSI} memory:{{17179869184 0} {} BinarySI}]} [{service true /etc/service-account } {ssh true /etc/ssh-key-secret }] [] nil nil nil nil false false false}],RestartPolicy:,TerminationGracePeriodSeconds:nil,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[],Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[],AutomountServiceAccountToken:nil,Tolerations:[],HostAliases:[],PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[],RuntimeClassName:nil,} } {2019-03-03 08:01:39 +0000 UTC pending Job triggered. 
https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gke-large-performance-regional/165/ 931c4290-3d8a-11e9-9c9a-0a580a6c0e78 165 map[github-reporter:pending]}})", "client": "kube", "level": "debug", "component": "plank" }, "resource": { "type": "container", "labels": { "cluster_name": "prow", "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921", "zone": "us-central1-f", "pod_id": "plank-9f6cb7fbb-4jdf2", "project_id": "k8s-prow" } }, "timestamp": "2019-03-04T13:08:23Z", "severity": "ERROR", "labels": { "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g", "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-04T13:08:28.783717431Z" }, { "insertId": "a85i6flj2747", "jsonPayload": { "job": "ci-kubernetes-e2e-gke-large-performance-regional", "msg": "Pod is missing, starting a new pod", "level": "info", "name": "931c4290-3d8a-11e9-9c9a-0a580a6c0e78", "type": "periodic", "component": "plank" }, "resource": { "type": "container", "labels": { "zone": "us-central1-f", "pod_id": "plank-9f6cb7fbb-4jdf2", "project_id": "k8s-prow", "cluster_name": "prow", "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921" } }, "timestamp": "2019-03-04T13:08:23Z", "severity": "ERROR", "labels": { "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g", "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-04T13:08:28.783717431Z" }, { "insertId": "a85i6flj2746", "jsonPayload": { "component": "plank", "msg": "CreatePod({{ } {931c4290-3d8a-11e9-9c9a-0a580a6c0e78 0 0001-01-01 00:00:00 +0000 UTC 
map[prow.k8s.io/id:931c4290-3d8a-11e9-9c9a-0a580a6c0e78 created-by-prow:true prow.k8s.io/type:periodic prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional preset-k8s-ssh:true preset-service-account:true] map[prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional] [] nil [] } {[{service {nil nil nil nil nil &SecretVolumeSource{SecretName:service-account,Items:[],DefaultMode:nil,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}} {ssh {nil nil nil nil nil &SecretVolumeSource{SecretName:ssh-key-secret,Items:[],DefaultMode:*256,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}}] [] [{test gcr.io/k8s-testimages/kubekins-e2e:v20190301-76bc03340-master [] [--timeout=600 --repo=k8s.io/kubernetes=master --repo=k8s.io/perf-tests=master --root=/go/src --scenario=kubernetes_e2e -- --cluster=gke-regional-cluster --deployment=gke --extract=ci/latest-1.13 --gcp-cloud-sdk=gs://cloud-sdk-testing/ci/staging --gcp-node-image=gci --gcp-project=kubernetes-scale --gcp-region=us-east1 --gke-create-command=container clusters create --quiet --enable-ip-alias --create-subnetwork name=ip-alias-subnet-regional --cluster-ipv4-cidr=/12 --services-ipv4-cidr=/19 --gke-environment=staging --gke-node-locations=us-east1-b --gke-shape={\"default\":{\"Nodes\":1999,\"MachineType\":\"n1-standard-1\"},\"heapster-pool\":{\"Nodes\":1,\"MachineType\":\"n1-standard-8\"}} --provider=gke --test=false --test-cmd=$GOPATH/src/k8s.io/perf-tests/run-e2e.sh --test-cmd-args=cluster-loader2 --test-cmd-args=--nodes=2000 --test-cmd-args=--provider=gke --test-cmd-args=--report-dir=/workspace/_artifacts --test-cmd-args=--testconfig=testing/density/config.yaml --test-cmd-args=--testconfig=testing/load/config.yaml --test-cmd-args=--testoverrides=./testing/density/2000_nodes/override.yaml --test-cmd-name=ClusterLoaderV2 --timeout=570m --use-logexporter] [] [] [{GOOGLE_APPLICATION_CREDENTIALS 
/etc/service-account/service-account.json nil} {E2E_GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {USER prow nil} {JENKINS_GCE_SSH_PRIVATE_KEY_FILE /etc/ssh-key-secret/ssh-private nil} {JENKINS_GCE_SSH_PUBLIC_KEY_FILE /etc/ssh-key-secret/ssh-public nil} {BUILD_ID 165 nil} {BUILD_NUMBER 165 nil} {JOB_NAME ci-kubernetes-e2e-gke-large-performance-regional nil} {JOB_SPEC {\"type\":\"periodic\",\"job\":\"ci-kubernetes-e2e-gke-large-performance-regional\",\"buildid\":\"165\",\"prowjobid\":\"931c4290-3d8a-11e9-9c9a-0a580a6c0e78\"} nil} {JOB_TYPE periodic nil} {PROW_JOB_ID 931c4290-3d8a-11e9-9c9a-0a580a6c0e78 nil}] {map[] map[cpu:{{6 0} {} 6 DecimalSI} memory:{{17179869184 0} {} BinarySI}]} [{service true /etc/service-account } {ssh true /etc/ssh-key-secret }] [] nil nil nil nil false false false}] Never map[] 0xc02e0680b8 false false false nil [] nil [] [] nil [] } { [] [] [] }})", "client": "kube", "level": "debug" }, "resource": { "type": "container", "labels": { "zone": "us-central1-f", "pod_id": "plank-9f6cb7fbb-4jdf2", "project_id": "k8s-prow", "cluster_name": "prow", "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921" } }, "timestamp": "2019-03-04T13:08:23Z", "severity": "ERROR", "labels": { "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default", "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-04T13:08:28.783717431Z" }, { "insertId": "1vzsmovfk87wnw", "jsonPayload": { "component": "plank", "msg": "ReplaceProwJob(931c4290-3d8a-11e9-9c9a-0a580a6c0e78, {{ProwJob prow.k8s.io/v1} {931c4290-3d8a-11e9-9c9a-0a580a6c0e78 default /apis/prow.k8s.io/v1/namespaces/default/prowjobs/931c4290-3d8a-11e9-9c9a-0a580a6c0e78 933a4706-3d8a-11e9-898b-42010a80003a 189487735 1 2019-03-03 08:01:39 +0000 UTC 
map[created-by-prow:true preset-k8s-ssh:true preset-service-account:true prow.k8s.io/id:931c4290-3d8a-11e9-9c9a-0a580a6c0e78 prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional prow.k8s.io/type:periodic] map[prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional] [] nil [] } {periodic kubernetes default test-pods ci-kubernetes-e2e-gke-large-performance-regional [] false 0 false &PodSpec{Volumes:[{service {nil nil nil nil nil SecretVolumeSource{SecretName:service-account,Items:[],DefaultMode:nil,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}} {ssh {nil nil nil nil nil &SecretVolumeSource{SecretName:ssh-key-secret,Items:[],DefaultMode:*256,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}}],Containers:[{ gcr.io/k8s-testimages/kubekins-e2e:v20190301-76bc03340-master [] [--timeout=600 --repo=k8s.io/kubernetes=master --repo=k8s.io/perf-tests=master --root=/go/src --scenario=kubernetes_e2e -- --cluster=gke-regional-cluster --deployment=gke --extract=ci/latest-1.13 --gcp-cloud-sdk=gs://cloud-sdk-testing/ci/staging --gcp-node-image=gci --gcp-project=kubernetes-scale --gcp-region=us-east1 --gke-create-command=container clusters create --quiet --enable-ip-alias --create-subnetwork name=ip-alias-subnet-regional --cluster-ipv4-cidr=/12 --services-ipv4-cidr=/19 --gke-environment=staging --gke-node-locations=us-east1-b --gke-shape={\"default\":{\"Nodes\":1999,\"MachineType\":\"n1-standard-1\"},\"heapster-pool\":{\"Nodes\":1,\"MachineType\":\"n1-standard-8\"}} --provider=gke --test=false --test-cmd=$GOPATH/src/k8s.io/perf-tests/run-e2e.sh --test-cmd-args=cluster-loader2 --test-cmd-args=--nodes=2000 --test-cmd-args=--provider=gke --test-cmd-args=--report-dir=/workspace/_artifacts --test-cmd-args=--testconfig=testing/density/config.yaml --test-cmd-args=--testconfig=testing/load/config.yaml 
--test-cmd-args=--testoverrides=./testing/density/2000_nodes/override.yaml --test-cmd-name=ClusterLoaderV2 --timeout=570m --use-logexporter] [] [] [{GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {E2E_GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {USER prow nil} {JENKINS_GCE_SSH_PRIVATE_KEY_FILE /etc/ssh-key-secret/ssh-private nil} {JENKINS_GCE_SSH_PUBLIC_KEY_FILE /etc/ssh-key-secret/ssh-public nil}] {map[] map[cpu:{{6 0} {} 6 DecimalSI} memory:{{17179869184 0} {} BinarySI}]} [{service true /etc/service-account } {ssh true /etc/ssh-key-secret }] [] nil nil nil nil false false false}],RestartPolicy:,TerminationGracePeriodSeconds:nil,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[],Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[],AutomountServiceAccountToken:nil,Tolerations:[],HostAliases:[],PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[],RuntimeClassName:nil,} } {2019-03-03 08:01:39 +0000 UTC pending Job triggered. 
https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gke-large-performance-regional/164/ 931c4290-3d8a-11e9-9c9a-0a580a6c0e78 164 map[github-reporter:pending]}})", "client": "kube", "level": "debug" }, "resource": { "type": "container", "labels": { "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921", "pod_id": "plank-9f6cb7fbb-4jdf2", "zone": "us-central1-f", "project_id": "k8s-prow", "cluster_name": "prow" } }, "timestamp": "2019-03-04T05:43:23Z", "severity": "ERROR", "labels": { "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default", "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-04T05:43:28.556197331Z" }, { "insertId": "1vzsmovfk87wnv", "jsonPayload": { "component": "plank", "msg": "GetProwJob(931c4290-3d8a-11e9-9c9a-0a580a6c0e78)", "client": "kube", "level": "debug" }, "resource": { "type": "container", "labels": { "cluster_name": "prow", "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921", "pod_id": "plank-9f6cb7fbb-4jdf2", "zone": "us-central1-f", "project_id": "k8s-prow" } }, "timestamp": "2019-03-04T05:43:23Z", "severity": "ERROR", "labels": { "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g", "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-04T05:43:28.556197331Z" }, { "insertId": "1vzsmovfk87wnl", "jsonPayload": { "component": "plank", "msg": "ReplaceProwJob(931c4290-3d8a-11e9-9c9a-0a580a6c0e78, {{ProwJob prow.k8s.io/v1} {931c4290-3d8a-11e9-9c9a-0a580a6c0e78 default /apis/prow.k8s.io/v1/namespaces/default/prowjobs/931c4290-3d8a-11e9-9c9a-0a580a6c0e78 
933a4706-3d8a-11e9-898b-42010a80003a 189363471 1 2019-03-03 08:01:39 +0000 UTC map[created-by-prow:true preset-k8s-ssh:true preset-service-account:true prow.k8s.io/id:931c4290-3d8a-11e9-9c9a-0a580a6c0e78 prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional prow.k8s.io/type:periodic] map[prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional] [] nil [] } {periodic kubernetes default test-pods ci-kubernetes-e2e-gke-large-performance-regional [] false 0 false &PodSpec{Volumes:[{service {nil nil nil nil nil SecretVolumeSource{SecretName:service-account,Items:[],DefaultMode:nil,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}} {ssh {nil nil nil nil nil &SecretVolumeSource{SecretName:ssh-key-secret,Items:[],DefaultMode:*256,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}}],Containers:[{ gcr.io/k8s-testimages/kubekins-e2e:v20190301-76bc03340-master [] [--timeout=600 --repo=k8s.io/kubernetes=master --repo=k8s.io/perf-tests=master --root=/go/src --scenario=kubernetes_e2e -- --cluster=gke-regional-cluster --deployment=gke --extract=ci/latest-1.13 --gcp-cloud-sdk=gs://cloud-sdk-testing/ci/staging --gcp-node-image=gci --gcp-project=kubernetes-scale --gcp-region=us-east1 --gke-create-command=container clusters create --quiet --enable-ip-alias --create-subnetwork name=ip-alias-subnet-regional --cluster-ipv4-cidr=/12 --services-ipv4-cidr=/19 --gke-environment=staging --gke-node-locations=us-east1-b --gke-shape={\"default\":{\"Nodes\":1999,\"MachineType\":\"n1-standard-1\"},\"heapster-pool\":{\"Nodes\":1,\"MachineType\":\"n1-standard-8\"}} --provider=gke --test=false --test-cmd=$GOPATH/src/k8s.io/perf-tests/run-e2e.sh --test-cmd-args=cluster-loader2 --test-cmd-args=--nodes=2000 --test-cmd-args=--provider=gke --test-cmd-args=--report-dir=/workspace/_artifacts --test-cmd-args=--testconfig=testing/density/config.yaml 
--test-cmd-args=--testconfig=testing/load/config.yaml --test-cmd-args=--testoverrides=./testing/density/2000_nodes/override.yaml --test-cmd-name=ClusterLoaderV2 --timeout=570m --use-logexporter] [] [] [{GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {E2E_GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {USER prow nil} {JENKINS_GCE_SSH_PRIVATE_KEY_FILE /etc/ssh-key-secret/ssh-private nil} {JENKINS_GCE_SSH_PUBLIC_KEY_FILE /etc/ssh-key-secret/ssh-public nil}] {map[] map[cpu:{{6 0} {} 6 DecimalSI} memory:{{17179869184 0} {} BinarySI}]} [{service true /etc/service-account } {ssh true /etc/ssh-key-secret }] [] nil nil nil nil false false false}],RestartPolicy:,TerminationGracePeriodSeconds:nil,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[],Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[],AutomountServiceAccountToken:nil,Tolerations:[],HostAliases:[],PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[],RuntimeClassName:nil,} } {2019-03-03 08:01:39 +0000 UTC pending Job triggered. 
https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gke-large-performance-regional/164/ 931c4290-3d8a-11e9-9c9a-0a580a6c0e78 164 map[github-reporter:pending]}})", "client": "kube", "level": "debug" }, "resource": { "type": "container", "labels": { "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921", "zone": "us-central1-f", "pod_id": "plank-9f6cb7fbb-4jdf2", "project_id": "k8s-prow", "cluster_name": "prow" } }, "timestamp": "2019-03-04T05:43:23Z", "severity": "ERROR", "labels": { "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default", "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-04T05:43:28.556197331Z" }, { "insertId": "1vzsmovfk87wnk", "jsonPayload": { "msg": "Pod is missing, starting a new pod", "job": "ci-kubernetes-e2e-gke-large-performance-regional", "level": "info", "name": "931c4290-3d8a-11e9-9c9a-0a580a6c0e78", "type": "periodic", "component": "plank" }, "resource": { "type": "container", "labels": { "pod_id": "plank-9f6cb7fbb-4jdf2", "zone": "us-central1-f", "project_id": "k8s-prow", "cluster_name": "prow", "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921" } }, "timestamp": "2019-03-04T05:43:23Z", "severity": "ERROR", "labels": { "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default", "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-04T05:43:28.556197331Z" }, { "insertId": "1vzsmovfk87wnj", "jsonPayload": { "msg": "CreatePod({{ } {931c4290-3d8a-11e9-9c9a-0a580a6c0e78 0 0001-01-01 00:00:00 +0000 UTC map[preset-k8s-ssh:true preset-service-account:true 
prow.k8s.io/id:931c4290-3d8a-11e9-9c9a-0a580a6c0e78 created-by-prow:true prow.k8s.io/type:periodic prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional] map[prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional] [] nil [] } {[{service {nil nil nil nil nil &SecretVolumeSource{SecretName:service-account,Items:[],DefaultMode:nil,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}} {ssh {nil nil nil nil nil &SecretVolumeSource{SecretName:ssh-key-secret,Items:[],DefaultMode:*256,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}}] [] [{test gcr.io/k8s-testimages/kubekins-e2e:v20190301-76bc03340-master [] [--timeout=600 --repo=k8s.io/kubernetes=master --repo=k8s.io/perf-tests=master --root=/go/src --scenario=kubernetes_e2e -- --cluster=gke-regional-cluster --deployment=gke --extract=ci/latest-1.13 --gcp-cloud-sdk=gs://cloud-sdk-testing/ci/staging --gcp-node-image=gci --gcp-project=kubernetes-scale --gcp-region=us-east1 --gke-create-command=container clusters create --quiet --enable-ip-alias --create-subnetwork name=ip-alias-subnet-regional --cluster-ipv4-cidr=/12 --services-ipv4-cidr=/19 --gke-environment=staging --gke-node-locations=us-east1-b --gke-shape={\"default\":{\"Nodes\":1999,\"MachineType\":\"n1-standard-1\"},\"heapster-pool\":{\"Nodes\":1,\"MachineType\":\"n1-standard-8\"}} --provider=gke --test=false --test-cmd=$GOPATH/src/k8s.io/perf-tests/run-e2e.sh --test-cmd-args=cluster-loader2 --test-cmd-args=--nodes=2000 --test-cmd-args=--provider=gke --test-cmd-args=--report-dir=/workspace/_artifacts --test-cmd-args=--testconfig=testing/density/config.yaml --test-cmd-args=--testconfig=testing/load/config.yaml --test-cmd-args=--testoverrides=./testing/density/2000_nodes/override.yaml --test-cmd-name=ClusterLoaderV2 --timeout=570m --use-logexporter] [] [] [{GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} 
{E2E_GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {USER prow nil} {JENKINS_GCE_SSH_PRIVATE_KEY_FILE /etc/ssh-key-secret/ssh-private nil} {JENKINS_GCE_SSH_PUBLIC_KEY_FILE /etc/ssh-key-secret/ssh-public nil} {BUILD_ID 164 nil} {BUILD_NUMBER 164 nil} {JOB_NAME ci-kubernetes-e2e-gke-large-performance-regional nil} {JOB_SPEC {\"type\":\"periodic\",\"job\":\"ci-kubernetes-e2e-gke-large-performance-regional\",\"buildid\":\"164\",\"prowjobid\":\"931c4290-3d8a-11e9-9c9a-0a580a6c0e78\"} nil} {JOB_TYPE periodic nil} {PROW_JOB_ID 931c4290-3d8a-11e9-9c9a-0a580a6c0e78 nil}] {map[] map[cpu:{{6 0} {} 6 DecimalSI} memory:{{17179869184 0} {} BinarySI}]} [{service true /etc/service-account } {ssh true /etc/ssh-key-secret }] [] nil nil nil nil false false false}] Never map[] 0xc010a48048 false false false nil [] nil [] [] nil [] } { [] [] [] }})", "client": "kube", "level": "debug", "component": "plank" }, "resource": { "type": "container", "labels": { "pod_id": "plank-9f6cb7fbb-4jdf2", "zone": "us-central1-f", "project_id": "k8s-prow", "cluster_name": "prow", "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921" } }, "timestamp": "2019-03-04T05:43:23Z", "severity": "ERROR", "labels": { "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default", "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-04T05:43:28.556197331Z" }, { "insertId": "10cniqefl3qiev", "jsonPayload": { "client": "kube", "level": "debug", "component": "plank", "msg": "ReplaceProwJob(931c4290-3d8a-11e9-9c9a-0a580a6c0e78, {{ProwJob prow.k8s.io/v1} {931c4290-3d8a-11e9-9c9a-0a580a6c0e78 default /apis/prow.k8s.io/v1/namespaces/default/prowjobs/931c4290-3d8a-11e9-9c9a-0a580a6c0e78 933a4706-3d8a-11e9-898b-42010a80003a 189363471 1 2019-03-03 08:01:39 +0000 UTC 
map[prow.k8s.io/id:931c4290-3d8a-11e9-9c9a-0a580a6c0e78 prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional prow.k8s.io/type:periodic created-by-prow:true preset-k8s-ssh:true preset-service-account:true] map[prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional] [] nil [] } {periodic kubernetes default test-pods ci-kubernetes-e2e-gke-large-performance-regional [] false 0 false &PodSpec{Volumes:[{service {nil nil nil nil nil SecretVolumeSource{SecretName:service-account,Items:[],DefaultMode:nil,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}} {ssh {nil nil nil nil nil &SecretVolumeSource{SecretName:ssh-key-secret,Items:[],DefaultMode:*256,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}}],Containers:[{ gcr.io/k8s-testimages/kubekins-e2e:v20190301-76bc03340-master [] [--timeout=600 --repo=k8s.io/kubernetes=master --repo=k8s.io/perf-tests=master --root=/go/src --scenario=kubernetes_e2e -- --cluster=gke-regional-cluster --deployment=gke --extract=ci/latest-1.13 --gcp-cloud-sdk=gs://cloud-sdk-testing/ci/staging --gcp-node-image=gci --gcp-project=kubernetes-scale --gcp-region=us-east1 --gke-create-command=container clusters create --quiet --enable-ip-alias --create-subnetwork name=ip-alias-subnet-regional --cluster-ipv4-cidr=/12 --services-ipv4-cidr=/19 --gke-environment=staging --gke-node-locations=us-east1-b --gke-shape={\"default\":{\"Nodes\":1999,\"MachineType\":\"n1-standard-1\"},\"heapster-pool\":{\"Nodes\":1,\"MachineType\":\"n1-standard-8\"}} --provider=gke --test=false --test-cmd=$GOPATH/src/k8s.io/perf-tests/run-e2e.sh --test-cmd-args=cluster-loader2 --test-cmd-args=--nodes=2000 --test-cmd-args=--provider=gke --test-cmd-args=--report-dir=/workspace/_artifacts --test-cmd-args=--testconfig=testing/density/config.yaml --test-cmd-args=--testconfig=testing/load/config.yaml 
--test-cmd-args=--testoverrides=./testing/density/2000_nodes/override.yaml --test-cmd-name=ClusterLoaderV2 --timeout=570m --use-logexporter] [] [] [{GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {E2E_GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {USER prow nil} {JENKINS_GCE_SSH_PRIVATE_KEY_FILE /etc/ssh-key-secret/ssh-private nil} {JENKINS_GCE_SSH_PUBLIC_KEY_FILE /etc/ssh-key-secret/ssh-public nil}] {map[] map[cpu:{{6 0} {} 6 DecimalSI} memory:{{17179869184 0} {} BinarySI}]} [{service true /etc/service-account } {ssh true /etc/ssh-key-secret }] [] nil nil nil nil false false false}],RestartPolicy:,TerminationGracePeriodSeconds:nil,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[],Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[],AutomountServiceAccountToken:nil,Tolerations:[],HostAliases:[],PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[],RuntimeClassName:nil,} } {2019-03-03 08:01:39 +0000 UTC pending Job triggered. 
https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gke-large-performance-regional/163/ 931c4290-3d8a-11e9-9c9a-0a580a6c0e78 163 map[github-reporter:pending]}})" }, "resource": { "type": "container", "labels": { "pod_id": "plank-9f6cb7fbb-4jdf2", "zone": "us-central1-f", "project_id": "k8s-prow", "cluster_name": "prow", "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921" } }, "timestamp": "2019-03-03T22:26:54Z", "severity": "ERROR", "labels": { "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default", "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-03T22:27:01.550080143Z" }, { "insertId": "10cniqefl3qieu", "jsonPayload": { "client": "kube", "level": "debug", "component": "plank", "msg": "GetProwJob(931c4290-3d8a-11e9-9c9a-0a580a6c0e78)" }, "resource": { "type": "container", "labels": { "namespace_id": "default", "instance_id": "7000980459144515921", "zone": "us-central1-f", "pod_id": "plank-9f6cb7fbb-4jdf2", "project_id": "k8s-prow", "cluster_name": "prow", "container_name": "plank" } }, "timestamp": "2019-03-03T22:26:54Z", "severity": "ERROR", "labels": { "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g", "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-03T22:27:01.550080143Z" }, { "insertId": "10cniqefl3qiee", "jsonPayload": { "client": "kube", "level": "debug", "component": "plank", "msg": "ReplaceProwJob(931c4290-3d8a-11e9-9c9a-0a580a6c0e78, {{ProwJob prow.k8s.io/v1} {931c4290-3d8a-11e9-9c9a-0a580a6c0e78 default /apis/prow.k8s.io/v1/namespaces/default/prowjobs/931c4290-3d8a-11e9-9c9a-0a580a6c0e78 
933a4706-3d8a-11e9-898b-42010a80003a 189240028 1 2019-03-03 08:01:39 +0000 UTC map[prow.k8s.io/type:periodic created-by-prow:true preset-k8s-ssh:true preset-service-account:true prow.k8s.io/id:931c4290-3d8a-11e9-9c9a-0a580a6c0e78 prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional] map[prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional] [] nil [] } {periodic kubernetes default test-pods ci-kubernetes-e2e-gke-large-performance-regional [] false 0 false &PodSpec{Volumes:[{service {nil nil nil nil nil SecretVolumeSource{SecretName:service-account,Items:[],DefaultMode:nil,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}} {ssh {nil nil nil nil nil &SecretVolumeSource{SecretName:ssh-key-secret,Items:[],DefaultMode:*256,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}}],Containers:[{ gcr.io/k8s-testimages/kubekins-e2e:v20190301-76bc03340-master [] [--timeout=600 --repo=k8s.io/kubernetes=master --repo=k8s.io/perf-tests=master --root=/go/src --scenario=kubernetes_e2e -- --cluster=gke-regional-cluster --deployment=gke --extract=ci/latest-1.13 --gcp-cloud-sdk=gs://cloud-sdk-testing/ci/staging --gcp-node-image=gci --gcp-project=kubernetes-scale --gcp-region=us-east1 --gke-create-command=container clusters create --quiet --enable-ip-alias --create-subnetwork name=ip-alias-subnet-regional --cluster-ipv4-cidr=/12 --services-ipv4-cidr=/19 --gke-environment=staging --gke-node-locations=us-east1-b --gke-shape={\"default\":{\"Nodes\":1999,\"MachineType\":\"n1-standard-1\"},\"heapster-pool\":{\"Nodes\":1,\"MachineType\":\"n1-standard-8\"}} --provider=gke --test=false --test-cmd=$GOPATH/src/k8s.io/perf-tests/run-e2e.sh --test-cmd-args=cluster-loader2 --test-cmd-args=--nodes=2000 --test-cmd-args=--provider=gke --test-cmd-args=--report-dir=/workspace/_artifacts --test-cmd-args=--testconfig=testing/density/config.yaml 
--test-cmd-args=--testconfig=testing/load/config.yaml --test-cmd-args=--testoverrides=./testing/density/2000_nodes/override.yaml --test-cmd-name=ClusterLoaderV2 --timeout=570m --use-logexporter] [] [] [{GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {E2E_GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {USER prow nil} {JENKINS_GCE_SSH_PRIVATE_KEY_FILE /etc/ssh-key-secret/ssh-private nil} {JENKINS_GCE_SSH_PUBLIC_KEY_FILE /etc/ssh-key-secret/ssh-public nil}] {map[] map[memory:{{17179869184 0} {} BinarySI} cpu:{{6 0} {} 6 DecimalSI}]} [{service true /etc/service-account } {ssh true /etc/ssh-key-secret }] [] nil nil nil nil false false false}],RestartPolicy:,TerminationGracePeriodSeconds:nil,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[],Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[],AutomountServiceAccountToken:nil,Tolerations:[],HostAliases:[],PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[],RuntimeClassName:nil,} } {2019-03-03 08:01:39 +0000 UTC pending Job triggered. 
https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gke-large-performance-regional/163/ 931c4290-3d8a-11e9-9c9a-0a580a6c0e78 163 map[github-reporter:pending]}})" }, "resource": { "type": "container", "labels": { "pod_id": "plank-9f6cb7fbb-4jdf2", "zone": "us-central1-f", "project_id": "k8s-prow", "cluster_name": "prow", "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921" } }, "timestamp": "2019-03-03T22:26:54Z", "severity": "ERROR", "labels": { "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default", "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-03T22:27:01.550080143Z" }, { "insertId": "10cniqefl3qied", "jsonPayload": { "msg": "Pod is missing, starting a new pod", "job": "ci-kubernetes-e2e-gke-large-performance-regional", "level": "info", "name": "931c4290-3d8a-11e9-9c9a-0a580a6c0e78", "type": "periodic", "component": "plank" }, "resource": { "type": "container", "labels": { "zone": "us-central1-f", "pod_id": "plank-9f6cb7fbb-4jdf2", "project_id": "k8s-prow", "cluster_name": "prow", "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921" } }, "timestamp": "2019-03-03T22:26:54Z", "severity": "ERROR", "labels": { "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g", "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-03T22:27:01.550080143Z" }, { "insertId": "10cniqefl3qiec", "jsonPayload": { "msg": "CreatePod({{ } {931c4290-3d8a-11e9-9c9a-0a580a6c0e78 0 0001-01-01 00:00:00 +0000 UTC map[created-by-prow:true prow.k8s.io/type:periodic 
prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional preset-k8s-ssh:true preset-service-account:true prow.k8s.io/id:931c4290-3d8a-11e9-9c9a-0a580a6c0e78] map[prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional] [] nil [] } {[{service {nil nil nil nil nil &SecretVolumeSource{SecretName:service-account,Items:[],DefaultMode:nil,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}} {ssh {nil nil nil nil nil &SecretVolumeSource{SecretName:ssh-key-secret,Items:[],DefaultMode:*256,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}}] [] [{test gcr.io/k8s-testimages/kubekins-e2e:v20190301-76bc03340-master [] [--timeout=600 --repo=k8s.io/kubernetes=master --repo=k8s.io/perf-tests=master --root=/go/src --scenario=kubernetes_e2e -- --cluster=gke-regional-cluster --deployment=gke --extract=ci/latest-1.13 --gcp-cloud-sdk=gs://cloud-sdk-testing/ci/staging --gcp-node-image=gci --gcp-project=kubernetes-scale --gcp-region=us-east1 --gke-create-command=container clusters create --quiet --enable-ip-alias --create-subnetwork name=ip-alias-subnet-regional --cluster-ipv4-cidr=/12 --services-ipv4-cidr=/19 --gke-environment=staging --gke-node-locations=us-east1-b --gke-shape={\"default\":{\"Nodes\":1999,\"MachineType\":\"n1-standard-1\"},\"heapster-pool\":{\"Nodes\":1,\"MachineType\":\"n1-standard-8\"}} --provider=gke --test=false --test-cmd=$GOPATH/src/k8s.io/perf-tests/run-e2e.sh --test-cmd-args=cluster-loader2 --test-cmd-args=--nodes=2000 --test-cmd-args=--provider=gke --test-cmd-args=--report-dir=/workspace/_artifacts --test-cmd-args=--testconfig=testing/density/config.yaml --test-cmd-args=--testconfig=testing/load/config.yaml --test-cmd-args=--testoverrides=./testing/density/2000_nodes/override.yaml --test-cmd-name=ClusterLoaderV2 --timeout=570m --use-logexporter] [] [] [{GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} 
{E2E_GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {USER prow nil} {JENKINS_GCE_SSH_PRIVATE_KEY_FILE /etc/ssh-key-secret/ssh-private nil} {JENKINS_GCE_SSH_PUBLIC_KEY_FILE /etc/ssh-key-secret/ssh-public nil} {BUILD_ID 163 nil} {BUILD_NUMBER 163 nil} {JOB_NAME ci-kubernetes-e2e-gke-large-performance-regional nil} {JOB_SPEC {\"type\":\"periodic\",\"job\":\"ci-kubernetes-e2e-gke-large-performance-regional\",\"buildid\":\"163\",\"prowjobid\":\"931c4290-3d8a-11e9-9c9a-0a580a6c0e78\"} nil} {JOB_TYPE periodic nil} {PROW_JOB_ID 931c4290-3d8a-11e9-9c9a-0a580a6c0e78 nil}] {map[] map[cpu:{{6 0} {} 6 DecimalSI} memory:{{17179869184 0} {} BinarySI}]} [{service true /etc/service-account } {ssh true /etc/ssh-key-secret }] [] nil nil nil nil false false false}] Never map[] 0xc012316018 false false false nil [] nil [] [] nil [] } { [] [] [] }})", "client": "kube", "level": "debug", "component": "plank" }, "resource": { "type": "container", "labels": { "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921", "zone": "us-central1-f", "pod_id": "plank-9f6cb7fbb-4jdf2", "project_id": "k8s-prow", "cluster_name": "prow" } }, "timestamp": "2019-03-03T22:26:54Z", "severity": "ERROR", "labels": { "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g", "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-03T22:27:01.550080143Z" }, { "insertId": "1o7ph7cfh465cc", "jsonPayload": { "msg": "ReplaceProwJob(931c4290-3d8a-11e9-9c9a-0a580a6c0e78, {{ProwJob prow.k8s.io/v1} {931c4290-3d8a-11e9-9c9a-0a580a6c0e78 default /apis/prow.k8s.io/v1/namespaces/default/prowjobs/931c4290-3d8a-11e9-9c9a-0a580a6c0e78 933a4706-3d8a-11e9-898b-42010a80003a 189240028 1 2019-03-03 08:01:39 +0000 UTC map[prow.k8s.io/type:periodic created-by-prow:true 
preset-k8s-ssh:true preset-service-account:true prow.k8s.io/id:931c4290-3d8a-11e9-9c9a-0a580a6c0e78 prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional] map[prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional] [] nil [] } {periodic kubernetes default test-pods ci-kubernetes-e2e-gke-large-performance-regional [] false 0 false &PodSpec{Volumes:[{service {nil nil nil nil nil SecretVolumeSource{SecretName:service-account,Items:[],DefaultMode:nil,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}} {ssh {nil nil nil nil nil &SecretVolumeSource{SecretName:ssh-key-secret,Items:[],DefaultMode:*256,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}}],Containers:[{ gcr.io/k8s-testimages/kubekins-e2e:v20190301-76bc03340-master [] [--timeout=600 --repo=k8s.io/kubernetes=master --repo=k8s.io/perf-tests=master --root=/go/src --scenario=kubernetes_e2e -- --cluster=gke-regional-cluster --deployment=gke --extract=ci/latest-1.13 --gcp-cloud-sdk=gs://cloud-sdk-testing/ci/staging --gcp-node-image=gci --gcp-project=kubernetes-scale --gcp-region=us-east1 --gke-create-command=container clusters create --quiet --enable-ip-alias --create-subnetwork name=ip-alias-subnet-regional --cluster-ipv4-cidr=/12 --services-ipv4-cidr=/19 --gke-environment=staging --gke-node-locations=us-east1-b --gke-shape={\"default\":{\"Nodes\":1999,\"MachineType\":\"n1-standard-1\"},\"heapster-pool\":{\"Nodes\":1,\"MachineType\":\"n1-standard-8\"}} --provider=gke --test=false --test-cmd=$GOPATH/src/k8s.io/perf-tests/run-e2e.sh --test-cmd-args=cluster-loader2 --test-cmd-args=--nodes=2000 --test-cmd-args=--provider=gke --test-cmd-args=--report-dir=/workspace/_artifacts --test-cmd-args=--testconfig=testing/density/config.yaml --test-cmd-args=--testconfig=testing/load/config.yaml --test-cmd-args=--testoverrides=./testing/density/2000_nodes/override.yaml --test-cmd-name=ClusterLoaderV2 
--timeout=570m --use-logexporter] [] [] [{GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {E2E_GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {USER prow nil} {JENKINS_GCE_SSH_PRIVATE_KEY_FILE /etc/ssh-key-secret/ssh-private nil} {JENKINS_GCE_SSH_PUBLIC_KEY_FILE /etc/ssh-key-secret/ssh-public nil}] {map[] map[cpu:{{6 0} {} 6 DecimalSI} memory:{{17179869184 0} {} BinarySI}]} [{service true /etc/service-account } {ssh true /etc/ssh-key-secret }] [] nil nil nil nil false false false}],RestartPolicy:,TerminationGracePeriodSeconds:nil,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[],Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[],AutomountServiceAccountToken:nil,Tolerations:[],HostAliases:[],PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[],RuntimeClassName:nil,} } {2019-03-03 08:01:39 +0000 UTC pending Job triggered. 
https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gke-large-performance-regional/162/ 931c4290-3d8a-11e9-9c9a-0a580a6c0e78 162 map[github-reporter:pending]}})", "client": "kube", "level": "debug", "component": "plank" }, "resource": { "type": "container", "labels": { "zone": "us-central1-f", "pod_id": "plank-9f6cb7fbb-4jdf2", "project_id": "k8s-prow", "cluster_name": "prow", "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921" } }, "timestamp": "2019-03-03T15:15:55Z", "severity": "ERROR", "labels": { "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default", "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-03T15:15:59.459168535Z" }, { "insertId": "1o7ph7cfh465cb", "jsonPayload": { "client": "kube", "level": "debug", "component": "plank", "msg": "GetProwJob(931c4290-3d8a-11e9-9c9a-0a580a6c0e78)" }, "resource": { "type": "container", "labels": { "pod_id": "plank-9f6cb7fbb-4jdf2", "zone": "us-central1-f", "project_id": "k8s-prow", "cluster_name": "prow", "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921" } }, "timestamp": "2019-03-03T15:15:55Z", "severity": "ERROR", "labels": { "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g", "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-03T15:15:59.459168535Z" }, { "insertId": "1o7ph7cfh4659z", "jsonPayload": { "client": "kube", "level": "debug", "component": "plank", "msg": "ReplaceProwJob(931c4290-3d8a-11e9-9c9a-0a580a6c0e78, {{ProwJob prow.k8s.io/v1} {931c4290-3d8a-11e9-9c9a-0a580a6c0e78 default 
/apis/prow.k8s.io/v1/namespaces/default/prowjobs/931c4290-3d8a-11e9-9c9a-0a580a6c0e78 933a4706-3d8a-11e9-898b-42010a80003a 189117035 1 2019-03-03 08:01:39 +0000 UTC map[created-by-prow:true preset-k8s-ssh:true preset-service-account:true prow.k8s.io/id:931c4290-3d8a-11e9-9c9a-0a580a6c0e78 prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional prow.k8s.io/type:periodic] map[prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional] [] nil [] } {periodic kubernetes default test-pods ci-kubernetes-e2e-gke-large-performance-regional [] false 0 false &PodSpec{Volumes:[{service {nil nil nil nil nil SecretVolumeSource{SecretName:service-account,Items:[],DefaultMode:nil,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}} {ssh {nil nil nil nil nil &SecretVolumeSource{SecretName:ssh-key-secret,Items:[],DefaultMode:*256,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}}],Containers:[{ gcr.io/k8s-testimages/kubekins-e2e:v20190301-76bc03340-master [] [--timeout=600 --repo=k8s.io/kubernetes=master --repo=k8s.io/perf-tests=master --root=/go/src --scenario=kubernetes_e2e -- --cluster=gke-regional-cluster --deployment=gke --extract=ci/latest-1.13 --gcp-cloud-sdk=gs://cloud-sdk-testing/ci/staging --gcp-node-image=gci --gcp-project=kubernetes-scale --gcp-region=us-east1 --gke-create-command=container clusters create --quiet --enable-ip-alias --create-subnetwork name=ip-alias-subnet-regional --cluster-ipv4-cidr=/12 --services-ipv4-cidr=/19 --gke-environment=staging --gke-node-locations=us-east1-b --gke-shape={\"default\":{\"Nodes\":1999,\"MachineType\":\"n1-standard-1\"},\"heapster-pool\":{\"Nodes\":1,\"MachineType\":\"n1-standard-8\"}} --provider=gke --test=false --test-cmd=$GOPATH/src/k8s.io/perf-tests/run-e2e.sh --test-cmd-args=cluster-loader2 --test-cmd-args=--nodes=2000 --test-cmd-args=--provider=gke --test-cmd-args=--report-dir=/workspace/_artifacts 
--test-cmd-args=--testconfig=testing/density/config.yaml --test-cmd-args=--testconfig=testing/load/config.yaml --test-cmd-args=--testoverrides=./testing/density/2000_nodes/override.yaml --test-cmd-name=ClusterLoaderV2 --timeout=570m --use-logexporter] [] [] [{GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {E2E_GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {USER prow nil} {JENKINS_GCE_SSH_PRIVATE_KEY_FILE /etc/ssh-key-secret/ssh-private nil} {JENKINS_GCE_SSH_PUBLIC_KEY_FILE /etc/ssh-key-secret/ssh-public nil}] {map[] map[cpu:{{6 0} {} 6 DecimalSI} memory:{{17179869184 0} {} BinarySI}]} [{service true /etc/service-account } {ssh true /etc/ssh-key-secret }] [] nil nil nil nil false false false}],RestartPolicy:,TerminationGracePeriodSeconds:nil,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[],Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[],AutomountServiceAccountToken:nil,Tolerations:[],HostAliases:[],PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[],RuntimeClassName:nil,} } {2019-03-03 08:01:39 +0000 UTC pending Job triggered. 
https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gke-large-performance-regional/162/ 931c4290-3d8a-11e9-9c9a-0a580a6c0e78 162 map[github-reporter:pending]}})" }, "resource": { "type": "container", "labels": { "zone": "us-central1-f", "pod_id": "plank-9f6cb7fbb-4jdf2", "project_id": "k8s-prow", "cluster_name": "prow", "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921" } }, "timestamp": "2019-03-03T15:15:55Z", "severity": "ERROR", "labels": { "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g", "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-03T15:15:59.459168535Z" }, { "insertId": "1o7ph7cfh4659y", "jsonPayload": { "job": "ci-kubernetes-e2e-gke-large-performance-regional", "msg": "Pod is missing, starting a new pod", "level": "info", "name": "931c4290-3d8a-11e9-9c9a-0a580a6c0e78", "component": "plank", "type": "periodic" }, "resource": { "type": "container", "labels": { "zone": "us-central1-f", "pod_id": "plank-9f6cb7fbb-4jdf2", "project_id": "k8s-prow", "cluster_name": "prow", "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921" } }, "timestamp": "2019-03-03T15:15:55Z", "severity": "ERROR", "labels": { "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g", "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-03T15:15:59.459168535Z" }, { "insertId": "1o7ph7cfh4659x", "jsonPayload": { "msg": "CreatePod({{ } {931c4290-3d8a-11e9-9c9a-0a580a6c0e78 0 0001-01-01 00:00:00 +0000 UTC map[preset-service-account:true prow.k8s.io/id:931c4290-3d8a-11e9-9c9a-0a580a6c0e78 preset-k8s-ssh:true 
created-by-prow:true prow.k8s.io/type:periodic prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional] map[prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional] [] nil [] } {[{service {nil nil nil nil nil &SecretVolumeSource{SecretName:service-account,Items:[],DefaultMode:nil,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}} {ssh {nil nil nil nil nil &SecretVolumeSource{SecretName:ssh-key-secret,Items:[],DefaultMode:*256,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}}] [] [{test gcr.io/k8s-testimages/kubekins-e2e:v20190301-76bc03340-master [] [--timeout=600 --repo=k8s.io/kubernetes=master --repo=k8s.io/perf-tests=master --root=/go/src --scenario=kubernetes_e2e -- --cluster=gke-regional-cluster --deployment=gke --extract=ci/latest-1.13 --gcp-cloud-sdk=gs://cloud-sdk-testing/ci/staging --gcp-node-image=gci --gcp-project=kubernetes-scale --gcp-region=us-east1 --gke-create-command=container clusters create --quiet --enable-ip-alias --create-subnetwork name=ip-alias-subnet-regional --cluster-ipv4-cidr=/12 --services-ipv4-cidr=/19 --gke-environment=staging --gke-node-locations=us-east1-b --gke-shape={\"default\":{\"Nodes\":1999,\"MachineType\":\"n1-standard-1\"},\"heapster-pool\":{\"Nodes\":1,\"MachineType\":\"n1-standard-8\"}} --provider=gke --test=false --test-cmd=$GOPATH/src/k8s.io/perf-tests/run-e2e.sh --test-cmd-args=cluster-loader2 --test-cmd-args=--nodes=2000 --test-cmd-args=--provider=gke --test-cmd-args=--report-dir=/workspace/_artifacts --test-cmd-args=--testconfig=testing/density/config.yaml --test-cmd-args=--testconfig=testing/load/config.yaml --test-cmd-args=--testoverrides=./testing/density/2000_nodes/override.yaml --test-cmd-name=ClusterLoaderV2 --timeout=570m --use-logexporter] [] [] [{GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {E2E_GOOGLE_APPLICATION_CREDENTIALS 
/etc/service-account/service-account.json nil} {USER prow nil} {JENKINS_GCE_SSH_PRIVATE_KEY_FILE /etc/ssh-key-secret/ssh-private nil} {JENKINS_GCE_SSH_PUBLIC_KEY_FILE /etc/ssh-key-secret/ssh-public nil} {BUILD_ID 162 nil} {BUILD_NUMBER 162 nil} {JOB_NAME ci-kubernetes-e2e-gke-large-performance-regional nil} {JOB_SPEC {\"type\":\"periodic\",\"job\":\"ci-kubernetes-e2e-gke-large-performance-regional\",\"buildid\":\"162\",\"prowjobid\":\"931c4290-3d8a-11e9-9c9a-0a580a6c0e78\"} nil} {JOB_TYPE periodic nil} {PROW_JOB_ID 931c4290-3d8a-11e9-9c9a-0a580a6c0e78 nil}] {map[] map[cpu:{{6 0} {} 6 DecimalSI} memory:{{17179869184 0} {} BinarySI}]} [{service true /etc/service-account } {ssh true /etc/ssh-key-secret }] [] nil nil nil nil false false false}] Never map[] 0xc0121e6294 false false false nil [] nil [] [] nil [] } { [] [] [] }})", "client": "kube", "level": "debug", "component": "plank" }, "resource": { "type": "container", "labels": { "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921", "pod_id": "plank-9f6cb7fbb-4jdf2", "zone": "us-central1-f", "project_id": "k8s-prow", "cluster_name": "prow" } }, "timestamp": "2019-03-03T15:15:55Z", "severity": "ERROR", "labels": { "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default", "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-03T15:15:59.459168535Z" }, { "insertId": "1vpritkfhzolmx", "jsonPayload": { "client": "kube", "level": "debug", "component": "plank", "msg": "ReplaceProwJob(931c4290-3d8a-11e9-9c9a-0a580a6c0e78, {{ProwJob prow.k8s.io/v1} {931c4290-3d8a-11e9-9c9a-0a580a6c0e78 default /apis/prow.k8s.io/v1/namespaces/default/prowjobs/931c4290-3d8a-11e9-9c9a-0a580a6c0e78 933a4706-3d8a-11e9-898b-42010a80003a 189117014 1 2019-03-03 08:01:39 +0000 UTC map[created-by-prow:true preset-k8s-ssh:true 
preset-service-account:true prow.k8s.io/id:931c4290-3d8a-11e9-9c9a-0a580a6c0e78 prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional prow.k8s.io/type:periodic] map[prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional] [] nil [] } {periodic kubernetes default test-pods ci-kubernetes-e2e-gke-large-performance-regional [] false 0 false &PodSpec{Volumes:[{service {nil nil nil nil nil SecretVolumeSource{SecretName:service-account,Items:[],DefaultMode:nil,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}} {ssh {nil nil nil nil nil &SecretVolumeSource{SecretName:ssh-key-secret,Items:[],DefaultMode:*256,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}}],Containers:[{ gcr.io/k8s-testimages/kubekins-e2e:v20190301-76bc03340-master [] [--timeout=600 --repo=k8s.io/kubernetes=master --repo=k8s.io/perf-tests=master --root=/go/src --scenario=kubernetes_e2e -- --cluster=gke-regional-cluster --deployment=gke --extract=ci/latest-1.13 --gcp-cloud-sdk=gs://cloud-sdk-testing/ci/staging --gcp-node-image=gci --gcp-project=kubernetes-scale --gcp-region=us-east1 --gke-create-command=container clusters create --quiet --enable-ip-alias --create-subnetwork name=ip-alias-subnet-regional --cluster-ipv4-cidr=/12 --services-ipv4-cidr=/19 --gke-environment=staging --gke-node-locations=us-east1-b --gke-shape={\"default\":{\"Nodes\":1999,\"MachineType\":\"n1-standard-1\"},\"heapster-pool\":{\"Nodes\":1,\"MachineType\":\"n1-standard-8\"}} --provider=gke --test=false --test-cmd=$GOPATH/src/k8s.io/perf-tests/run-e2e.sh --test-cmd-args=cluster-loader2 --test-cmd-args=--nodes=2000 --test-cmd-args=--provider=gke --test-cmd-args=--report-dir=/workspace/_artifacts --test-cmd-args=--testconfig=testing/density/config.yaml --test-cmd-args=--testconfig=testing/load/config.yaml --test-cmd-args=--testoverrides=./testing/density/2000_nodes/override.yaml --test-cmd-name=ClusterLoaderV2 
--timeout=570m --use-logexporter] [] [] [{GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {E2E_GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {USER prow nil} {JENKINS_GCE_SSH_PRIVATE_KEY_FILE /etc/ssh-key-secret/ssh-private nil} {JENKINS_GCE_SSH_PUBLIC_KEY_FILE /etc/ssh-key-secret/ssh-public nil}] {map[] map[cpu:{{6 0} {} 6 DecimalSI} memory:{{17179869184 0} {} BinarySI}]} [{service true /etc/service-account } {ssh true /etc/ssh-key-secret }] [] nil nil nil nil false false false}],RestartPolicy:,TerminationGracePeriodSeconds:nil,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[],Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[],AutomountServiceAccountToken:nil,Tolerations:[],HostAliases:[],PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[],RuntimeClassName:nil,} } {2019-03-03 08:01:39 +0000 UTC pending Job triggered. 
https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gke-large-performance-regional/161/ 931c4290-3d8a-11e9-9c9a-0a580a6c0e78 161 map[github-reporter:pending]}})" }, "resource": { "type": "container", "labels": { "pod_id": "plank-9f6cb7fbb-4jdf2", "zone": "us-central1-f", "project_id": "k8s-prow", "cluster_name": "prow", "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921" } }, "timestamp": "2019-03-03T08:01:56Z", "severity": "ERROR", "labels": { "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default", "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-03T08:02:01.492391989Z" }, { "insertId": "1vpritkfhzolmw", "jsonPayload": { "component": "plank", "msg": "GetProwJob(931c4290-3d8a-11e9-9c9a-0a580a6c0e78)", "client": "kube", "level": "debug" }, "resource": { "type": "container", "labels": { "pod_id": "plank-9f6cb7fbb-4jdf2", "zone": "us-central1-f", "project_id": "k8s-prow", "cluster_name": "prow", "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921" } }, "timestamp": "2019-03-03T08:01:56Z", "severity": "ERROR", "labels": { "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g", "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-03T08:02:01.492391989Z" }, { "insertId": "1vpritkfhzollp", "jsonPayload": { "component": "plank", "msg": "ReplaceProwJob(931c4290-3d8a-11e9-9c9a-0a580a6c0e78, {{ProwJob prow.k8s.io/v1} {931c4290-3d8a-11e9-9c9a-0a580a6c0e78 default /apis/prow.k8s.io/v1/namespaces/default/prowjobs/931c4290-3d8a-11e9-9c9a-0a580a6c0e78 933a4706-3d8a-11e9-898b-42010a80003a 189116931 1 
2019-03-03 08:01:39 +0000 UTC map[prow.k8s.io/id:931c4290-3d8a-11e9-9c9a-0a580a6c0e78 created-by-prow:true preset-k8s-ssh:true preset-service-account:true prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional prow.k8s.io/type:periodic] map[prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional] [] nil [] } {periodic kubernetes default test-pods ci-kubernetes-e2e-gke-large-performance-regional [] false 0 false &PodSpec{Volumes:[{service {nil nil nil nil nil SecretVolumeSource{SecretName:service-account,Items:[],DefaultMode:nil,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}} {ssh {nil nil nil nil nil &SecretVolumeSource{SecretName:ssh-key-secret,Items:[],DefaultMode:*256,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}}],Containers:[{ gcr.io/k8s-testimages/kubekins-e2e:v20190301-76bc03340-master [] [--timeout=600 --repo=k8s.io/kubernetes=master --repo=k8s.io/perf-tests=master --root=/go/src --scenario=kubernetes_e2e -- --cluster=gke-regional-cluster --deployment=gke --extract=ci/latest-1.13 --gcp-cloud-sdk=gs://cloud-sdk-testing/ci/staging --gcp-node-image=gci --gcp-project=kubernetes-scale --gcp-region=us-east1 --gke-create-command=container clusters create --quiet --enable-ip-alias --create-subnetwork name=ip-alias-subnet-regional --cluster-ipv4-cidr=/12 --services-ipv4-cidr=/19 --gke-environment=staging --gke-node-locations=us-east1-b --gke-shape={\"default\":{\"Nodes\":1999,\"MachineType\":\"n1-standard-1\"},\"heapster-pool\":{\"Nodes\":1,\"MachineType\":\"n1-standard-8\"}} --provider=gke --test=false --test-cmd=$GOPATH/src/k8s.io/perf-tests/run-e2e.sh --test-cmd-args=cluster-loader2 --test-cmd-args=--nodes=2000 --test-cmd-args=--provider=gke --test-cmd-args=--report-dir=/workspace/_artifacts --test-cmd-args=--testconfig=testing/density/config.yaml --test-cmd-args=--testconfig=testing/load/config.yaml 
--test-cmd-args=--testoverrides=./testing/density/2000_nodes/override.yaml --test-cmd-name=ClusterLoaderV2 --timeout=570m --use-logexporter] [] [] [{GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {E2E_GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {USER prow nil} {JENKINS_GCE_SSH_PRIVATE_KEY_FILE /etc/ssh-key-secret/ssh-private nil} {JENKINS_GCE_SSH_PUBLIC_KEY_FILE /etc/ssh-key-secret/ssh-public nil}] {map[] map[cpu:{{6 0} {} 6 DecimalSI} memory:{{17179869184 0} {} BinarySI}]} [{service true /etc/service-account } {ssh true /etc/ssh-key-secret }] [] nil nil nil nil false false false}],RestartPolicy:,TerminationGracePeriodSeconds:nil,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[],Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[],AutomountServiceAccountToken:nil,Tolerations:[],HostAliases:[],PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[],RuntimeClassName:nil,} } {2019-03-03 08:01:39 +0000 UTC pending Job triggered. 
https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gke-large-performance-regional/161/ 931c4290-3d8a-11e9-9c9a-0a580a6c0e78 161 map[]}})", "client": "kube", "level": "debug" }, "resource": { "type": "container", "labels": { "namespace_id": "default", "instance_id": "7000980459144515921", "pod_id": "plank-9f6cb7fbb-4jdf2", "zone": "us-central1-f", "project_id": "k8s-prow", "cluster_name": "prow", "container_name": "plank" } }, "timestamp": "2019-03-03T08:01:55Z", "severity": "ERROR", "labels": { "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g", "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-03T08:02:01.492391989Z" }, { "insertId": "1vpritkfhzollo", "jsonPayload": { "msg": "Transitioning states.", "job": "ci-kubernetes-e2e-gke-large-performance-regional", "to": "pending", "level": "info", "name": "931c4290-3d8a-11e9-9c9a-0a580a6c0e78", "component": "plank", "type": "periodic", "from": "triggered" }, "resource": { "type": "container", "labels": { "pod_id": "plank-9f6cb7fbb-4jdf2", "zone": "us-central1-f", "project_id": "k8s-prow", "cluster_name": "prow", "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921" } }, "timestamp": "2019-03-03T08:01:55Z", "severity": "ERROR", "labels": { "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default", "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-03T08:02:01.492391989Z" }, { "insertId": "1vpritkfhzolln", "jsonPayload": { "component": "plank", "msg": "CreatePod({{ } {931c4290-3d8a-11e9-9c9a-0a580a6c0e78 0 0001-01-01 00:00:00 +0000 UTC 
map[prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional preset-k8s-ssh:true preset-service-account:true prow.k8s.io/id:931c4290-3d8a-11e9-9c9a-0a580a6c0e78 created-by-prow:true prow.k8s.io/type:periodic] map[prow.k8s.io/job:ci-kubernetes-e2e-gke-large-performance-regional] [] nil [] } {[{service {nil nil nil nil nil &SecretVolumeSource{SecretName:service-account,Items:[],DefaultMode:nil,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}} {ssh {nil nil nil nil nil &SecretVolumeSource{SecretName:ssh-key-secret,Items:[],DefaultMode:*256,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}}] [] [{test gcr.io/k8s-testimages/kubekins-e2e:v20190301-76bc03340-master [] [--timeout=600 --repo=k8s.io/kubernetes=master --repo=k8s.io/perf-tests=master --root=/go/src --scenario=kubernetes_e2e -- --cluster=gke-regional-cluster --deployment=gke --extract=ci/latest-1.13 --gcp-cloud-sdk=gs://cloud-sdk-testing/ci/staging --gcp-node-image=gci --gcp-project=kubernetes-scale --gcp-region=us-east1 --gke-create-command=container clusters create --quiet --enable-ip-alias --create-subnetwork name=ip-alias-subnet-regional --cluster-ipv4-cidr=/12 --services-ipv4-cidr=/19 --gke-environment=staging --gke-node-locations=us-east1-b --gke-shape={\"default\":{\"Nodes\":1999,\"MachineType\":\"n1-standard-1\"},\"heapster-pool\":{\"Nodes\":1,\"MachineType\":\"n1-standard-8\"}} --provider=gke --test=false --test-cmd=$GOPATH/src/k8s.io/perf-tests/run-e2e.sh --test-cmd-args=cluster-loader2 --test-cmd-args=--nodes=2000 --test-cmd-args=--provider=gke --test-cmd-args=--report-dir=/workspace/_artifacts --test-cmd-args=--testconfig=testing/density/config.yaml --test-cmd-args=--testconfig=testing/load/config.yaml --test-cmd-args=--testoverrides=./testing/density/2000_nodes/override.yaml --test-cmd-name=ClusterLoaderV2 --timeout=570m --use-logexporter] [] [] [{GOOGLE_APPLICATION_CREDENTIALS 
/etc/service-account/service-account.json nil} {E2E_GOOGLE_APPLICATION_CREDENTIALS /etc/service-account/service-account.json nil} {USER prow nil} {JENKINS_GCE_SSH_PRIVATE_KEY_FILE /etc/ssh-key-secret/ssh-private nil} {JENKINS_GCE_SSH_PUBLIC_KEY_FILE /etc/ssh-key-secret/ssh-public nil} {BUILD_ID 161 nil} {BUILD_NUMBER 161 nil} {JOB_NAME ci-kubernetes-e2e-gke-large-performance-regional nil} {JOB_SPEC {\"type\":\"periodic\",\"job\":\"ci-kubernetes-e2e-gke-large-performance-regional\",\"buildid\":\"161\",\"prowjobid\":\"931c4290-3d8a-11e9-9c9a-0a580a6c0e78\"} nil} {JOB_TYPE periodic nil} {PROW_JOB_ID 931c4290-3d8a-11e9-9c9a-0a580a6c0e78 nil}] {map[] map[cpu:{{6 0} {} 6 DecimalSI} memory:{{17179869184 0} {} BinarySI}]} [{service true /etc/service-account } {ssh true /etc/ssh-key-secret }] [] nil nil nil nil false false false}] Never map[] 0xc006506a04 false false false nil [] nil [] [] nil [] } { [] [] [] }})", "client": "kube", "level": "debug" }, "resource": { "type": "container", "labels": { "zone": "us-central1-f", "pod_id": "plank-9f6cb7fbb-4jdf2", "project_id": "k8s-prow", "cluster_name": "prow", "container_name": "plank", "namespace_id": "default", "instance_id": "7000980459144515921" } }, "timestamp": "2019-03-03T08:01:55Z", "severity": "ERROR", "labels": { "compute.googleapis.com/resource_name": "fluentd-gcp-v3.2.0-56c8g", "container.googleapis.com/pod_name": "plank-9f6cb7fbb-4jdf2", "container.googleapis.com/stream": "stderr", "container.googleapis.com/namespace_name": "default" }, "logName": "projects/k8s-prow/logs/plank", "receiveTimestamp": "2019-03-03T08:02:01.492391989Z" } ] ```
stevekuznetsov commented 5 years ago

Wait, so were there actually multiple versions of the test running at once?

mm4tt commented 5 years ago

@mm4tt FYI the best place to escalate something like this is in #testing-ops on Slack, pinging @test-infra-oncall. I've shot a message over here

Thanks, @stevekuznetsov. Will keep that in mind for the future.

There are other jobs behaving similarly to this one, i.e. they are scheduled and run when they shouldn't be run.

could you list them?

@BenTheElder, other examples

name: ci-kubernetes-e2e-gce-scale-performance config rbkkcuqztpa

The job should run once a day, Mon-Fri, but recently there have been days when it ran two or three times: d0dofwbipr5

name: ci-kubernetes-e2e-gke-large-performance config weyhn28fg2c

The job is supposed to run once every Sunday, but yesterday it was launched twice: k4njvwksfjk

There are probably a few more.

Were you able to figure out what is going on?

krzyzacy commented 5 years ago

/cc

krzyzacy commented 5 years ago
  component:  "plank"   
  job:  "ci-kubernetes-e2e-gce-scale-performance"   
  level:  "info"   
  msg:  "Pod is missing, starting a new pod"   

We probably hit the OOMKilled again?

krzyzacy commented 5 years ago
E  I0304 18:02:18.055] Call:  gsutil -q -h Content-Type:application/json -h x-goog-if-generation-match:1551706405925783 cp /tmp/gsutil_h1mRnW gs://kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-scale-performance/jobResultsCache.json 
E  I0304 18:02:19.881] process 693067 exited with code 0 after 0.0m 
E  I0304 18:02:19.884] Call:  gsutil -q -h Content-Type:application/json cp /tmp/gsutil_7WyV5x gs://kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-scale-performance/324/finished.json 
E  I0304 18:02:21.640] process 693245 exited with code 0 after 0.0m 
E  I0304 18:02:21.641] Call:  gsutil -q -h Content-Type:text/plain -h 'Cache-Control:private, max-age=0, no-transform' cp /tmp/gsutil_tpumdl gs://kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-scale-performance/latest-build.txt 
E  I0304 18:02:23.301] process 693423 exited with code 0 after 0.0m 
E  I0304 18:02:23.302] Call:  gsutil -q cp -Z /workspace/build-log.txt gs://kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-scale-performance/324/build-log.txt 
E  I0304 18:02:25.317] process 693601 exited with code 0 after 0.0m 
E  + EXIT_VALUE=1 
E  + set +o xtrace 
E  Cloning into 'test-infra'... 
E  Activated service account credentials for: [pr-kubekins@kubernetes-jenkins-pull.iam.gserviceaccount.com] 
E  fatal: Not a git repository (or any of the parent directories): .git 

@stevekuznetsov seems the pod finished and exited properly? Seems a bug in plank?

krzyzacy commented 5 years ago

Horologium triggered the job properly afaik

also - no associated logs in sinker (so - who deleted the pod?)

stevekuznetsov commented 5 years ago

@krzyzacy what was the behavior? Plank will create a Pod if one does not exist and the ProwJob is not marked in some completed state, can you try to determine via logs how the pod exited and what the state of the prowjob was at the time?
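
The rule described above can be sketched as a small model. This is a simplified illustration, not plank's actual code: the `ProwJob` class, the `sync_pending_job` function, and the set of completed states are assumptions made for the example. It shows why a pod that disappears before its result is recorded leads to a second run of the same job.

```python
from dataclasses import dataclass

# Hypothetical, simplified model of the behavior described in the thread.
# The state names and function are assumptions, not plank's real API.
COMPLETE_STATES = {"success", "failure", "aborted", "error"}


@dataclass
class ProwJob:
    name: str
    state: str  # e.g. "triggered", "pending", "success", "failure"


def sync_pending_job(job, pods):
    """Reconcile one job against the pods that currently exist.

    `pods` maps pod name -> phase ("Pending", "Running", "Succeeded", "Failed").
    The pod is assumed to share its name with the job, as in the logs above.
    """
    if job.state in COMPLETE_STATES:
        return "nothing to do"
    phase = pods.get(job.name)
    if phase is None:
        # The pod is gone but the job was never marked complete,
        # so the controller starts it again.
        pods[job.name] = "Pending"
        return "Pod is missing, starting a new pod"
    if phase == "Succeeded":
        job.state = "success"
        return "marked success"
    if phase == "Failed":
        job.state = "failure"
        return "marked failure"
    return "still running"


# A pod that failed and is still visible gets its result recorded once.
seen = ProwJob("job-a", "pending")
seen_result = sync_pending_job(seen, {"job-a": "Failed"})

# A pod deleted by something else before the sync is recreated instead.
lost = ProwJob("job-b", "pending")
lost_pods = {}
lost_result = sync_pending_job(lost, lost_pods)
```

In this model the second case is the one that matches the "Pod is missing, starting a new pod" log lines: the job stays in `pending` and a fresh pod is created.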

krzyzacy commented 5 years ago

The pod exited with 1 (with E + EXIT_VALUE=1) I believe...

I think the problem started after https://github.com/kubernetes/test-infra/pull/11477? (Feb. 26 according to @mm4tt's screenshot)

And I think the prowjob was still in pending state as I don't see any other state transition log:

2019-03-04 00:01:53.000 PST
{"msg":"CreatePod({{ } {bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78 0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[prow.k8s.io/id:bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78 created-by-prow:true prow.k8s.io/type:periodic prow.k8s.io/job:ci-kubernetes-e2e-gce-scale-performance preset-e2e-scalability-common:t…
2019-03-04 00:01:53.000 PST
{"component":"plank","type":"periodic","from":"triggered","msg":"Transitioning states.","to":"pending","job":"ci-kubernetes-e2e-gce-scale-performance","level":"info","name":"bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78"}
2019-03-04 00:01:53.000 PST
{"msg":"ReplaceProwJob(bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78, {{ProwJob prow.k8s.io/v1} {bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78 default /apis/prow.k8s.io/v1/namespaces/default/prowjobs/bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78 bc8b31ae-3e53-11e9-898b-42010a80003a 189526855 1 2019-03-04 08:01:37 +0000 UTC <…
2019-03-04 00:01:54.000 PST
{"component":"plank","msg":"GetProwJob(bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78)","client":"kube","level":"debug"}
2019-03-04 00:01:54.000 PST
{"component":"plank","msg":"ReplaceProwJob(bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78, {{ProwJob prow.k8s.io/v1} {bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78 default /apis/prow.k8s.io/v1/namespaces/default/prowjobs/bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78 bc8b31ae-3e53-11e9-898b-42010a80003a 189526926 1 2019-03-04 …
2019-03-04 05:33:53.000 PST
{"msg":"CreatePod({{ } {bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78 0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[prow.k8s.io/type:periodic prow.k8s.io/job:ci-kubernetes-e2e-gce-scale-performance preset-e2e-scalability-common:true preset-k8s-ssh:true preset-service-account:true prow.k8s.io/id:bc8b0b…
2019-03-04 05:33:53.000 PST
{"msg":"Pod is missing, starting a new pod","job":"ci-kubernetes-e2e-gce-scale-performance","level":"info","name":"bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78","type":"periodic","component":"plank"}
2019-03-04 05:33:53.000 PST
{"component":"plank","msg":"ReplaceProwJob(bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78, {{ProwJob prow.k8s.io/v1} {bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78 default /apis/prow.k8s.io/v1/namespaces/default/prowjobs/bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78 bc8b31ae-3e53-11e9-898b-42010a80003a 189526934 1 2019-03-04 …
2019-03-04 05:33:53.000 PST
{"component":"plank","msg":"GetProwJob(bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78)","client":"kube","level":"debug"}

2019-03-04 05:33:53.000 PST
{"component":"plank","msg":"ReplaceProwJob(bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78, {{ProwJob prow.k8s.io/v1} {bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78 default /apis/prow.k8s.io/v1/namespaces/default/prowjobs/bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78 bc8b31ae-3e53-11e9-898b-42010a80003a 189621384 1 2019-03-04 …
2019-03-04 10:02:54.000 PST
{"client":"kube","level":"debug","component":"plank","msg":"CreatePod({{ } {bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78 0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[prow.k8s.io/id:bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78 created-by-prow:true prow.k8s.io/type:periodic prow.k8s.io/job:ci-kubernetes-e2e-g…
2019-03-04 10:02:54.000 PST
{"msg":"Pod is missing, starting a new pod","job":"ci-kubernetes-e2e-gce-scale-performance","level":"info","name":"bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78","component":"plank","type":"periodic"}
2019-03-04 10:02:54.000 PST
{"component":"plank","msg":"ReplaceProwJob(bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78, {{ProwJob prow.k8s.io/v1} {bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78 default /apis/prow.k8s.io/v1/namespaces/default/prowjobs/bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78 bc8b31ae-3e53-11e9-898b-42010a80003a 189621384 1 2019-03-04 …
2019-03-04 10:02:55.000 PST
{"msg":"GetProwJob(bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78)","client":"kube","level":"debug","component":"plank"}
2019-03-04 10:02:55.000 PST
{"component":"plank","msg":"ReplaceProwJob(bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78, {{ProwJob prow.k8s.io/v1} {bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78 default /apis/prow.k8s.io/v1/namespaces/default/prowjobs/bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78 bc8b31ae-3e53-11e9-898b-42010a80003a 189698498 1 2019-03-04 …
krzyzacy commented 5 years ago

Edit: Just pasting the full node log here:

Mar 04 18:02:25 gke-prow-containerd-pool-99179761-9sg5 containerd[1141]: time="2019-03-04T18:02:25Z" level=info msg="Finish piping stderr of container "edeb1523a687b7e5a80ca831b9760a1a6328be767b35a3197d6919752681fc2b""
Mar 04 18:02:25 gke-prow-containerd-pool-99179761-9sg5 containerd[1141]: time="2019-03-04T18:02:25Z" level=info msg="Finish piping stdout of container "edeb1523a687b7e5a80ca831b9760a1a6328be767b35a3197d6919752681fc2b""
Mar 04 18:02:25 gke-prow-containerd-pool-99179761-9sg5 containerd[1141]: time="2019-03-04T18:02:25Z" level=error msg="collecting metrics for edeb1523a687b7e5a80ca831b9760a1a6328be767b35a3197d6919752681fc2b" error="cgroups: cgroup deleted"
Mar 04 18:02:25 gke-prow-containerd-pool-99179761-9sg5 containerd[1141]: time="2019-03-04T18:02:25Z" level=info msg="shim reaped" id=edeb1523a687b7e5a80ca831b9760a1a6328be767b35a3197d6919752681fc2b
Mar 04 18:02:26 gke-prow-containerd-pool-99179761-9sg5 kubelet[1260]: I0304 18:02:26.074868    1260 kubelet.go:1883] SyncLoop (PLEG): "bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78_test-pods(2763ab6d-3e82-11e9-989e-42010a800133)", event: &pleg.PodLifecycleEvent{ID:"2763ab6d-3e82-11e9-989e-42010a800133", Type:"ContainerDied", Data:"edeb1523a687b7e5a80ca831b9760a1a6328be767b35a3197d6919752681fc2b"}
Mar 04 18:02:26 gke-prow-containerd-pool-99179761-9sg5 containerd[1141]: time="2019-03-04T18:02:26Z" level=info msg="StopPodSandbox for "b1fee6999e358a3e062e04f40ff5be40ac9d89b96146ece5e0ff8c541b485a4e""
Mar 04 18:02:26 gke-prow-containerd-pool-99179761-9sg5 containerd[1141]: time="2019-03-04T18:02:26Z" level=info msg="Container to stop "edeb1523a687b7e5a80ca831b9760a1a6328be767b35a3197d6919752681fc2b" is not running, current state "CONTAINER_EXITED""
Mar 04 18:02:26 gke-prow-containerd-pool-99179761-9sg5 systemd-networkd[350]: veth57837a20: Lost carrier
Mar 04 18:02:26 gke-prow-containerd-pool-99179761-9sg5 systemd-timesyncd[314]: Network configuration changed, trying to establish connection.
Mar 04 18:02:26 gke-prow-containerd-pool-99179761-9sg5 systemd-timesyncd[314]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 04 18:02:26 gke-prow-containerd-pool-99179761-9sg5 systemd-timesyncd[314]: Network configuration changed, trying to establish connection.
Mar 04 18:02:26 gke-prow-containerd-pool-99179761-9sg5 systemd-timesyncd[314]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 04 18:02:26 gke-prow-containerd-pool-99179761-9sg5 containerd[1141]: time="2019-03-04T18:02:26Z" level=info msg="TearDown network for sandbox "b1fee6999e358a3e062e04f40ff5be40ac9d89b96146ece5e0ff8c541b485a4e" successfully"
Mar 04 18:02:26 gke-prow-containerd-pool-99179761-9sg5 kubelet[1260]: I0304 18:02:26.228781    1260 reconciler.go:181] operationExecutor.UnmountVolume started for volume "ssh" (UniqueName: "kubernetes.io/secret/2763ab6d-3e82-11e9-989e-42010a800133-ssh") pod "2763ab6d-3e82-11e9-989e-42010a800133" (UID: "2763ab6d-3e82-11e9-989e-42010a800133")
Mar 04 18:02:26 gke-prow-containerd-pool-99179761-9sg5 kubelet[1260]: I0304 18:02:26.228850    1260 reconciler.go:181] operationExecutor.UnmountVolume started for volume "service" (UniqueName: "kubernetes.io/secret/2763ab6d-3e82-11e9-989e-42010a800133-service") pod "2763ab6d-3e82-11e9-989e-42010a800133" (UID: "2763ab6d-3e82-11e9-989e-42010a800133")
Mar 04 18:02:26 gke-prow-containerd-pool-99179761-9sg5 kubelet[1260]: I0304 18:02:26.246068    1260 operation_generator.go:688] UnmountVolume.TearDown succeeded for volume "kubernetes.io/secret/2763ab6d-3e82-11e9-989e-42010a800133-ssh" (OuterVolumeSpecName: "ssh") pod "2763ab6d-3e82-11e9-989e-42010a800133" (UID: "2763ab6d-3e82-11e9-989e-42010a800133"). InnerVolumeSpecName "ssh". PluginName "kubernetes.io/secret", VolumeGidValue ""
Mar 04 18:02:26 gke-prow-containerd-pool-99179761-9sg5 kubelet[1260]: I0304 18:02:26.247230    1260 operation_generator.go:688] UnmountVolume.TearDown succeeded for volume "kubernetes.io/secret/2763ab6d-3e82-11e9-989e-42010a800133-service" (OuterVolumeSpecName: "service") pod "2763ab6d-3e82-11e9-989e-42010a800133" (UID: "2763ab6d-3e82-11e9-989e-42010a800133"). InnerVolumeSpecName "service". PluginName "kubernetes.io/secret", VolumeGidValue ""
Mar 04 18:02:26 gke-prow-containerd-pool-99179761-9sg5 kubelet[1260]: I0304 18:02:26.329170    1260 reconciler.go:301] Volume detached for volume "service" (UniqueName: "kubernetes.io/secret/2763ab6d-3e82-11e9-989e-42010a800133-service") on node "gke-prow-containerd-pool-99179761-9sg5" DevicePath ""
Mar 04 18:02:26 gke-prow-containerd-pool-99179761-9sg5 kubelet[1260]: I0304 18:02:26.329219    1260 reconciler.go:301] Volume detached for volume "ssh" (UniqueName: "kubernetes.io/secret/2763ab6d-3e82-11e9-989e-42010a800133-ssh") on node "gke-prow-containerd-pool-99179761-9sg5" DevicePath ""
Mar 04 18:02:26 gke-prow-containerd-pool-99179761-9sg5 containerd[1141]: time="2019-03-04T18:02:26Z" level=info msg="shim reaped" id=b1fee6999e358a3e062e04f40ff5be40ac9d89b96146ece5e0ff8c541b485a4e
Mar 04 18:02:26 gke-prow-containerd-pool-99179761-9sg5 containerd[1141]: time="2019-03-04T18:02:26Z" level=info msg="StopPodSandbox for "b1fee6999e358a3e062e04f40ff5be40ac9d89b96146ece5e0ff8c541b485a4e" returns successfully"
Mar 04 18:02:27 gke-prow-containerd-pool-99179761-9sg5 kubelet[1260]: I0304 18:02:27.075432    1260 kubelet.go:1883] SyncLoop (PLEG): "bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78_test-pods(2763ab6d-3e82-11e9-989e-42010a800133)", event: &pleg.PodLifecycleEvent{ID:"2763ab6d-3e82-11e9-989e-42010a800133", Type:"ContainerDied", Data:"b1fee6999e358a3e062e04f40ff5be40ac9d89b96146ece5e0ff8c541b485a4e"}
Mar 04 18:02:27 gke-prow-containerd-pool-99179761-9sg5 kubelet[1260]: W0304 18:02:27.075574    1260 pod_container_deletor.go:75] Container "b1fee6999e358a3e062e04f40ff5be40ac9d89b96146ece5e0ff8c541b485a4e" not found in pod's containers
Mar 04 18:02:33 gke-prow-containerd-pool-99179761-9sg5 kubelet[1260]: I0304 18:02:33.622347    1260 kubelet.go:1854] SyncLoop (DELETE, "api"): "bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78_test-pods(2763ab6d-3e82-11e9-989e-42010a800133)"
Mar 04 18:02:33 gke-prow-containerd-pool-99179761-9sg5 kubelet[1260]: I0304 18:02:33.623922    1260 kubelet.go:1848] SyncLoop (REMOVE, "api"): "bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78_test-pods(2763ab6d-3e82-11e9-989e-42010a800133)"
Mar 04 18:02:33 gke-prow-containerd-pool-99179761-9sg5 kubelet[1260]: I0304 18:02:33.624053    1260 kubelet.go:2042] Failed to delete pod "bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78_test-pods(2763ab6d-3e82-11e9-989e-42010a800133)", err: pod not found
stevekuznetsov commented 5 years ago

If the pod exited, are you seeing sinker clean it up? If not, do you have audit logging on? Do you know what deleted the Pod?

krzyzacy commented 5 years ago

I don't see the clean up from sinker. How do I check the audit log?

stevekuznetsov commented 5 years ago

Doc is here: https://kubernetes.io/docs/tasks/debug-application-cluster/audit/

krzyzacy commented 5 years ago

(we don't have access to the master apiserver though...)

stevekuznetsov commented 5 years ago

Then that's a no-go. Interesting that sinker did not delete the Pod

krzyzacy commented 5 years ago

The prow bump job (https://testgrid.k8s.io/sig-testing-prow#autobump-prow) also uses cron but didn't have this issue - so I'd narrow it down to these large scalability jobs... :thinking: really confused here...

krzyzacy commented 5 years ago

@stevekuznetsov

2019-03-04 10:02:33.623 PST
k8s.io
delete
test-pods:bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78
system:serviceaccount:kube-system:pod-garbage-collector
{"@type":"type.googleapis.com/google.cloud.audit.AuditLog","authenticationInfo":{"principalEmail":"system:serviceaccount:kube-system:pod-garbage-collector"},"authorizationInfo":[{"granted":true,"permission":"io.k8s.core.v1.pods.delete","resource":"core/v1/namespaces/test-pods/pods/bc8b0b06-3e53-11e9…
2019-03-04 10:02:54.937 PST
k8s.io
create
test-pods:bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78
client
{"@type":"type.googleapis.com/google.cloud.audit.AuditLog","authenticationInfo":{"principalEmail":"client"},"authorizationInfo":[{"granted":true,"permission":"io.k8s.core.v1.pods.create","resource":"core/v1/namespaces/test-pods/pods/bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78"}],"methodName":"io.k8s.core.v…
2019-03-04 10:02:54.942 PST
k8s.io
create
test-pods:bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78:bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78
system:kube-scheduler
{"@type":"type.googleapis.com/google.cloud.audit.AuditLog","authenticationInfo":{"principalEmail":"system:kube-scheduler"},"authorizationInfo":[{"granted":true,"permission":"io.k8s.core.v1.pods.binding.create","resource":"core/v1/namespaces/test-pods/pods/bc8b0b06-3e53-11e9-9c9a-0a580a6c0e78/binding…
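
For anyone repeating this search, a minimal sketch of filtering exported audit entries for granted pod deletions follows. It assumes the entries have already been parsed into dictionaries and uses only the fields visible in the excerpt above (`authenticationInfo.principalEmail`, `authorizationInfo[].granted`, `permission`, `resource`). The function name and the sample entries, including the pod name `example-pod`, are made up for illustration.

```python
DELETE_PERMISSION = "io.k8s.core.v1.pods.delete"


def pod_deleters(entries):
    """Return (principal, resource) for every granted pod deletion."""
    found = []
    for entry in entries:
        principal = entry.get("authenticationInfo", {}).get("principalEmail", "")
        for auth in entry.get("authorizationInfo", []):
            if auth.get("granted") and auth.get("permission") == DELETE_PERMISSION:
                found.append((principal, auth.get("resource", "")))
    return found


# Hypothetical entries shaped like the truncated records in the thread.
sample_entries = [
    {
        "authenticationInfo": {
            "principalEmail": "system:serviceaccount:kube-system:pod-garbage-collector"
        },
        "authorizationInfo": [
            {
                "granted": True,
                "permission": "io.k8s.core.v1.pods.delete",
                "resource": "core/v1/namespaces/test-pods/pods/example-pod",
            }
        ],
    },
    {
        "authenticationInfo": {"principalEmail": "client"},
        "authorizationInfo": [
            {
                "granted": True,
                "permission": "io.k8s.core.v1.pods.create",
                "resource": "core/v1/namespaces/test-pods/pods/example-pod",
            }
        ],
    },
]

deleters = pod_deleters(sample_entries)
```
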
krzyzacy commented 5 years ago

hummmmmm...

GKE has TERMINATED_POD_GC_THRESHOLD set to 1000 instead of the default 12500 (https://github.com/kubernetes/kubernetes/blob/1b28775db1290a772967d192a19a8ec447053cd5/pkg/controller/apis/config/v1alpha1/defaults.go#L215) (thanks to @Random-Liu, who helped locate the issue)
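
To make the effect of that threshold concrete, here is a simplified model of a terminated-pod garbage collector. It is a sketch under assumptions, not the controller's real code: the function name is invented, only `Succeeded` and `Failed` are treated as terminated, and the oldest pods are assumed to be removed first.

```python
def pods_to_gc(pods, threshold):
    """Pick terminated pods to delete once their count exceeds `threshold`.

    `pods` is a list of (name, phase, creation_time) tuples.
    Simplified model: oldest terminated pods are chosen first.
    """
    terminated = [p for p in pods if p[1] in ("Succeeded", "Failed")]
    excess = len(terminated) - threshold
    if excess <= 0:
        return []
    terminated.sort(key=lambda p: p[2])
    return [name for name, _phase, _created in terminated[:excess]]


# Hypothetical pods; creation_time is just an increasing integer here.
example_pods = [
    ("a", "Succeeded", 1),
    ("b", "Failed", 2),
    ("c", "Running", 0),
    ("d", "Succeeded", 3),
]
deleted_at_2 = pods_to_gc(example_pods, 2)
deleted_at_3 = pods_to_gc(example_pods, 3)
```

With a threshold of 1000 and well over a thousand pods in the namespace, finished pods can be removed within seconds of exiting, which fits the node log above where the delete arrives a few seconds after the container died.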

time to have multiple build clusters? :sob: :disappointed: :cry: :joy:

Also we have a ton of imagePullBackOff pods:

senlu@senlu:~/work/src/k8s.io/test-infra/prow$ kubectl get po -n=test-pods | wc -l
1487
senlu@senlu:~/work/src/k8s.io/test-infra/prow$ kubectl get po -n=test-pods | grep ImagePullBackOff | wc -l
232

even better we have things like:

bb6ea232-1adc-11e9-8f0d-0a580a6c02f3         1/1       Running                 0          45d
0272c8e4-3d0c-11e9-9c9a-0a580a6c0e78         1/2       Error                   0          2d
e861282b-3c65-11e9-9c9a-0a580a6c0e78         0/1       ImagePullBackOff        0          3d
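
A per-status tally is often more useful than separate `grep | wc -l` runs. The sketch below counts the STATUS column of `kubectl get po` text output. It assumes the default column order (NAME, READY, STATUS, RESTARTS, AGE); the function name is made up, and the sample rows are the three lines shown above.

```python
from collections import Counter


def count_by_status(kubectl_output):
    """Count pods per STATUS value in `kubectl get po` text output."""
    counts = Counter()
    for line in kubectl_output.splitlines():
        fields = line.split()
        # Skip blank or malformed lines and the header row.
        if len(fields) < 3 or fields[0] == "NAME":
            continue
        counts[fields[2]] += 1
    return counts


sample_output = """\
NAME                                         READY     STATUS                  RESTARTS   AGE
bb6ea232-1adc-11e9-8f0d-0a580a6c02f3         1/1       Running                 0          45d
0272c8e4-3d0c-11e9-9c9a-0a580a6c0e78         1/2       Error                   0          2d
e861282b-3c65-11e9-9c9a-0a580a6c0e78         0/1       ImagePullBackOff        0          3d
"""
status_counts = count_by_status(sample_output)
```
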
krzyzacy commented 5 years ago

(proposed a discussion topic in tomorrow's sig-testing for this)

mm4tt commented 5 years ago

/sig scalability

krzyzacy commented 5 years ago
senlu@senlu:~/work/src/k8s.io/test-infra/prow$ kubectl get po -n=test-pods | wc -l
764

With the last few fixes it should be fine for now. We still need to figure out how to bump or work around that limit, since we are (inevitably) going to have more and more jobs.

mm4tt commented 5 years ago

I think it's happening again qemxvmsjcm8

Last time it also started around Friday. Is it possible that we run more prow jobs over the weekend?

mm4tt commented 5 years ago

I'm almost 100% sure it's happening again mzcsqiyqykf

@krzyzacy, could you check?

/priority critical-urgent

krzyzacy commented 5 years ago

/shrug... plausibly code freeze is coming and we were having heavier testing loads yesterday... let me verify that's still the same issue

stevekuznetsov commented 5 years ago

@krzyzacy y'all might want a pager to go off when you get close ;)

krzyzacy commented 5 years ago

/assign

spiffxp commented 5 years ago

/milestone v1.15
Is this still a concern for us?

krzyzacy commented 5 years ago

The root issue is still there. I suspect we'll hit this again when testing volume increases.

stevekuznetsov commented 5 years ago

You can limit the maximum concurrency through plank to 1000, right?

spiffxp commented 5 years ago

/remove-priority critical-urgent
This isn't at drop-everything priority, but we may hit this again this quarter

stevekuznetsov commented 5 years ago

@cjwagner can we set the plank concurrency and make sure you don't hit this again?

stevekuznetsov commented 5 years ago

That's global and not per cluster so it's a bit overkill but would at least stop you from getting evicted

krzyzacy commented 5 years ago

@stevekuznetsov what happens when plank hits the pod limit? Stop creating new pods?

stevekuznetsov commented 5 years ago

Yep, the controller will just not trigger new jobs and they will stay in Pending until there is room.

krzyzacy commented 5 years ago

I can smell snowballing :-p

cjwagner commented 5 years ago

I'm fine with enabling it if others are. This would prevent us from hitting the concurrency limit, but it's not the perfect tool for the job. It could cause snowballing if our actual concurrency level is significantly higher due to executing on multiple build clusters, and we'd need to set it to 1000 (not something higher) if we want to guarantee that no individual build cluster exceeds 1000 pods.

stevekuznetsov commented 5 years ago

We should just implement a per-cluster throttle then instead of a global one. Of course you'd get snowballing but you can't really help that. At least the snowball failure mode is very soft and you'd just run the jobs later
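
A minimal sketch of the per-cluster throttle idea, assuming each pending job names its target build cluster and that jobs over the limit simply wait for a later pass. `schedule_pass` and the `(job, cluster)` pairs are hypothetical and not plank's actual API.

```python
from collections import Counter


def schedule_pass(pending, running_per_cluster, limit=1000):
    """Split pending (job, cluster) pairs into those to start now and those left waiting.

    A job starts only if its own cluster is below `limit`; other clusters are unaffected.
    """
    running = Counter(running_per_cluster)
    started, waiting = [], []
    for job, cluster in pending:
        if running[cluster] < limit:
            running[cluster] += 1
            started.append((job, cluster))
        else:
            # Soft failure mode: the job stays pending and runs later.
            waiting.append((job, cluster))
    return started, waiting
```

A global throttle corresponds to treating every job as if it ran on a single cluster, which is why it would have to be set to 1000 to protect each individual cluster.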

mm4tt commented 5 years ago

Folks, looks like this is still happening and heavily affecting our scalability CI tests

Currently we're facing a few major regressions (e.g. https://github.com/kubernetes/kubernetes/issues/75833, https://github.com/kubernetes/kubernetes/issues/76579), and not having working CI tests is really slowing us down. There is a big risk that if we don't debug the regressions, they will block the Kubernetes 1.15 release.

I understand that the issue may be hard to fix properly, but we'd really appreciate it if you could come up with some kind of temporary workaround to unblock us. Is there anything you could do?

krzyzacy commented 5 years ago

We can run all scalability jobs on a separate build cluster as a bandaid for now, WDYT @cjwagner ?

krzyzacy commented 5 years ago

@mm4tt @wojtek-t thoughts?

cjwagner commented 5 years ago

> We can run all scalability jobs on a separate build cluster as a bandaid for now, WDYT @cjwagner ?

I think that's all we can do for now. Switching plank to use the informer framework might let it win the race with the pod GC sometimes, but it may not help at all and even if it did there would still be a race.

Adding another build cluster and migrating the necessary secrets is probably the fastest workaround.
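
As a rough illustration of the routing step only (not the secret migration), the sketch below sets the `cluster` field on matching job specs, mirroring the `cluster: default` field in the ProwJob spec quoted earlier. The cluster alias `scalability-build`, the name-based predicate, and `route_jobs` itself are assumptions made for this example.

```python
def route_jobs(jobs, matches, target_cluster):
    """Return copies of job specs with `cluster` set to target_cluster where matches(job) is true."""
    routed = []
    for job in jobs:
        job = dict(job)  # copy so the input is not mutated
        if matches(job):
            job["cluster"] = target_cluster
        routed.append(job)
    return routed


jobs = [
    {"name": "ci-kubernetes-e2e-gke-large-performance-regional", "cluster": "default"},
    {"name": "some-other-job", "cluster": "default"},  # hypothetical job name
]
# Hypothetical selection rule: treat jobs with "performance" in the name as scalability jobs.
routed = route_jobs(jobs, lambda j: "performance" in j["name"], "scalability-build")
print([(j["name"], j["cluster"]) for j in routed])
```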