knative / build

A Kubernetes-native Build resource.
Apache License 2.0

very slow concurrent builds #613

Open sebgoa opened 5 years ago

sebgoa commented 5 years ago

/kind question

We are trying to run a lot of concurrent builds. What we are seeing is that pods remain pending for several minutes, then the builds start to complete very slowly, despite the cluster having enough resources.

No errors appear anywhere, neither in the controller nor in the pods. So everything seems fine.

But we would expect the builds to start right away and complete quickly.

A basic test with the following build.yaml:

apiVersion: build.knative.dev/v1alpha1
kind: Build
metadata:
  name: REPLACE
spec:
  serviceAccountName: default
  source:
    git:
      revision: master
      url: https://github.com/mchmarny/simple-app
  template:
    arguments:
    - name: IMAGE
      value: knative.registry.svc.cluster.local/tzununbekov/foo-0-bar
    kind: BuildTemplate
    name: kaniko

Using the kaniko template:

kubectl apply -f https://raw.githubusercontent.com/triggermesh/build-templates/master/kaniko/kaniko.yaml

And running ~50 concurrent builds (the loop creates 51, numbered 0 through 50) with:

for i in {0..50}; do sed -e "s/REPLACE/build-${i}-foo/g" /tmp/build.yaml | kubectl apply -f - & done

A few builds (~10) complete, but ultimately ~40 of them time out after 10 minutes.

$ kubectl -n sebgoa get pods
NAME                            READY   STATUS        RESTARTS   AGE
build-0-foo-pod-1f289a          0/1     Completed     0          11m
build-1-foo-pod-1eca70          0/1     Completed     0          10m
build-10-foo-pod-7cf35d         0/1     Terminating   0          10m
build-11-foo-pod-2c53de         0/1     Completed     0          10m
build-12-foo-pod-21e92a         0/1     Completed     0          10m
build-14-foo-pod-dc76e7         0/1     Terminating   0          10m
build-15-foo-pod-23ffbd         0/1     Terminating   0          10m
build-16-foo-pod-b36814         0/1     Completed     0          10m
build-18-foo-pod-75dd8b         0/1     Completed     0          10m
build-19-foo-pod-7a14f8         0/1     Terminating   0          10m
build-2-foo-pod-c72e9a          0/1     Completed     0          10m
build-20-foo-pod-e2d9da         0/1     Completed     0          10m
build-21-foo-pod-afc242         0/1     Completed     0          10m
build-22-foo-pod-19601e         0/1     Completed     0          10m
build-23-foo-pod-8b614e         0/1     Completed     0          10m
build-24-foo-pod-7a2d4e         0/1     Terminating   0          10m
build-25-foo-pod-dccd5b         0/1     Terminating   0          10m
build-26-foo-pod-8e2d91         0/1     Terminating   0          10m
build-28-foo-pod-7fa41a         0/1     Terminating   0          10m
build-29-foo-pod-8e796e         0/1     Init:2/3      0          9m56s
build-3-foo-pod-6c5fe4          0/1     Terminating   0          10m
build-30-foo-pod-3e4a0b         0/1     Terminating   0          10m
build-32-foo-pod-b1fe60         0/1     Completed     0          10m
build-33-foo-pod-477816         0/1     Terminating   0          10m
build-34-foo-pod-caa536         0/1     Terminating   0          10m
build-35-foo-pod-a7209e         0/1     Terminating   0          10m
build-36-foo-pod-cf09c4         0/1     Terminating   0          10m
build-37-foo-pod-d23ead         0/1     Completed     0          10m
build-38-foo-pod-970900         0/1     Init:2/3      0          9m59s
build-39-foo-pod-c45c77         0/1     Init:2/3      0          9m51s
build-4-foo-pod-e76e78          0/1     Completed     0          10m
build-40-foo-pod-dc112a         0/1     Terminating   0          10m
build-41-foo-pod-bddec0         0/1     Terminating   0          10m
build-42-foo-pod-335cc4         0/1     Terminating   0          10m
build-43-foo-pod-5c6cf9         0/1     Completed     0          10m
build-44-foo-pod-e31788         0/1     Terminating   0          10m
build-45-foo-pod-5b5d67         0/1     Init:2/3      0          9m55s
build-46-foo-pod-7a3e4b         0/1     Terminating   0          10m
build-47-foo-pod-043e93         0/1     Init:2/3      0          9m58s
build-48-foo-pod-ff557a         0/1     Completed     0          9m52s
build-49-foo-pod-1a436b         0/1     Terminating   0          10m
build-5-foo-pod-4ba973          0/1     Completed     0          10m
build-50-foo-pod-683a52         0/1     Init:2/3      0          9m53s
build-6-foo-pod-da90e6          0/1     Completed     0          10m
build-7-foo-pod-fdb93b          0/1     Completed     0          10m
build-8-foo-pod-0b917b          0/1     Completed     0          10m
build-9-foo-pod-ff9a07          0/1     Terminating   0          10m
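
A listing like the one above is easier to compare across runs when tallied by status. This is a minimal sketch, assuming the default `kubectl get pods` column layout (NAME, READY, STATUS, RESTARTS, AGE); the function name `tally_status` is hypothetical and not part of kubectl or knative/build.

```shell
# Tally pods by STATUS from `kubectl get pods` output read on stdin.
# Assumes STATUS is the third whitespace-separated column and that
# the first line is the header row.
tally_status() {
  awk 'NR > 1 && NF >= 3 { count[$3]++ }
       END { for (s in count) print count[s], s }' | sort -rn
}

# Usage (namespace taken from the listing above):
#   kubectl -n sebgoa get pods | tally_status
```

Each output line is a count followed by a status, with the most frequent status first.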

Is the GitHub cloning possibly being throttled, which would slow down the Build steps?

sebgoa commented 5 years ago

cc/ @tzununbekov

chenleji commented 5 years ago

Did you look at the logs of the build pods?

Based on the info you provided, the second init container is the one being throttled.

build-45-foo-pod-5b5d67 0/1 Init:2/3 0 9m55s

sebgoa commented 5 years ago

Not quite. The second init container does finish quickly, but the third (kaniko) takes a long time:

$ kubectl -n sebgoa logs build-7-foo-pod-a16a31 -c build-step-build-and-push
INFO[0000] Downloading base image golang:latest         
ERROR: logging before flag.Parse: E0601 14:11:51.675380       1 metadata.go:142] while reading 'google-dockercfg' metadata: http status code: 404 while fetching url http://metadata.google.internal./computeMetadata/v1/instance/attributes/google-dockercfg
ERROR: logging before flag.Parse: E0601 14:11:51.717475       1 metadata.go:159] while reading 'google-dockercfg-url' metadata: http status code: 404 while fetching url http://metadata.google.internal./computeMetadata/v1/instance/attributes/google-dockercfg-url
2019/06/01 14:11:51 No matching credentials were found, falling back on anonymous
INFO[0223] Taking snapshot of full filesystem...

sebgoa commented 5 years ago

nodes are not maxed out:

$ kubectl -n sebgoa get pods -o wide
NAME                            READY   STATUS      RESTARTS   AGE     IP             NODE                                NOMINATED NODE
build-0-foo-pod-4487c7          0/1     Init:2/3    0          5m58s   10.24.30.44    gke-tm-workshop-7663aa83-5ql6       <none>
build-1-foo-pod-f1aa03          0/1     Completed   0          4m58s   10.24.34.200   gke-tm-default-pool-8ad990f1-0vqd   <none>
build-10-foo-pod-555a23         0/1     Completed   0          5m46s   10.24.35.14    gke-tm-workshop-7663aa83-1dgg       <none>
build-11-foo-pod-402f86         0/1     Init:2/3    0          6m1s    10.24.30.42    gke-tm-workshop-7663aa83-5ql6       <none>
build-12-foo-pod-1d2e16         0/1     Init:2/3    0          5m44s   10.24.30.50    gke-tm-workshop-7663aa83-5ql6       <none>
build-13-foo-pod-0b5869         0/1     Init:2/3    0          5m49s   10.24.30.49    gke-tm-workshop-7663aa83-5ql6       <none>
build-14-foo-pod-575cad         0/1     Init:2/3    0          5m52s   10.24.30.48    gke-tm-workshop-7663aa83-5ql6       <none>
build-15-foo-pod-966c19         0/1     Completed   0          5m48s   10.24.32.45    gke-tm-default-pool-8ad990f1-vg8k   <none>
build-16-foo-pod-0cd7a0         0/1     Init:2/3    0          6m4s    10.24.30.40    gke-tm-workshop-7663aa83-5ql6       <none>
build-17-foo-pod-676bb8         0/1     Init:2/3    0          5m23s   10.24.30.53    gke-tm-workshop-7663aa83-5ql6       <none>
build-18-foo-pod-70b156         0/1     Init:2/3    0          5m37s   10.24.35.16    gke-tm-workshop-7663aa83-1dgg       <none>
build-19-foo-pod-570467         0/1     Init:2/3    0          5m41s   10.24.30.51    gke-tm-workshop-7663aa83-5ql6       <none>
build-2-foo-pod-4c1b6b          0/1     Completed   0          5m47s   10.24.34.194   gke-tm-default-pool-8ad990f1-0vqd   <none>
build-20-foo-pod-cc851c         0/1     Completed   0          5m26s   10.24.34.196   gke-tm-default-pool-8ad990f1-0vqd   <none>
build-21-foo-pod-1949bc         0/1     Init:2/3    0          5m18s   10.24.30.55    gke-tm-workshop-7663aa83-5ql6       <none>
build-22-foo-pod-3eaf98         0/1     Init:2/3    0          6m      10.24.30.43    gke-tm-workshop-7663aa83-5ql6       <none>
build-23-foo-pod-6eac3b         0/1     Init:2/3    0          6m3s    10.24.30.41    gke-tm-workshop-7663aa83-5ql6       <none>
build-24-foo-pod-742b98         0/1     Init:2/3    0          5m32s   10.24.35.20    gke-tm-workshop-7663aa83-1dgg       <none>
build-25-foo-pod-cf0dd5         0/1     Init:2/3    0          5m30s   10.24.30.52    gke-tm-workshop-7663aa83-5ql6       <none>
build-26-foo-pod-75176e         0/1     Completed   0          5m51s   10.24.32.44    gke-tm-default-pool-8ad990f1-vg8k   <none>
build-27-foo-pod-db279d         0/1     Init:2/3    0          5m12s   10.24.33.208   gke-tm-default-pool-8ad990f1-bqrc   <none>
build-28-foo-pod-232436         0/1     Init:2/3    0          5m19s   10.24.35.22    gke-tm-workshop-7663aa83-1dgg       <none>
build-29-foo-pod-580cb3         0/1     Init:2/3    0          5m34s   10.24.35.18    gke-tm-workshop-7663aa83-1dgg       <none>
build-3-foo-pod-ab5bc6          0/1     Init:2/3    0          5m57s   10.24.30.45    gke-tm-workshop-7663aa83-5ql6       <none>
build-30-foo-pod-aa3c82         0/1     Init:2/3    0          5m20s   10.24.30.54    gke-tm-workshop-7663aa83-5ql6       <none>
build-31-foo-pod-899451         0/1     Init:2/3    0          5m29s   10.24.32.47    gke-tm-default-pool-8ad990f1-vg8k   <none>
build-32-foo-pod-177227         0/1     Completed   0          5m38s   10.24.34.195   gke-tm-default-pool-8ad990f1-0vqd   <none>
build-33-foo-pod-961803         0/1     Init:2/3    0          5m9s    10.24.32.51    gke-tm-default-pool-8ad990f1-vg8k   <none>
build-34-foo-pod-53fe76         0/1     Completed   0          5m40s   10.24.35.15    gke-tm-workshop-7663aa83-1dgg       <none>
build-35-foo-pod-a35d2c         0/1     Completed   0          5m15s   10.24.34.197   gke-tm-default-pool-8ad990f1-0vqd   <none>
build-36-foo-pod-95e435         0/1     Init:2/3    0          5m21s   10.24.32.48    gke-tm-default-pool-8ad990f1-vg8k   <none>
build-37-foo-pod-b91a11         0/1     Init:2/3    0          5m33s   10.24.35.19    gke-tm-workshop-7663aa83-1dgg       <none>
build-38-foo-pod-ae429d         0/1     Completed   0          5m7s    10.24.34.199   gke-tm-default-pool-8ad990f1-0vqd   <none>
build-39-foo-pod-508a6b         0/1     Init:2/3    0          5m13s   10.24.32.49    gke-tm-default-pool-8ad990f1-vg8k   <none>
build-4-foo-pod-9e2bc4          0/1     Init:2/3    0          5m35s   10.24.35.17    gke-tm-workshop-7663aa83-1dgg       <none>
build-40-foo-pod-1ead60         0/1     Init:2/3    0          5m16s   10.24.33.207   gke-tm-default-pool-8ad990f1-bqrc   <none>
build-41-foo-pod-031e18         0/1     Init:2/3    0          5m1s    10.24.32.50    gke-tm-default-pool-8ad990f1-vg8k   <none>
build-42-foo-pod-1685c9         0/1     Init:2/3    0          5m5s    10.24.33.210   gke-tm-default-pool-8ad990f1-bqrc   <none>
build-43-foo-pod-5548f4         0/1     Init:2/3    0          5m24s   10.24.35.21    gke-tm-workshop-7663aa83-1dgg       <none>
build-44-foo-pod-baf4e7         0/1     Init:2/3    0          5m4s    10.24.35.23    gke-tm-workshop-7663aa83-1dgg       <none>
build-45-foo-pod-dd1322         0/1     Completed   0          5m10s   10.24.34.198   gke-tm-default-pool-8ad990f1-0vqd   <none>
build-46-foo-pod-49cd1b         0/1     Init:2/3    0          4m56s   10.24.33.212   gke-tm-default-pool-8ad990f1-bqrc   <none>
build-47-foo-pod-776c3a         0/1     Init:2/3    0          4m56s   10.24.30.57    gke-tm-workshop-7663aa83-5ql6       <none>
build-48-foo-pod-0c7438         0/1     Init:2/3    0          5m6s    10.24.33.209   gke-tm-default-pool-8ad990f1-bqrc   <none>
build-49-foo-pod-1ab9fa         0/1     Init:2/3    0          4m59s   10.24.33.211   gke-tm-default-pool-8ad990f1-bqrc   <none>
build-5-foo-pod-69bc56          0/1     Completed   0          5m43s   10.24.32.46    gke-tm-default-pool-8ad990f1-vg8k   <none>
build-50-foo-pod-c224d3         0/1     Init:2/3    0          5m2s    10.24.30.56    gke-tm-workshop-7663aa83-5ql6       <none>
build-6-foo-pod-8b55fe          0/1     Init:2/3    0          6m5s    10.24.30.39    gke-tm-workshop-7663aa83-5ql6       <none>
build-7-foo-pod-a16a31          0/1     Init:2/3    0          5m54s   10.24.30.47    gke-tm-workshop-7663aa83-5ql6       <none>
build-8-foo-pod-21dd71          0/1     Init:2/3    0          5m55s   10.24.30.46    gke-tm-workshop-7663aa83-5ql6       <none>
build-9-foo-pod-8cfa1d          0/1     Init:2/3    0          5m27s   10.24.33.206   gke-tm-default-pool-8ad990f1-bqrc   <none>
dsfsdf-llh52-85d7954675-dtt5l   1/1     Running     0          2d5h    10.24.34.130   gke-tm-default-pool-8ad990f1-0vqd   <none>

$ kubectl top nodes
NAME                                CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
gke-tm-default-pool-8ad990f1-0vqd   289m         7%     3731Mi          30%       
gke-tm-default-pool-8ad990f1-bqrc   1020m        26%    7316Mi          58%       
gke-tm-default-pool-8ad990f1-vg8k   3753m        95%    6179Mi          49%       
gke-tm-workshop-7663aa83-1dgg       1152m        29%    7137Mi          57%       
gke-tm-workshop-7663aa83-5ql6       2216m        56%    6270Mi          50%
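
To check whether the slow builds are concentrated on particular nodes, the `-o wide` listing above can be grouped by node. This is a minimal sketch, assuming the column layout shown in that listing (STATUS in column 3, NODE in column 7) and that build pod names start with `build-`; the function name `pods_per_node` is hypothetical.

```shell
# Count build pods per node from `kubectl get pods -o wide` output on stdin,
# and how many of them have not reached Completed.
# Assumes STATUS is column 3, NODE is column 7, and build pod names
# begin with "build-" (other pods in the namespace are ignored).
pods_per_node() {
  awk 'NR > 1 && NF >= 7 && $1 ~ /^build-/ {
         total[$7]++
         if ($3 != "Completed") pending[$7]++
       }
       END {
         for (n in total)
           printf "%s total=%d not_completed=%d\n", n, total[n], pending[n] + 0
       }' | sort
}

# Usage (namespace taken from the listing above):
#   kubectl -n sebgoa get pods -o wide | pods_per_node
```

Comparing the result against `kubectl top nodes` may help show whether the pods still in `Init:2/3` sit on the busiest nodes.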