istio / old_issues_repo

Deprecated issue-tracking repo, please post new issues or feature requests to istio/istio instead.

Istio on GKE installation instructions produce broken cluster #262

Open danderson opened 6 years ago

danderson commented 6 years ago

Is this a BUG or FEATURE REQUEST?:

Bug.

Did you review https://istio.io/help/ and existing issues to identify if this is already solved or being worked on?:

Yes, reviewed. No, doesn't help.

Bug: Y

What Version of Istio and Kubernetes are you using, where did you get Istio from, Installation details

Version: 0.6.0
GitRevision: 2cb09cdf040a8573330a127947b11e5082619895
User: root@a28f609ab931
Hub: docker.io/istio
GolangVersion: go1.9
BuildStatus: Clean
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:55:54Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9+", GitVersion:"v1.9.2-gke.1", GitCommit:"4ce7af72d8d343ea2f7680348852db641ff573af", GitTreeState:"clean", BuildDate:"2018-01-31T22:30:55Z", GoVersion:"go1.9.2b4", Compiler:"gc", Platform:"linux/amd64"}

Is Istio Auth enabled or not ?

Deployed a GKE cluster with Istio, via Deployment Manager, per https://istio.io/docs/setup/kubernetes/quick-start-gke-dm.html . All Deployment Manager tweakables and checkboxes were left at their default values; I just clicked straight through to "deploy".

What happened: After much waiting, the "config waiter" stage of the deployment times out and fails.

The GKE cluster is up and working, but the Istio sidecar injector is looping on trying (and failing) to start.

$ kubectl get po -n istio-system
NAME                                      READY     STATUS              RESTARTS   AGE
grafana-89f97d9c-6lkmp                    1/1       Running             0          9m
istio-ca-59f6dcb7d9-wwc2x                 1/1       Running             0          19m
istio-ingress-56dd45597b-6qpbz            1/1       Running             0          19m
istio-mixer-7f5dcf8db4-kzlpm              3/3       Running             0          19m
istio-pilot-7ddb95dc8f-lsr8b              2/2       Running             0          19m
istio-sidecar-injector-7947777478-kthf9   0/1       ContainerCreating   0          19m
prometheus-cf8456855-dt66q                1/1       Running             0          9m
servicegraph-59ff5dbbff-t7s5x             1/1       Running             0          9m
zipkin-7988c559b7-m82z8                   1/1       Running             0          9m

It would appear that there is a missing secret, and after 30+ minutes nothing seems to be interested in creating that secret:

$ kubectl describe -n istio-system po istio-sidecar-injector-7947777478-kthf9
Name:           istio-sidecar-injector-7947777478-kthf9
Namespace:      istio-system
Node:           gke-istio-cluster-default-pool-03b26a19-f9p9/10.128.0.6
Start Time:     Fri, 30 Mar 2018 14:11:57 -0700
Labels:         istio=sidecar-injector
                pod-template-hash=3503333034
Annotations:    <none>
Status:         Pending
IP:             
Controlled By:  ReplicaSet/istio-sidecar-injector-7947777478
Containers:
  webhook:
    Container ID:  
    Image:         docker.io/istio/sidecar_injector:0.6.0
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Args:
      --tlsCertFile=/etc/istio/certs/cert.pem
      --tlsKeyFile=/etc/istio/certs/key.pem
      --injectConfig=/etc/istio/inject/config
      --meshConfig=/etc/istio/config/mesh
      --healthCheckInterval=2s
      --healthCheckFile=/health
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Liveness:       exec [/usr/local/bin/sidecar-injector probe --probe-path=/health --interval=2s] delay=4s timeout=1s period=4s #success=1 #failure=3
    Readiness:      exec [/usr/local/bin/sidecar-injector probe --probe-path=/health --interval=2s] delay=4s timeout=1s period=4s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /etc/istio/certs from certs (ro)
      /etc/istio/config from config-volume (ro)
      /etc/istio/inject from inject-config (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from istio-sidecar-injector-service-account-token-5kxkh (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          False 
  PodScheduled   True 
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      istio
    Optional:  false
  certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  sidecar-injector-certs
    Optional:    false
  inject-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      istio-inject
    Optional:  false
  istio-sidecar-injector-service-account-token-5kxkh:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  istio-sidecar-injector-service-account-token-5kxkh
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                 Age                From                                                   Message
  ----     ------                 ----               ----                                                   -------
  Normal   Scheduled              20m                default-scheduler                                      Successfully assigned istio-sidecar-injector-7947777478-kthf9 to gke-istio-cluster-default-pool-03b26a19-f9p9
  Normal   SuccessfulMountVolume  20m                kubelet, gke-istio-cluster-default-pool-03b26a19-f9p9  MountVolume.SetUp succeeded for volume "inject-config"
  Normal   SuccessfulMountVolume  20m                kubelet, gke-istio-cluster-default-pool-03b26a19-f9p9  MountVolume.SetUp succeeded for volume "config-volume"
  Normal   SuccessfulMountVolume  20m                kubelet, gke-istio-cluster-default-pool-03b26a19-f9p9  MountVolume.SetUp succeeded for volume "istio-sidecar-injector-service-account-token-5kxkh"
  Warning  FailedMount            2m (x17 over 20m)  kubelet, gke-istio-cluster-default-pool-03b26a19-f9p9  MountVolume.SetUp failed for volume "certs" : secrets "sidecar-injector-certs" not found
  Warning  FailedMount            46s (x9 over 18m)  kubelet, gke-istio-cluster-default-pool-03b26a19-f9p9  Unable to mount volumes for pod "istio-sidecar-injector-7947777478-kthf9_istio-system(faf13902-345e-11e8-a19e-42010a8000c0)": timeout expired waiting for volumes to attach/mount for pod "istio-system"/"istio-sidecar-injector-7947777478-kthf9". list of unattached/unmounted volumes=[certs]
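
A quick way to confirm the diagnosis above is to ask the API server for the secret directly. This is only a minimal sketch: the namespace and secret name are the ones shown in the describe output, the helper name `secret_status` is made up for illustration, and a failing `kubectl get` can also mean the cluster is unreachable rather than that the secret is absent.

```shell
# Print "found" if the named secret exists in the namespace, "not found" otherwise.
# Note: any kubectl failure (including no cluster access) is reported as "not found".
secret_status() {
  ns="$1"
  name="$2"
  if kubectl get secret "$name" -n "$ns" >/dev/null 2>&1; then
    echo "found"
  else
    echo "not found"
  fi
}

# Example call (needs cluster access):
#   secret_status istio-system sidecar-injector-certs
```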

What you expected to happen:

Istio on GKE should install correctly when following official instructions.

How to reproduce it:

Follow installation instructions at https://istio.io/docs/setup/kubernetes/quick-start-gke-dm.html .

danderson commented 6 years ago

This smells a lot like istio/issues#261, in that the failure mode is identical: the secret is never created. I find this surprising, because it implies that Deployment Manager is using a v1.10 kubectl under the hood, which I did not believe to be true.

jsenon commented 6 years ago

If you are stuck, it works with kubectl version 1.9.
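
To check which client is in play before installing, something like the following could be used. It is a sketch that assumes the `version.Info{...}` output format pasted earlier in this issue (other kubectl releases may print the version differently), and it flags only the 1.10 client, since that is the only version this thread reports as affected. The function names are hypothetical.

```shell
# Read `kubectl version` output on stdin and print the client minor version,
# e.g. "10" for v1.10.0. Assumes the version.Info{...} format shown in this issue.
client_minor() {
  sed -n 's/^Client Version: .*Minor:"\([0-9][0-9]*\).*/\1/p'
}

# Succeed only when the client on stdin is a 1.10 client, the version this
# thread reports as affected.
is_reported_affected() {
  [ "$(client_minor)" = "10" ]
}

# Example call:
#   kubectl version --client | is_reported_affected && echo "kubectl 1.10 detected"
```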

linsun commented 6 years ago

Agreed, it does sound like istio/issues#261. cc @ayj for triage.

ayj commented 6 years ago

cc @selmanj - it looks like the DM config might need to be updated to account for a change in behavior in kubectl v1.10 (see https://github.com/kubernetes/kubectl/issues/384 and istio/issues#261).

selmanj commented 6 years ago

The DM template has an apt-get update && apt-get install -y git curl kubectl step at the beginning, so it's likely it's using the affected version of kubectl.

We could either force the install to use a previous version or patch ./install/kubernetes/webhook-patch-ca-bundle.sh. @ayj what do you think?

ayj commented 6 years ago

install/kubernetes/webhook-patch-ca-bundle.sh is going away in 0.8. @yusuoh replaced it with automatic cert provisioning using Istio CA.

If this needs to be patched for 0.7.0, pinning to a specific version might be easiest. Otherwise you'll need to add conditional checks in the scripts to optionally prepend version/kind info.
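
One possible shape for such a conditional check, as a minimal sketch: it assumes the problem is a manifest arriving on stdin without its top-level `apiVersion` and `kind` fields, and adds them only when they are absent. The function name `ensure_type_meta` and the `v1`/`Secret` arguments are hypothetical, and this is not the contents of the Istio install scripts.

```shell
# Read a YAML manifest on stdin and write it to stdout, prepending
# "apiVersion:" and "kind:" lines only if the manifest lacks them.
# Usage: ensure_type_meta <apiVersion> <kind>   (both values supplied by the caller)
ensure_type_meta() {
  api_version="$1"
  kind="$2"
  manifest="$(cat)"   # buffer stdin so it can be inspected and then re-emitted
  if ! printf '%s\n' "$manifest" | grep -q '^apiVersion:'; then
    printf 'apiVersion: %s\n' "$api_version"
  fi
  if ! printf '%s\n' "$manifest" | grep -q '^kind:'; then
    printf 'kind: %s\n' "$kind"
  fi
  printf '%s\n' "$manifest"
}
```

It would sit between whatever command generates the manifest and the `kubectl apply -f -` that consumes it, and it is a no-op for clients that already emit both fields.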

selmanj commented 6 years ago

I don't think the DM template is updated for 0.7.0. I'll do that in a separate PR.

Let me see about pinning to a previous kubectl to avoid this issue for now.

selmanj commented 6 years ago

I'm unable to reproduce the issue. As an additional item, kubectl installed on the instance is version 1.7.5, not 1.10, so it seems the issue must be something else.

jsselman@istio-cluster-2-istio-cluster-2-vm:~$ kubectl version
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.5", GitCommit:"17d7182a7ccbb167074be7a87f0a68bd00d58d97", GitTreeState:"clean", BuildDate:"2017-08-31T09:14:02Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
The connection to the server localhost:8080 was refused - did you specify the right host or port?
jsselman@istio-cluster-2-istio-cluster-2-vm:~$ apt-cache showpkg kubectl
Package: kubectl
Versions: 
1.7.5-00 (/var/lib/apt/lists/packages.cloud.google.com_apt_dists_cloud-sdk-jessie_main_binary-amd64_Packages) (/var/lib/dpkg/status)
 Description Language: 
                 File: /var/lib/apt/lists/packages.cloud.google.com_apt_dists_cloud-sdk-jessie_main_binary-amd64_Packages
                  MD5: fb58ab85a9089d0257cb8f7cda7d5a09

selmanj commented 6 years ago

I did notice a few issues with the script itself; for example it's using a version of GKE that is no longer supported, and the debian image used for the installer vm is outdated. Will send out PRs to fix those before continuing to investigate.

selmanj commented 6 years ago

Update; after resolving the issues on my local branch I was able to reproduce the issue; it DOES seem to be caused by the affected kubectl version (I'm not sure how I didn't run into it when I previously looked - maybe due to the older image?)

I'll update the script to use the previously-released kubectl and then send out a PR.
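
For reference, pinning with apt uses the `package=version` form. The sketch below only builds the command line for the install step quoted earlier in this thread. The helper name is hypothetical, and the version string is left as a parameter because the thread does not say which release the script ends up using.

```shell
# Print the apt-get command that installs git, curl, and a pinned kubectl.
# $1 is an apt package version string in the form apt-cache reports
# (for example "1.7.5-00" in the output above); it is a placeholder here.
pinned_install_cmd() {
  printf 'apt-get install -y git curl kubectl=%s\n' "$1"
}

# Example call:
#   pinned_install_cmd 1.7.5-00
# Versions available from the configured repositories can be listed with:
#   apt-cache madison kubectl
```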

renperez commented 6 years ago

I'm running kubectl 1.10.0 and I am seeing this error:

Unable to mount volumes for pod "istio-sidecar-injector-6ff9fb5698-82kpg_istio-system(58d89fa2-3929-11e8-a486-069e57407dac)": timeout expired waiting for volumes to attach/mount for pod "istio-system"/"istio-sidecar-injector-6ff9fb5698-82kpg". list of unattached/unmounted volumes=[certs]

MountVolume.SetUp failed for volume "certs" : secrets "sidecar-injector-certs" not found

renperez commented 6 years ago

==> v2beta1/HorizontalPodAutoscaler
NAME            REFERENCE                  TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
istio-ingress   Deployment/istio-ingress   / 80%     2         8         2          3h

==> v1beta1/Deployment
NAME                     DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
istio-ingress            2         2         2            2           3h
istio-mixer              1         1         1            1           3h
istio-pilot              1         1         1            1           3h
istio-ca                 1         1         1            1           3h
istio-sidecar-injector   1         1         1            0           3h

==> v1/Pod(related)
NAME                                      READY     STATUS              RESTARTS   AGE
istio-ingress-6d448d77f-ngcr4             1/1       Running             0          3h
istio-ingress-6d448d77f-rzdkk             1/1       Running             0          3h
istio-mixer-84bcf5f54-89rwr               3/3       Running             0          3h
istio-pilot-6fdbbb5456-7llw8              2/2       Running             0          3h
istio-ca-994b7849-cqj5j                   1/1       Running             0          3h
istio-sidecar-injector-6ff9fb5698-82kpg   0/1       ContainerCreating   0          7m

renperez commented 6 years ago

Any update on this? This is also broken with the Helm install.

selmanj commented 6 years ago

Once #4781 is merged in, the GKE template should work; I'll let someone else comment regarding the helm install.