jetstack / tarmak

A toolkit for Kubernetes cluster provisioning and lifecycle management
Apache License 2.0

Unable to connect to API Server on cluster startup #129

Closed dippynark closed 6 years ago

dippynark commented 6 years ago

/kind bug

What happened: I created a cluster following the documented sequence of steps and ran tarmak kubectl get nodes, but received the following output:

$ tarmak kubectl get nodes
DEBU[0000] trying to connect to https://127.0.0.1:58100  app=tarmak
WARN[0000] error connecting to cluster: Get https://127.0.0.1:58100/version: dial tcp 127.0.0.1:58100: getsockopt: connection refused  app=tarmak
DEBU[0000] start tunnel cmd=[ssh -F /Users/luke/.tarmak/dev-cluster/ssh_config -O forward -L127.0.0.1:58703:api.dev-cluster.tarmak.local:6443 bastion]  app=tarmak destination=api.dev-cluster.tarmak.local
WARN[0000] starting ssh tunnel failed with error: exit status 255  app=tarmak destination=api.dev-cluster.tarmak.local
DEBU[0000] error connecting to tunnel: dial tcp 127.0.0.1:58703: getsockopt: connection refused  app=tarmak destination=api.dev-cluster.tarmak.local
DEBU[0000] error connecting to tunnel: dial tcp 127.0.0.1:58703: getsockopt: connection refused  app=tarmak destination=api.dev-cluster.tarmak.local
DEBU[0001] error connecting to tunnel: dial tcp 127.0.0.1:58703: getsockopt: connection refused  app=tarmak destination=api.dev-cluster.tarmak.local
DEBU[0001] error connecting to tunnel: dial tcp 127.0.0.1:58703: getsockopt: connection refused  app=tarmak destination=api.dev-cluster.tarmak.local
DEBU[0002] error connecting to tunnel: dial tcp 127.0.0.1:58703: getsockopt: connection refused  app=tarmak destination=api.dev-cluster.tarmak.local
DEBU[0002] error connecting to tunnel: dial tcp 127.0.0.1:58703: getsockopt: connection refused  app=tarmak destination=api.dev-cluster.tarmak.local
DEBU[0003] error connecting to tunnel: dial tcp 127.0.0.1:58703: getsockopt: connection refused  app=tarmak destination=api.dev-cluster.tarmak.local
DEBU[0003] error connecting to tunnel: dial tcp 127.0.0.1:58703: getsockopt: connection refused  app=tarmak destination=api.dev-cluster.tarmak.local
DEBU[0004] error connecting to tunnel: dial tcp 127.0.0.1:58703: getsockopt: connection refused  app=tarmak destination=api.dev-cluster.tarmak.local
DEBU[0004] error connecting to tunnel: dial tcp 127.0.0.1:58703: getsockopt: connection refused  app=tarmak destination=api.dev-cluster.tarmak.local

What you expected to happen: To see the list of cluster nodes

How to reproduce it (as minimally and precisely as possible): Run:

tarmak init
tarmak clusters images build
tarmak clusters apply
tarmak kubectl get nodes

Anything else we need to know?: I am working from my own fork of the code (https://github.com/dippynark/tarmak/tree/67-add-pod-security-policies) compiled on OS X


simonswine commented 6 years ago

I think I found the bug. It seems like it's cleaning up the tunnel before executing kubectl:

https://github.com/jetstack/tarmak/blob/master/pkg/tarmak/kubectl/kubectl.go#L139

It was always doing that, but previously the cleanup of the tunnel was not actually working, so the problem stayed hidden.

I guess we need a cleanUp boolean param, or even a stopChannel, somewhere around here:

https://github.com/jetstack/tarmak/blob/master/pkg/tarmak/kubectl/kubectl.go#L194

The cleanup then only needs to happen after kubectl has been executed.
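
To illustrate the stop-channel idea, here is a minimal sketch, not tarmak's actual code. The names fakeTunnel, runWithTunnel and stopCh are hypothetical, and the tunnel is a stand-in that only records events instead of spawning ssh. It shows the intended ordering: the tunnel is started, the command runs, and teardown happens only once the stop channel is closed.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeTunnel is a hypothetical stand-in for the SSH tunnel.
// It records events instead of running ssh.
type fakeTunnel struct {
	mu     sync.Mutex
	events []string
}

func (t *fakeTunnel) record(e string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.events = append(t.events, e)
}

// Start opens the tunnel and returns a channel that is closed once the
// tunnel has been torn down. Teardown waits until stopCh is closed.
func (t *fakeTunnel) Start(stopCh <-chan struct{}) <-chan struct{} {
	t.record("tunnel started")
	done := make(chan struct{})
	go func() {
		defer close(done)
		<-stopCh // block until the caller signals that it is finished
		t.record("tunnel stopped")
	}()
	return done
}

// runWithTunnel keeps the tunnel open while cmd runs and only then
// signals cleanup, waiting for the teardown to complete.
func runWithTunnel(t *fakeTunnel, cmd func() error) error {
	stopCh := make(chan struct{})
	done := t.Start(stopCh)

	err := cmd() // e.g. executing kubectl against the forwarded port

	close(stopCh) // cleanup happens after the command, not before
	<-done
	return err
}

func main() {
	t := &fakeTunnel{}
	err := runWithTunnel(t, func() error {
		t.record("kubectl executed")
		return nil
	})
	fmt.Println(t.events, err)
}
```

A stop channel is preferable to a plain cleanUp boolean when the tunnel is managed by a goroutine, because the caller decides exactly when teardown happens and can wait for it to finish.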

simonswine commented 6 years ago

/assign @dippynark