kubernetes / minikube

Run Kubernetes locally
https://minikube.sigs.k8s.io/
Apache License 2.0

long-running tunnel breaks cluster connectivity: ssh: handshake failed: connection reset by peer #4240

Open ghost opened 5 years ago

ghost commented 5 years ago

The exact command to reproduce the issue: minikube ip

But the truth is, I had kept minikube tunnel running for a while. I can access everything using kubectl, but any other command that communicates with the cluster in any way just hangs.

I have done nothing special: just minikube start, and in another window minikube tunnel.
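In other words, the sequence that leads to the hang is roughly:

# terminal 1
minikube start

# terminal 2
minikube tunnel

# some time later, back in terminal 1, this hangs:
minikube ip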

I also tried minikube tunnel --cleanup; it did not fail, but it did nothing either.

Then I tried minikube ip -v=7:

Using SSH client type: native &{{{ 0 [] [] []} docker [0x83b580] 0x83b550 [] 0s} 127.0.0.1 37871 }
About to run SSH command: ip addr show

and after a while: Error dialing TCP: ssh: handshake failed: read tcp 127.0.0.1:35568->127.0.0.1:37871: read: connection reset by peer

The full output of the command that failed: it just hangs, with no output.

The output of the minikube logs command: That just hangs too.

The operating system version: Kubuntu 17.10

tstromberg commented 5 years ago

Sorry to hear that this is happening. Do you mind sharing the output of:

minikube status
ip r s
sudo iptables -S
minikube tunnel --cleanup --alsologtostderr -v=8

Also, can you share the minikube start command-line used, as well as the output it showed? I'm curious if this is kvm2 or virtualbox.

ghost commented 5 years ago

I was using minikube with the none driver, directly on my Linux distro (Kubuntu 17.10). In the meantime my problem was solved. I'm not sure exactly what fixed it, but using the extra config for resolv (to make sure it points to accessible DNS resolvers), in combination with clearing iptables and restarting docker/kubelet, seems to have fixed my issues.
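Roughly, the combination that got things working again looked like this (the resolv.conf path is just what my system uses, and I'm not sure which of these steps actually mattered):

# flush iptables rules (filter and nat tables)
sudo iptables -F && sudo iptables -t nat -F
# restart the container runtime and the kubelet
sudo systemctl restart docker kubelet
# start again, pointing the kubelet at a resolv.conf with reachable DNS resolvers
sudo minikube start --vm-driver=none --extra-config=kubelet.resolv-conf=/run/systemd/resolve/resolv.conf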

tstromberg commented 5 years ago

Closing as unreproducible. Please re-open if you see this problem again. It's very mysterious.

Eelis commented 4 years ago

I have the same issue.

Output of minikube status:

💣  Error getting bootstrapper: getting kubeadm bootstrapper: command runner: getting ssh client for bootstrapper: Error dialing tcp via ssh client: ssh: handshake failed: read tcp 127.0.0.1:60878->127.0.0.1:45871: read: connection reset by peer

Output of ip r s:

default via 192.168.0.1 dev enp3s0 proto dhcp metric 100 
169.254.0.0/16 dev virbr0 scope link metric 1000 linkdown 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown 
192.168.0.0/24 dev enp3s0 proto kernel scope link src 192.168.0.20 metric 100 
192.168.99.0/24 dev vboxnet0 proto kernel scope link src 192.168.99.1 
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown 

Output of sudo iptables -S:

-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
-N DOCKER
-N DOCKER-ISOLATION-STAGE-1
-N DOCKER-ISOLATION-STAGE-2
-N DOCKER-USER
-A INPUT -i virbr0 -p udp -m udp --dport 53 -j ACCEPT
-A INPUT -i virbr0 -p tcp -m tcp --dport 53 -j ACCEPT
-A INPUT -i virbr0 -p udp -m udp --dport 67 -j ACCEPT
-A INPUT -i virbr0 -p tcp -m tcp --dport 67 -j ACCEPT
-A FORWARD -j DOCKER-USER
-A FORWARD -j DOCKER-ISOLATION-STAGE-1
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A FORWARD -d 192.168.122.0/24 -o virbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -s 192.168.122.0/24 -i virbr0 -j ACCEPT
-A FORWARD -i virbr0 -o virbr0 -j ACCEPT
-A FORWARD -o virbr0 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -i virbr0 -j REJECT --reject-with icmp-port-unreachable
-A OUTPUT -o virbr0 -p udp -m udp --dport 68 -j ACCEPT
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
-A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -j RETURN
-A DOCKER-USER -j RETURN

Output of minikube tunnel --cleanup --alsologtostderr -v=8:

I0806 20:31:11.908621   12963 tunnel.go:48] Checking for tunnels to cleanup...

minikube start command:

minikube start --memory 10000 --disk-size 50g --dns-domain myapp-kube --extra-config=kubelet.cluster-domain=myapp-kube

Eelis commented 4 years ago

I noticed that when the problem occurs, I can fix everything by running:

vboxmanage controlvm minikube setlinkstate1 off
vboxmanage controlvm minikube setlinkstate1 on

tstromberg commented 4 years ago

This is still an issue in v1.4 as far as I know.

pollend commented 4 years ago

I'm still having this problem and I'm running v1.5.1.

olivierlemasle commented 4 years ago

Could this be linked to #4151?

medyagh commented 4 years ago

I believe this is still an issue in v1.6.1, since we didn't add any new code for tunnel.

fejta-bot commented 4 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

medyagh commented 4 years ago

I wonder if this issue exists with the docker driver too? Has anyone tried it with the docker driver?

The SSH connection keep-alive could still be an issue.
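For context, with a plain OpenSSH client you would keep a session alive with something like the following; minikube's libmachine uses its own native Go SSH client, so this is only to illustrate the keep-alive idea, and the user and key path are just the usual defaults for VM drivers:

ssh -o ServerAliveInterval=30 -o ServerAliveCountMax=4 \
    -i ~/.minikube/machines/minikube/id_rsa docker@$(minikube ip)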

fejta-bot commented 4 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten

sharifelgamal commented 3 years ago

This remains an issue I believe. /lifecycle frozen

pluveto commented 1 year ago

I ran into this when installing minikube.

log:

sudo hostname minikube && echo "minikube" | sudo tee /etc/hostname
I0316 15:29:18.568636   24627 main.go:141] libmachine: Error dialing TCP: ssh: handshake failed: read tcp 127.0.0.1:50064->127.0.0.1:32797: read: connection reset by peer
I0316 15:31:31.688700   24627 main.go:141] libmachine: Error dialing TCP: ssh: handshake failed: read tcp 127.0.0.1:60178->127.0.0.1:32797: read: connection reset by peer
I0316 15:33:44.808708   24627 main.go:141] libmachine: Error dialing TCP: ssh: handshake failed: read tcp 127.0.0.1:42278->127.0.0.1:32797: read: connection reset by peer
root@ecs-ebe5:~# ip r s

default via 192.168.0.1 dev eth0 proto dhcp metric 100
169.254.169.254 via 192.168.0.1 dev eth0 proto dhcp metric 100
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.0.0/24 dev eth0 proto kernel scope link src 192.168.0.245 metric 100
192.168.49.0/24 dev br-d6ae2df29860 proto kernel scope link src 192.168.49.1
root@ecs-ebe5:~# sudo iptables -S

-P INPUT ACCEPT
-P FORWARD DROP
-P OUTPUT ACCEPT
-N DOCKER
-N DOCKER-ISOLATION-STAGE-1
-N DOCKER-ISOLATION-STAGE-2
-N DOCKER-USER
-A FORWARD -j DOCKER-USER
-A FORWARD -j DOCKER-ISOLATION-STAGE-1
-A FORWARD -o br-d6ae2df29860 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o br-d6ae2df29860 -j DOCKER
-A FORWARD -i br-d6ae2df29860 ! -o br-d6ae2df29860 -j ACCEPT
-A FORWARD -i br-d6ae2df29860 -o br-d6ae2df29860 -j ACCEPT
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A DOCKER -d 192.168.49.2/32 ! -i br-d6ae2df29860 -o br-d6ae2df29860 -p tcp -m tcp --dport 32443 -j ACCEPT
-A DOCKER -d 192.168.49.2/32 ! -i br-d6ae2df29860 -o br-d6ae2df29860 -p tcp -m tcp --dport 8443 -j ACCEPT
-A DOCKER -d 192.168.49.2/32 ! -i br-d6ae2df29860 -o br-d6ae2df29860 -p tcp -m tcp --dport 5000 -j ACCEPT
-A DOCKER -d 192.168.49.2/32 ! -i br-d6ae2df29860 -o br-d6ae2df29860 -p tcp -m tcp --dport 2376 -j ACCEPT
-A DOCKER -d 192.168.49.2/32 ! -i br-d6ae2df29860 -o br-d6ae2df29860 -p tcp -m tcp --dport 22 -j ACCEPT
-A DOCKER-ISOLATION-STAGE-1 -i br-d6ae2df29860 ! -o br-d6ae2df29860 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
-A DOCKER-ISOLATION-STAGE-2 -o br-d6ae2df29860 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -j RETURN
-A DOCKER-USER -j RETURN
root@ecs-ebe5:~# minikube tunnel --cleanup --alsologtostderr -v=8

I0316 15:33:49.296913   25030 out.go:296] Setting OutFile to fd 1 ...
I0316 15:33:49.297038   25030 out.go:348] isatty.IsTerminal(1) = true
I0316 15:33:49.297053   25030 out.go:309] Setting ErrFile to fd 2...
I0316 15:33:49.297060   25030 out.go:348] isatty.IsTerminal(2) = true
I0316 15:33:49.297166   25030 root.go:334] Updating PATH: /root/.minikube/bin
I0316 15:33:49.297367   25030 mustload.go:65] Loading cluster: minikube
I0316 15:33:49.297671   25030 config.go:180] Loaded profile config "minikube": Driver=docker, ContainerRuntime=docker, KubernetesVersion=v1.26.1
I0316 15:33:49.298015   25030 cli_runner.go:164] Run: docker container inspect minikube --format={{.State.Status}}
I0316 15:33:49.344649   25030 host.go:66] Checking if "minikube" exists ...
I0316 15:33:49.344859   25030 cli_runner.go:164] Run: docker system info --format "{{json .}}"
I0316 15:33:49.428177   25030 info.go:266] docker info: {ID:6d8873e2-970b-43de-a349-142d5b2d9518 Containers:1 ContainersRunning:1 ContainersPaused:0 ContainersStopped:0 Images:1 Driver:overlay2 DriverStatus:[[Backing Filesystem extfs] [Supports d_type true] [Using metacopy false] [Native Overlay Diff true] [userxattr false]] SystemStatus:<nil> Plugins:{Volume:[local] Network:[bridge host ipvlan macvlan null overlay] Authorization:<nil> Log:[awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog]} MemoryLimit:true SwapLimit:true KernelMemory:false KernelMemoryTCP:false CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true PidsLimit:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6Tables:true Debug:false NFd:31 OomKillDisable:false NGoroutines:37 SystemTime:2023-03-16 15:33:49.422163619 +0800 CST LoggingDriver:json-file CgroupDriver:systemd NEventsListener:0 KernelVersion:5.15.0-67-generic OperatingSystem:Ubuntu 22.04.2 LTS OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:{AllowNondistributableArtifactsCIDRs:[] AllowNondistributableArtifactsHostnames:[] InsecureRegistryCIDRs:[127.0.0.0/8] IndexConfigs:{DockerIo:{Name:docker.io Mirrors:[] Secure:true Official:true}} Mirrors:[]} NCPU:8 MemTotal:16576102400 GenericResources:<nil> DockerRootDir:/var/lib/docker HTTPProxy: HTTPSProxy: NoProxy: Name:ecs-ebe5 Labels:[] ExperimentalBuild:false ServerVersion:23.0.1 ClusterStore: ClusterAdvertise: Runtimes:{Runc:{Path:runc}} DefaultRuntime:runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:<nil>} LiveRestoreEnabled:false Isolation: InitBinary:docker-init ContainerdCommit:{ID:2456e983eb9e37e47538f59ea18f2043c9a73640 Expected:2456e983eb9e37e47538f59ea18f2043c9a73640} RuncCommit:{ID:v1.1.4-0-g5fd4c4d Expected:v1.1.4-0-g5fd4c4d} InitCommit:{ID:de40ad0 Expected:de40ad0} SecurityOptions:[name=apparmor name=seccomp,profile=builtin name=cgroupns] ProductLicense: Warnings:<nil> ServerErrors:[] ClientInfo:{Debug:false Plugins:[map[Name:buildx Path:/usr/libexec/docker/cli-plugins/docker-buildx SchemaVersion:0.1.0 ShortDescription:Docker Buildx Vendor:Docker Inc. Version:v0.10.2] map[Name:compose Path:/usr/libexec/docker/cli-plugins/docker-compose SchemaVersion:0.1.0 ShortDescription:Docker Compose Vendor:Docker Inc. Version:v2.16.0] map[Name:scan Path:/usr/libexec/docker/cli-plugins/docker-scan SchemaVersion:0.1.0 ShortDescription:Docker Scan Vendor:Docker Inc. Version:v0.23.0]] Warnings:<nil>}}
I0316 15:33:49.430640   25030 out.go:177]

W0316 15:33:49.431870   25030 out.go:239] ❌  Exiting due to DRV_CP_ENDPOINT: failed to lookup ip for ""
W0316 15:33:49.431965   25030 out.go:239] 💡  Suggestion:

    Recreate the cluster by running:
    minikube delete
    minikube start
W0316 15:33:49.431976   25030 out.go:239]

W0316 15:33:49.433123   25030 out.go:239] ╭───────────────────────────────────────────────────────────────────────────────────────────╮
│                                                                                           │
│    😿  If the above advice does not help, please let us know:                             │
│    👉  https://github.com/kubernetes/minikube/issues/new/choose                           │
│                                                                                           │
│    Please run `minikube logs --file=logs.txt` and attach logs.txt to the GitHub issue.    │
│    Please also attach the following file to the GitHub issue:                             │
│    - /tmp/minikube_tunnel_9355d93b403d830041bf7d26a54ff1e776fdc191_0.log                  │
│                                                                                           │
╰───────────────────────────────────────────────────────────────────────────────────────────╯
I0316 15:33:49.434453   25030 out.go:177]