kubernetes / minikube

Run Kubernetes locally
https://minikube.sigs.k8s.io/
Apache License 2.0

Not enough debug output in network.go when getting an ip via kvm driver #6850

Closed: Loki-Afro closed this issue 3 years ago

Loki-Afro commented 4 years ago

The exact command to reproduce the issue: minikube status --alsologtostderr -v=7

The full output of the command that failed:

zarathustra@anubis ~ % minikube status --alsologtostderr -v=9                                                                                                                                                                           
I0301 16:51:04.649523   48196 main.go:110] libmachine: Found binary path at /home/zarathustra/.minikube/bin/docker-machine-driver-kvm2
I0301 16:51:04.649620   48196 main.go:110] libmachine: Launching plugin server for driver kvm2
I0301 16:51:04.666166   48196 main.go:110] libmachine: Plugin server listening at address 127.0.0.1:43949
I0301 16:51:04.666690   48196 main.go:110] libmachine: () Calling .GetVersion
I0301 16:51:04.667288   48196 main.go:110] libmachine: Using API Version  1
I0301 16:51:04.667343   48196 main.go:110] libmachine: () Calling .SetConfigRaw
I0301 16:51:04.667674   48196 main.go:110] libmachine: () Calling .GetMachineName
I0301 16:51:04.667832   48196 main.go:110] libmachine: (minikube) Calling .GetState
I0301 16:51:04.684115   48196 status.go:141] minikube host status = "Running" (err=<nil>)
I0301 16:51:04.684150   48196 status.go:65] Checking if "minikube" exists ...
I0301 16:51:04.684601   48196 main.go:110] libmachine: Found binary path at /home/zarathustra/.minikube/bin/docker-machine-driver-kvm2
I0301 16:51:04.684647   48196 main.go:110] libmachine: Launching plugin server for driver kvm2
I0301 16:51:04.700902   48196 main.go:110] libmachine: Plugin server listening at address 127.0.0.1:44179
I0301 16:51:04.701480   48196 main.go:110] libmachine: () Calling .GetVersion
I0301 16:51:04.701970   48196 main.go:110] libmachine: Using API Version  1
I0301 16:51:04.701991   48196 main.go:110] libmachine: () Calling .SetConfigRaw
I0301 16:51:04.702326   48196 main.go:110] libmachine: () Calling .GetMachineName
I0301 16:51:04.702478   48196 main.go:110] libmachine: (minikube) Calling .GetIP
E0301 16:51:04.729496   48196 status.go:164] Error host driver ip status: parsing IP: 
E0301 16:51:04.729786   48196 status.go:95] status error: parsing IP: 
host: Running
kubelet: Nonexistent
apiserver: Error
kubeconfig: Nonexistent

The output of the minikube logs command:

zarathustra@anubis ~ % minikube logs

💣  command runner
❌  Error: [SSH_AUTH_FAILURE] getting ssh client for bootstrapper: Error dialing tcp via ssh client: ssh: handshake failed: ssh: unable to authenticate, attempted methods [publickey none], no supported methods remain
💡  Suggestion: Your host is failing to route packets to the minikube VM. If you have VPN software, try turning it off or configuring it so that it does not re-route traffic to the VM IP. If not, check your VM environment routing options.
📘  Documentation: https://minikube.sigs.k8s.io/docs/reference/networking/vpn/
⁉️   Related issues:
    ▪ https://github.com/kubernetes/minikube/issues/3930

The operating system version: Fedora 31

Installing minikube on a fresh Fedora 31 via the kvm2 driver leads to minikube being unable to determine an IP for the running VM. As you can see from minikube logs, it says it can't connect via SSH .. well .. if there is no IP, how could it ;)

I looked into the code and found a few interesting bits:

    // Earlier versions of libvirt use a lease file instead of a status file
    if libVersion < 1002006 {
        return d.lookupIPFromLeasesFile()
    }

    // TODO: for everything > 1002006, there is direct support in the libvirt-go for handling this
    return d.lookupIPFromStatusFile(conn)

I guess since this is a fresh Fedora installation, which is usually pretty much bleeding edge, the latter code path gets called. lookupIPFromStatusFile then calls parseStatusAndReturnIP, which I guess is where my problem lies.

Lines 321 AND 335 in network.go do NOT return any error but return an empty IP address, which fails here: https://github.com/kubernetes/minikube/blob/4f1a16385b5b13df3101626fc21d28809913f95f/pkg/drivers/kvm/kvm.go#L202. Well, "fails" ... the caller does check for the error but not for the IP.

So my request is to add some more logging around these two lines so that such issues can be debugged further.
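
For illustration, a minimal sketch of the kind of logging I have in mind; the names here (statusEntry, lookupIP) are hypothetical and not minikube's actual code:

    package kvm

    import (
        "encoding/json"
        "fmt"
        "log"
    )

    // statusEntry mirrors one DHCP lease entry in libvirt's dnsmasq status file.
    type statusEntry struct {
        IPAddress  string `json:"ip-address"`
        MacAddress string `json:"mac-address"`
    }

    // lookupIP returns the IP leased to the given MAC address, logging why
    // the lookup came up empty instead of silently returning "".
    func lookupIP(statusJSON []byte, mac string) (string, error) {
        var entries []statusEntry
        if err := json.Unmarshal(statusJSON, &entries); err != nil {
            return "", fmt.Errorf("parsing status file: %w", err)
        }
        if len(entries) == 0 {
            log.Printf("status file contains no DHCP leases yet: %q", statusJSON)
            return "", nil
        }
        for _, e := range entries {
            if e.MacAddress == mac {
                return e.IPAddress, nil
            }
        }
        log.Printf("no lease matches MAC %s among %d entries", mac, len(entries))
        return "", nil
    }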

Is there a way to force minikube to use a specific IP (to skip that weird discovery process), just as a workaround?

afbjorklund commented 4 years ago

Each VM has two network interfaces, one for the SSH control traffic and one for Kubernetes. So it is trying to determine the IP of the “public” (but still host-only) interface.

What does your /var/lib/libvirt/dnsmasq/virbr?.status look like? Normally it would look something like this, which is what minikube is trying to parse in this case...

[
    {
        "ip-address": "192.168.39.191",
        "mac-address": "dc:2a:e0:a1:db:8f",
        "hostname": "minikube",
        "client-id": "01:dc:2a:e0:a1:db:8f",
        "expiry-time": 1583092337
    }
]
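
For reference, if jq happens to be installed, you can pull the address for a given hostname straight out of such a file; this is just an illustration, not what minikube itself does:

sudo jq -r '.[] | select(.hostname == "minikube") | ."ip-address"' /var/lib/libvirt/dnsmasq/virbr?.status
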
afbjorklund commented 4 years ago

The error reporting could be improved, but it seems something odd is going on with the new libvirt/dnsmasq.

You probably want to check the systemd logs of both services, to see if there were any dnsmasq issues?

Machines that are not running have an empty virbr status:

[

]

Maybe that is what is happening here? (It would parse as "".)

Loki-Afro commented 4 years ago

/var/lib/libvirt/dnsmasq/virbr0.status

[
  {
    "ip-address": "192.168.122.86",
    "mac-address": "ec:ac:5b:47:35:d5",
    "hostname": "minikube",
    "client-id": "01:ec:ac:5b:47:35:d5",
    "expiry-time": 1583101280
  }
]

/var/lib/libvirt/dnsmasq/virbr1.status

[
  {
    "ip-address": "192.168.39.92",
    "mac-address": "b0:ff:7d:f4:d2:97",
    "hostname": "minikube",
    "client-id": "01:b0:ff:7d:f4:d2:97",
    "expiry-time": 1583101280
  }
]

Also, virsh reports that the VM is running:

virsh list --all                                                   
 Id   Name       State
--------------------------
 2    minikube   running
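
As a cross-check, virsh can also report the addresses libvirt believes the domain has (assuming your libvirt is new enough to support domifaddr):

virsh domifaddr minikube
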

I just noticed this, but dunno if it's related:

root@anubis ~ # journalctl -fu libvirtd                                                                                                                                                                                                
-- Logs begin at Thu 2020-02-06 21:30:30 CET. --
Mar 01 22:26:26 anubis libvirtd[1198]: End of file while reading data: Input/output error
Mar 01 22:26:26 anubis libvirtd[1198]: End of file while reading data: Input/output error
Mar 01 22:26:26 anubis libvirtd[1198]: End of file while reading data: Input/output error
Mar 01 22:26:26 anubis libvirtd[1198]: End of file while reading data: Input/output error

I don't know what is causing this or if it is even related.

It seems like each call to minikube ip produces a new "End of file while reading data: Input/output error" entry.

afbjorklund commented 4 years ago

Those status files look perfectly normal; hard to tell why the parsing fails (as you have reported).

Another possible cause would be the MAC addresses not matching, but there is no real reason why they should not...
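
One quick way to check that, assuming virsh is available: compare the MACs attached to the domain with the ones recorded in the status files.

virsh domiflist minikube
grep mac-address /var/lib/libvirt/dnsmasq/virbr*.status
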

tstromberg commented 4 years ago

PRs welcome to add the debugging you need. Send 'em on over!

fejta-bot commented 4 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

fejta-bot commented 3 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten

fejta-bot commented 3 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close

k8s-ci-robot commented 3 years ago

@fejta-bot: Closing this issue.

In response to [this](https://github.com/kubernetes/minikube/issues/6850#issuecomment-691592920):

> Rotten issues close after 30d of inactivity.
> Reopen the issue with `/reopen`.
> Mark the issue as fresh with `/remove-lifecycle rotten`.
>
> Send feedback to sig-testing, kubernetes/test-infra and/or [fejta](https://github.com/fejta).
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.