derailed / k9s

🐶 Kubernetes CLI To Manage Your Clusters In Style!
https://k9scli.io
Apache License 2.0

Resolving host names through a VPN on OSX #780

Open olc opened 4 years ago

olc commented 4 years ago




**Describe the bug**
On OSX 10.15.5 (Catalina), I cannot reach my k8s servers through a VPN connection.

For some reason, k9s (v0.20.5) resolves host names using the DNS server declared in /etc/resolv.conf rather than asking the local caching resolver.

❯ cat /etc/resolv.conf
#
# macOS Notice
#
# This file is not consulted for DNS hostname resolution, address
# resolution, or the DNS query routing mechanism used by most
# processes on this system.
#
# To view the DNS configuration used by this system, use:
#   scutil --dns
#
# SEE ALSO
#   dns-sd(1), scutil(8)
#
# This file is automatically generated.
#
search home
nameserver 192.168.1.1
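
For reference, here is a minimal Go sketch that lists the nameserver entries from /etc/resolv.conf, i.e. the servers k9s appears to be querying here. It is a simplified illustration and not Go's actual parser; the function name `nameservers` is made up for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// nameservers returns the addresses found on "nameserver" lines of a
// resolv.conf-style document. Simplified on purpose: comments and all
// other directives are skipped, and no server limit is applied.
func nameservers(conf string) []string {
	var servers []string
	sc := bufio.NewScanner(strings.NewReader(conf))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) >= 2 && fields[0] == "nameserver" {
			servers = append(servers, fields[1])
		}
	}
	return servers
}

func main() {
	data, err := os.ReadFile("/etc/resolv.conf")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read /etc/resolv.conf:", err)
		os.Exit(1)
	}
	for _, s := range nameservers(string(data)) {
		fmt.Println("nameserver:", s)
	}
}
```

Comparing its output with `scutil --dns` shows whether the file and the system resolver configuration disagree.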

**To Reproduce**
Run `k9s -l debug` and look at the logs:

4:01PM INF 🐶 K9s starting up...
4:01PM DBG Active Context "rancher"
4:02PM DBG Unable to access servergroups &url.Error{Op:"Get", URL:"https://rancher.in.example.com/k8s/clusters/local/api?timeout=32s", Err:(*net.OpError)(0xc0003f4320)}
4:02PM ERR failed to connect to cluster error="Get \"https://rancher.in.example.com/k8s/clusters/local/api?timeout=32s\": dial tcp: lookup rancher.in.example.com on 192.168.1.1:53: server misbehaving"
4:02PM INF No context specific skin file found -- /Users/olecam/.k9s/rancher_skin.yml

In my context, rancher.in.example.com is only resolvable through the VPN connection (not public).

In other words:

; <<>> DiG 9.10.6 <<>> -4 rancher.in.example.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 49255
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1460
;; QUESTION SECTION:
;rancher.in.example.com. IN A

;; Query time: 5002 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)
;; WHEN: Tue Jun 23 16:40:06 CEST 2020
;; MSG SIZE rcvd: 56


The lookup fails, which is expected: 192.168.1.1 is my home internet box, and it cannot resolve a host name that only exists on the private DNS behind the VPN.
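
The error in the k9s log above names the server that was queried (192.168.1.1:53). In a Go program the same detail can be pulled out of the error chain with `errors.As`. The sketch below rebuilds a similar error by hand, so `sampleError` and `dnsServer` are illustrative names and not k9s code.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/url"
)

// dnsServer walks the error chain and, if it contains a *net.DNSError,
// returns the DNS server that produced the failure.
func dnsServer(err error) (string, bool) {
	var dnsErr *net.DNSError
	if errors.As(err, &dnsErr) {
		return dnsErr.Server, true
	}
	return "", false
}

// sampleError builds, by hand, an error shaped like the one in the log
// (url.Error -> net.OpError -> net.DNSError). Illustrative only.
func sampleError() error {
	return &url.Error{
		Op:  "Get",
		URL: "https://rancher.in.example.com/k8s/clusters/local/api?timeout=32s",
		Err: &net.OpError{
			Op:  "dial",
			Net: "tcp",
			Err: &net.DNSError{
				Err:    "server misbehaving",
				Name:   "rancher.in.example.com",
				Server: "192.168.1.1:53",
			},
		},
	}
}

func main() {
	err := sampleError()
	fmt.Println(err)
	if server, ok := dnsServer(err); ok {
		fmt.Println("DNS server that was queried:", server)
	}
}
```

If the reported server is the one from /etc/resolv.conf rather than the one pushed by the VPN, the lookup did not go through the system resolver.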

**Expected behavior**
`k9s` should query the local caching resolver.

**Versions (please complete the following information):**
 - OS: OSX 10.15.5
 - K9s 0.20.5

Thanks and keep up the good work!

hawkesn commented 4 years ago

I ran into the same issue. My kubectl was working fine but k9s was not. Thanks to your post, @olc, I was able to determine that my /etc/resolv.conf was messed up.

ggermis commented 4 years ago

I think this is a Go issue, not something specific to K9s. It is probably due to CGO_ENABLED being set to 0 when cross-compiling for macOS.

Reproducing with a small test program (main.go):

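package main

import (
    "fmt"
    "net"
    "os"
)

func main() {
    if len(os.Args) != 2 {
        fmt.Fprintln(os.Stderr, "usage: go run main.go <hostname>")
        os.Exit(2)
    }
    // Resolve the hostname with whichever resolver this build uses:
    // the pure-Go one (CGO_ENABLED=0) or the system one (CGO_ENABLED=1).
    ips, err := net.LookupIP(os.Args[1])
    if err != nil {
        panic(err)
    }
    fmt.Printf("%+v\n", ips)
}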

I activate my VPN and make sure it pushes a DNS server that is used to resolve hostnames for a specific domain:

$ scutil --dns
...
resolver #2
  domain   : example.domain
  nameserver[0] : 172.22.0.2
  flags    : Supplemental, Request A records
  reach    : 0x00000002 (Reachable)
  order    : 100406
...

I manually set my /etc/resolv.conf to use Google's DNS server, so that it can't resolve internal hostnames that are only available over the VPN:

nameserver 8.8.8.8

Now, when I resolve some.example.domain through dscacheutil, I get the internal IP addresses:

$ dscacheutil -q host -a name some.example.domain
name: some.example.domain
ip_address: 172.25.173.82
ip_address: 172.25.247.167
ip_address: 172.25.84.5

But with the go program, it depends on whether CGO is used or not:

If I run the Go program with CGO_ENABLED=0, Go uses its own resolver logic and thus does not use the OS-specific lookup mechanism:

$ CGO_ENABLED=0 go run main.go some.example.domain
panic: lookup some.example.domain on 8.8.8.8:53: no such host

With CGO_ENABLED=1, it returns the expected addresses:

$ CGO_ENABLED=1 go run main.go some.example.domain
[172.25.173.82 172.25.247.167 172.25.84.5]

I think this makes sense when cross-compiling, since the build host doesn't necessarily have the target OS's libraries to link against at build time.
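
A variation on the test program, as a sketch: `net.Resolver` has a `PreferGo` field, so both code paths can be compared from a single binary without rebuilding. The helper name `lookup` and the 5-second timeout are choices made for this example. Note that in a binary built with CGO_ENABLED=0, the `PreferGo=false` case may still end up on the pure-Go resolver, so the comparison is only meaningful in a cgo-enabled build.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"os"
	"time"
)

// lookup resolves host with either the pure-Go resolver (preferGo=true)
// or whichever resolver the platform and build select by default.
func lookup(preferGo bool, host string) ([]string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	r := &net.Resolver{PreferGo: preferGo}
	return r.LookupHost(ctx, host)
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: dnscompare <hostname>")
		os.Exit(2)
	}
	for _, preferGo := range []bool{true, false} {
		addrs, err := lookup(preferGo, os.Args[1])
		fmt.Printf("PreferGo=%v: addrs=%v err=%v\n", preferGo, addrs, err)
	}
}
```

If the two lines differ for a VPN-only hostname, the pure-Go resolver is the one missing the VPN's DNS configuration.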


saamalik commented 3 years ago

@ggermis I tried to compile K9s with CGO_ENABLED=1 on a macOS system, but the results are the same: DNS lookup still fails with K9s. Did you manually compile K9s with CGO_ENABLED, and did that work for you?

ggermis commented 3 years ago

@saamalik I just tried it, and rebuilding k9s with cgo seems to work. Instead of setting CGO_ENABLED=1, you need to change the Makefile so that the build uses -tags netcgo instead of -tags netgo. The build target in the Makefile should then look something like:

build:  ## Builds the CLI
    @go build \
    -ldflags "-w -s -X ${PACKAGE}/cmd.version=${VERSION} -X ${PACKAGE}/cmd.commit=${GIT} -X ${PACKAGE}/cmd.date=${DATE}" \
    -a -tags netcgo -o execs/${NAME} main.go

The important part is -tags netcgo.

Then simply build by running make build. This should produce an executable: execs/k9s
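
To double-check how a given binary was built, the build settings embedded by the Go toolchain can be read back with the standard `debug/buildinfo` package. This is a sketch under assumptions: settings are only recorded by Go 1.18 or newer, a setting that was not recorded comes back empty, and the helper name `inspect` is made up for this example.

```go
package main

import (
	"debug/buildinfo"
	"fmt"
	"os"
)

// inspect reads the build information embedded in a Go binary and returns
// the recorded CGO_ENABLED value and build tags. A setting that was not
// recorded is returned as an empty string.
func inspect(path string) (cgo, tags string, err error) {
	info, err := buildinfo.ReadFile(path)
	if err != nil {
		return "", "", err
	}
	for _, s := range info.Settings {
		switch s.Key {
		case "CGO_ENABLED":
			cgo = s.Value
		case "-tags":
			tags = s.Value
		}
	}
	return cgo, tags, nil
}

func main() {
	path := "execs/k9s" // output of the build target shown above
	if len(os.Args) > 1 {
		path = os.Args[1]
	}
	cgo, tags, err := inspect(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read build info:", err)
		os.Exit(1)
	}
	fmt.Printf("CGO_ENABLED=%q tags=%q\n", cgo, tags)
}
```

Pass the path of another binary (for example the brew-installed k9s) as the first argument to compare the two builds.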

Hope this helps

Just to be clear, this is not the setup I currently use. Our k8s clusters run on AWS EC2 instances. We use AWS private hosted zones for DNS resolution in our private VPCs. Whenever we connect to our VPN server (also running in that same VPC), it pushes a default DNS server.

e.g.:

VPC CIDR: 172.22.0.0/16
default DNS server pushed: 172.22.0.2

So all our DNS requests go over the VPN while we are connected to it. Since the VPN server sits inside the VPC CIDR and performs NAT, AWS returns internal addresses for DNS requests.
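
For a Go program you control, a pure-Go build can also be pointed at a specific DNS server by giving `net.Resolver` a custom `Dial` function. This is only a sketch of that general technique, not something k9s does; the helper names `resolverFor` and `lookupVia` are made up, and 172.22.0.2 is the example server from above.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"os"
	"time"
)

// resolverFor returns a pure-Go resolver that sends every DNS query to
// server (host:port) instead of the servers listed in /etc/resolv.conf.
func resolverFor(server string) *net.Resolver {
	return &net.Resolver{
		PreferGo: true,
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			d := net.Dialer{Timeout: 3 * time.Second}
			return d.DialContext(ctx, network, server)
		},
	}
}

// lookupVia resolves host through the given DNS server.
func lookupVia(server, host string) ([]string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	return resolverFor(server).LookupHost(ctx, host)
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: lookupvia <hostname>")
		os.Exit(2)
	}
	// 172.22.0.2 is the DNS server pushed by the VPN in the example above.
	addrs, err := lookupVia("172.22.0.2:53", os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "lookup failed:", err)
		os.Exit(1)
	}
	fmt.Println(addrs)
}
```

This hard-codes one server, so it is mainly useful for confirming that the VPN's DNS answers as expected.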

saamalik commented 3 years ago

@ggermis that worked like a charm! Thanks!!

farmerau commented 2 years ago

Seemingly, kubectl is plagued by the same problem with Go, per: https://github.com/kubernetes/kubernetes/issues/23130.

This several-year-old workaround still works today (I just tried it this morning), but feels a bit wrong and probably side-effect heavy: https://github.com/kubernetes/kubernetes/issues/23130#issuecomment-328312222

Of note, it seems that the homebrew build was modified such that CGO_ENABLED=1. ( https://github.com/kubernetes/kubernetes/issues/23130#issuecomment-572292652 ) Is this something we could work to support for brew users?

dee-kryvenko commented 2 years ago

CGO_ENABLED=1 is evil, but unfortunately without it k9s is simply not gonna work with any private k8s cluster that is only available behind a corporate VPN. So that's, like, what, 99% of all clusters in the world?

On a side note, trying to install it from sources with Go 1.18:

> go install github.com/derailed/k9s@v0.25.18
go: downloading github.com/derailed/k9s v0.25.18
go: github.com/derailed/k9s@v0.25.18 (in github.com/derailed/k9s@v0.25.18):
    The go.mod file for the module providing named packages contains one or
    more replace directives. It must not contain directives that would cause
    it to be interpreted differently than if it were the main module.

Too many obstacles, I was trying to reduce the complexity coming from the bloated Lens, not to increase it...

gubtos commented 1 year ago

Changing the build tags from -tags netgo to -tags netcgo worked for me on Linux. Thanks @ggermis

derailed commented 1 year ago

@ggermis Thank you! Brilliant!!

So it all depends on how k9s is installed on your system. The standard k9s brew formula installs k9s on OSX with cgo disabled. To enable cgo and get a native build, you'll need to either run go install github.com/derailed/k9s@latest or run make build from source with @ggermis's update (Thank you!).

BTW there is another issue, #1895, where cgo needs to be disabled of course ;(

BenTheElder commented 2 weeks ago

no CGO necessary anymore: https://go-review.git.corp.google.com/c/go/+/446178

farmerau commented 2 weeks ago

> no CGO necessary anymore: https://go-review.git.corp.google.com/c/go/+/446178

This looks like a private repo?

BenTheElder commented 2 weeks ago

Sorry, fixed link is: https://golang.org/cl/446178