helm / helm

The Kubernetes Package Manager
https://helm.sh
Apache License 2.0
helm failed to resolve on osx when cgo_enabled false #10874

Closed fragpit closed 2 years ago

fragpit commented 2 years ago

Hi guys, after https://github.com/helm/helm/commit/3490f1e7b6d76709b7ea195370f7db463735f9e2#diff-76ed074a9305c04054cdebb9e9aad2d818052b07091de1f20cad0bbac34ffb52R80 helm stopped honoring the macOS /etc/resolver/* files. Those files were very useful for resolving DNS over a corporate VPN.

https://github.com/golang/go/issues/12524

> helm list --all
Error: Kubernetes cluster unreachable: Get "https://<k8s_server>:8443/k8s/clusters/c-pkq2n/version": dial tcp: lookup <k8s_server> on <external_dns_instead_of_corporate>:53: no such host

Output of helm version: version.BuildInfo{Version:"v3.8.2", GitCommit:"6e3701edea09e5d55a8ca2aae03a68917630e91b", GitTreeState:"clean", GoVersion:"go1.18.1"}

Output of kubectl version:

Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5", GitCommit:"c285e781331a3785a7f436042c65c5641ce8a9e9", GitTreeState:"clean", BuildDate:"2022-03-16T15:51:05Z", GoVersion:"go1.17.8", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.8", GitCommit:"5575935422cc1cf5169dfc8847cb587aa47bac5a", GitTreeState:"clean", BuildDate:"2021-06-16T12:53:07Z", GoVersion:"go1.15.13", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.23) and server (1.20) exceeds the supported minor version skew of +/-1

Also, I've found that the macOS binaries on https://github.com/helm/helm/releases are built with the build-cross make target (as far as I can guess), which differs from the helm install binary.

rudesome commented 2 years ago

Same issue in my Linux environment. On my Windows host with helm (3.8.2) it is working fine...

I reverted back to version 3.8.1 and am able to connect to my private aks cluster again.

cpboyd commented 2 years ago

I already +1'd the original post, but @joejulian I think this is a bug/regression, not simply a question.

While it may be related to #5800, everything works as expected with Helm v3.8.1 on my M1 MacBook. I downgraded to the 3.8.1 Homebrew recipe via:

brew remove helm
brew tap-new $USER/local-helm
brew extract --version 3.8.1 helm $USER/local-helm
brew install helm@3.8.1

The 3.8.2 update broke my corporate VPN DNS resolution. The URLs simply resolve to the wrong IP addresses. If I replace the FQDN with the correct IP address, helm connects, but the SSL certificate is issued for the domain name and not the IP, so that still fails.

hickeyma commented 2 years ago

@mattfarina Do you mind taking a look?

rudesome commented 2 years ago

It seems Helm only queries the nameservers defined in your (local) network settings instead of respecting the VPN connection's DNS settings. I am using Twingate, by the way; I'm not sure whether it also happens with other VPN solutions.

joejulian commented 2 years ago

This seems to be because the build process switched to building static binaries. Static builds use Go's built-in DNS resolver, which does not account for any OS-specific configuration.
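To make the resolver difference concrete, here is a minimal sketch (not Helm code) that looks up one host twice: once forcing Go's built-in DNS client through the standard library's net.Resolver.PreferGo field, and once leaving the choice to the runtime. The helper name newResolver and the default host are placeholders. Whether the second lookup actually goes through the system resolver depends on how the binary was built and on the Go version, so treat the output as a diagnostic hint only.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"os"
	"time"
)

// newResolver is a hypothetical helper for this sketch.
// preferGo=true forces Go's built-in DNS client; preferGo=false
// leaves the choice of resolver to the Go runtime.
func newResolver(preferGo bool) *net.Resolver {
	return &net.Resolver{PreferGo: preferGo}
}

func main() {
	// Placeholder host; pass your cluster's hostname as the first argument.
	host := "localhost"
	if len(os.Args) > 1 {
		host = os.Args[1]
	}

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Run the same lookup with both resolver settings and print the results
	// side by side so differences are easy to spot.
	for _, preferGo := range []bool{true, false} {
		addrs, err := newResolver(preferGo).LookupHost(ctx, host)
		fmt.Printf("PreferGo=%v addrs=%v err=%v\n", preferGo, addrs, err)
	}
}
```

If both lines print the same wrong address while connected to the VPN, the binary is most likely using the built-in resolver in both cases.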

joejulian commented 2 years ago

IMHO, shipping static binaries is the correct thing for the helm project to do. Dynamic linking should be left to downstream packagers; otherwise the release binaries will only work on distros using the same glibc as the build container.

joejulian commented 2 years ago

... and, indeed, it seems to be related to downstream. https://github.com/Homebrew/homebrew-core/blob/b785a45919c2915aa2de3e53390c7a3c4d383a11/Formula/helm.rb#L25 They're not using the gox build process the helm project uses to build its binaries; instead, they use the build make target. I'll bring this up in today's community meeting.

mattfarina commented 2 years ago

I made the change to the make build target, so I might as well speak up and add some context.

When we ship Helm we use a different make target (to use gox and cross compile). That has long had cgo disabled. I realized that make build was enabling cgo because I was working in an environment where cgo didn't work. That is, make build failed to build but building with gox worked. I wanted the target we developers use for local builds to mirror what we do when we ship.

I can understand the desire to use cgo in some environments for the networking differences. I had not considered the Mac use case and Homebrew.

Two things come to mind...

  1. I'm OK with making cgo configurable, defaulting to disabled, so that tools like Homebrew can build with cgo enabled.
  2. My issue with cgo being enabled was on a Linux distro (sorry, I can't remember which one). Homebrew can be used on Linux. Do we know what using cgo would do there?

cpboyd commented 2 years ago

The main issue needing cgo seems to be related to VPN clients on macOS.

I'm not sure if the same applies to VPNs on Linux.

rudesome commented 2 years ago

"I'm not sure if the same applies to VPNs on Linux."

As I mentioned in https://github.com/helm/helm/issues/10874#issuecomment-1101741929, I have the same issue in my Arch Linux environment. Reverting to 3.8.1 solved my issue with connecting to my private AKS cluster over Twingate (VPN).

phumberdroz commented 2 years ago

Hey @cpboyd, I ran into this same issue after setting up my MacBook again, but somehow I am not able to use your suggestion above.

Maybe I am overlooking something, but I do not see what is wrong.

❯ brew extract --version 3.8.1 helm $USER/local-helm

==> Tapping phumberdroz/local-helm
Cloning into '/opt/homebrew/Library/Taps/phumberdroz/homebrew-local-helm'...
Username for 'https://github.com': phumberdroz
Password for 'https://phumberdroz@github.com':
remote: Repository not found.
fatal: repository 'https://github.com/phumberdroz/homebrew-local-helm/' not found
Error: Failure while executing; `git clone https://github.com/phumberdroz/homebrew-local-helm /opt/homebrew/Library/Taps/phumberdroz/homebrew-local-helm --origin=origin --template=` exited with 128.

cpboyd commented 2 years ago

@phumberdroz Ah, I forgot that you have to add the fake tap:

brew tap-new $USER/local-helm

See: https://cmichel.io/how-to-install-an-old-package-version-with-brew/

thesuperzapper commented 2 years ago

@mattfarina @hickeyma @joejulian @phumberdroz so what's our plan to fix DNS resolution on macOS, given that it's currently broken?

I have tested the release binaries from https://github.com/helm/helm/releases for 3.8.0+, and they all incorrectly consider only the /etc/resolv.conf nameservers rather than the actual system settings when resolving DNS hosts (note: the actual system DNS settings can be viewed using the scutil --dns command).

Interestingly, as reported by @Rudesome and @cpboyd, versions of helm distributed by brew at or before version 3.8.1 correctly resolve DNS using the system's settings.

joejulian commented 2 years ago

Looks like this is where the fix needs to land. Closing this in favor of the upstream go issue. https://github.com/golang/go/issues/12524

The current workaround is to build your own with CGO enabled on darwin. Hopefully someone can finish up that go PR.
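If you build your own binary, it helps to confirm that cgo really was enabled. The sketch below is one possible check, assuming the binary was built with Go 1.18 or later so that build settings are recorded in it. It uses the standard library's debug/buildinfo.ReadFile and runtime/debug.ReadBuildInfo; the helper name cgoSetting is made up for this example.

```go
package main

import (
	"debug/buildinfo"
	"fmt"
	"os"
	"runtime/debug"
)

// cgoSetting is a hypothetical helper for this sketch. It returns the
// CGO_ENABLED value recorded in a binary's build info, or "unknown"
// when the build info or that setting is missing.
func cgoSetting(info *debug.BuildInfo) string {
	if info == nil {
		return "unknown"
	}
	for _, s := range info.Settings {
		if s.Key == "CGO_ENABLED" {
			return s.Value
		}
	}
	return "unknown"
}

func main() {
	var info *debug.BuildInfo

	if len(os.Args) > 1 {
		// Inspect the Go binary at the given path, e.g. your helm binary.
		fileInfo, err := buildinfo.ReadFile(os.Args[1])
		if err != nil {
			fmt.Fprintln(os.Stderr, "cannot read build info:", err)
			os.Exit(1)
		}
		info = fileInfo
	} else if self, ok := debug.ReadBuildInfo(); ok {
		// No path given: fall back to inspecting this program itself.
		info = self
	}

	fmt.Println("CGO_ENABLED =", cgoSetting(info))
}
```

Run it with the path to the binary you built as the only argument. A result of "unknown" means the setting was not recorded, not that cgo was off.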

thesuperzapper commented 2 years ago

@joejulian given https://github.com/golang/go/issues/12524 has been open for 7 years, do you think it's safer for us to just distribute helm binaries on homebrew that have been compiled with CGO_ENABLED=1?

So that people are able to use brew install helm and have a working version of helm?

EDIT: for context, this is what kubectl has done; see https://github.com/kubernetes/kubernetes/pull/64219

EDIT 2: also, as this issue is 100% still present, can we please reopen it to track the problem?

joejulian commented 2 years ago

Last I looked at the homebrew build, they did have CGO enabled. If they no longer do, that needs to be brought up there.

There's no action to be taken in this repo. I volunteer to do issue triage in my spare time and I really need to close some of these no-action issues so I can keep up.

If we were tracking something that required action, I'd agree. If Go ever gets hacked to support Darwin's special DNS configuration, there's nothing for this repo to do but update Go, which the maintainers do anyway.

thesuperzapper commented 2 years ago

@joejulian the downstream helm brew formula has never explicitly specified CGO_ENABLED, and only used our default make build.

The problem was actually caused by us forcing CGO_ENABLED=0 in commit https://github.com/helm/helm/commit/3490f1e7b6d76709b7ea195370f7db463735f9e2 (released in helm 3.8.2).

To fix this, we either need to expose a method of setting CGO_ENABLED=1 in our make build or revert CGO_ENABLED to be 1 by default (as it was for helm 3.8.1 and before).

Note that the build for kubectl automatically detects that a build is happening on darwin and enables CGO. We should do the same.
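As a rough illustration of the decision logic proposed above, here is a small sketch. It is neither Helm's Makefile nor kubectl's build script, and the function name cgoEnabledFor is hypothetical. It combines the two ideas from this thread: an explicit CGO_ENABLED setting always wins, and without one, cgo is enabled only when the target OS is darwin.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// cgoEnabledFor is a hypothetical helper for this sketch. It picks the
// CGO_ENABLED value for a build.
//   - override is an explicit user setting ("0" or "1"); anything else
//     is treated as "not set".
//   - goos is the target operating system, e.g. "darwin" or "linux".
func cgoEnabledFor(goos, override string) string {
	// An explicit setting from the packager always takes precedence.
	if override == "0" || override == "1" {
		return override
	}
	// Default: enable cgo on darwin so the system resolver is used,
	// keep static (cgo-free) builds everywhere else.
	if goos == "darwin" {
		return "1"
	}
	return "0"
}

func main() {
	// Use GOOS from the environment when cross compiling,
	// otherwise fall back to the host operating system.
	goos := os.Getenv("GOOS")
	if goos == "" {
		goos = runtime.GOOS
	}
	fmt.Printf("CGO_ENABLED=%s\n", cgoEnabledFor(goos, os.Getenv("CGO_ENABLED")))
}
```

In a real build the same rule would live in the Makefile. The point is only that a darwin default and a packager override can coexist.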