hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

terraform init/apply kills Macbook Pro M1 Network connection #349 #31467

Open jeff-auth0 opened 2 years ago

jeff-auth0 commented 2 years ago

Problem

The network connection on my Apple M1 MacBook Pro dies when I run terraform init. I use [tfenv](https://github.com/tfutils/tfenv) to manage different Terraform versions, and I tried every ARM version from the latest down to 1.1.0. Nothing worked.

The only way I could get the network connection back was to restart the laptop.

Temporary workaround

Install the AMD (amd64) version of Terraform, and it works.

ARM Version tried

1.2.5, 1.1.0, 1.3.0-alpha
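When comparing ARM and AMD builds like this, it can help to confirm which build is actually on the PATH. The sketch below is only an illustration: `terraform_platform` is a hypothetical helper name, and the tfenv line in the comments assumes tfenv honors the `TFENV_ARCH` variable when installing.

```shell
#!/bin/sh
# Print the platform string (for example darwin_arm64 or darwin_amd64)
# found in `terraform version` output read from stdin.
terraform_platform() {
  sed -n 's/.*on \([a-z0-9]*_[a-z0-9]*\).*/\1/p' | head -n 1
}

# Usage (not executed here):
#   terraform version | terraform_platform
# Assumption: tfenv reads TFENV_ARCH when installing, so an amd64 build
# could be requested with something like:
#   TFENV_ARCH=amd64 tfenv install 1.2.5
```

If the helper prints `darwin_arm64` after switching builds, the shell is probably still picking up the ARM binary.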

kmoe commented 2 years ago

Thanks for the issue. This is certainly quite concerning, though I'm glad you can use the AMD binaries as a workaround.

We haven't been able to reproduce this issue on M1 laptops, so there could be something about your system that causes this to happen. Perhaps there is some security software installed, a proxy, or some other customisation? Does this problem happen only with Terraform, or with other programs (especially other Go binaries)?

jeff-auth0 commented 2 years ago

Hi @kmoe
Thank you for your quick response. It only happens with Terraform. My laptop is brand new and has the same software as my older Intel-based MacBook Pro, which does not have any issues with Terraform.

crw commented 2 years ago

Hi @jeff-auth0, you might consider posting this in the Community Forum as well as on the Apple Forums. This is unlikely to be a bug in Terraform, as this is the first report in the 19 months since the release of the M1 chipset. However, I will leave this open for now just in case others have this same issue and can offer a reproducible use case.

Out of curiosity, did you use Migration Assistant to go from Intel to M1? I initially used Migration Assistant, which I had been using on Intel Macs since 2007, and it caused bizarre, inexplicable problems. I had to reinstall MacOS from scratch and manually move my data and reinstall my applications. You may not have that issue, that was just my personal experience.

jeff-auth0 commented 2 years ago

Hi folks, after fiddling and searching for a few days, I found that disabling IPv6 in the macOS system settings made Terraform work.

Thanks for the help. Jeff

thebriankeys commented 2 years ago

@jeff-auth0 thanks for sharing this issue and the workaround. I had been stuck on this same issue for several days, and disabling IPv6 also resolved it for me. I am also on a MacBook Pro M1.

ryankearney commented 2 years ago

Can we get this issue opened again? Disabling IPv6 is not a fix. When I run terraform init it completely breaks the system networking. ifconfig hangs in the terminal, opening networking hangs system preferences, and a reboot is required to restore networking.

jeff-auth0 commented 2 years ago

@ryankearney As far as I know, this is not a Terraform issue. Try restarting the laptop after disabling IPv6. Just curious, how did you disable IPv6?

ryankearney commented 2 years ago

@jeff-auth0 I changed "Configure IPv6" to "Link-local only" on my primary/active adapter. Only then did terraform plan (or running VS Code with the HashiCorp Terraform extension installed) not break networking.

I just installed Terraform today on an M1 MacBook Pro, hit this issue, and a Google search led me here. I will reformat my laptop in the future and install only Homebrew and Terraform to test, but either the terraform binary is breaking networking, or it's triggering something else installed that causes networking to break.

Some network level apps I have installed are Little Snitch, PaloAlto Global Protect VPN Client, ProtonVPN, and Docker.
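The "Link-local only" setting described in this comment can also be applied from a terminal with macOS's `networksetup` tool. The sketch below only prints the command so it can be reviewed before running; `ipv6_mode_command` is a hypothetical helper, and "Wi-Fi" is an assumed service name (`networksetup -listallnetworkservices` shows the real ones).

```shell
#!/bin/sh
# Build the networksetup command for an IPv6 mode on a network service.
# mode: link-local | off | automatic
# service: the network service name, e.g. "Wi-Fi" (assumed here)
ipv6_mode_command() {
  mode=$1
  service=$2
  case "$mode" in
    link-local) flag="-setv6LinkLocal" ;;
    off)        flag="-setv6off" ;;
    automatic)  flag="-setv6automatic" ;;
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  printf 'networksetup %s "%s"\n' "$flag" "$service"
}

# Print the command rather than run it, so it can be checked first.
ipv6_mode_command link-local "Wi-Fi"
```

Switching back with the `automatic` mode restores the default configuration once testing is done.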

jeff-auth0 commented 2 years ago

@ryankearney VPNs and firewalls can block Terraform connections. As long as you can now use terraform init/plan and other commands, you don't need to reformat your laptop.

ryankearney commented 2 years ago

@jeff-auth0 I understand how VPNs and firewalls work, thanks though.

Disabling IPv6 is not a solution.

cvsudheer108 commented 2 years ago

I am still facing this issue. If I try terraform apply or plan, it takes a long time to refresh the state, and it only sometimes works at the end of the long wait. Many times I have had to interrupt the process because it takes too long and also blocks my network connection...

HarryTennent commented 2 years ago

I'm facing this issue on a non-M1 Mac, and it doesn't happen consistently. For example, I managed to plan/apply my Terraform code, but on subsequent plans the network crash happens. From what I can tell, this may not be directly caused by Terraform but by the AWS provider, as the issue happens when I kill that process. Is anyone else who is having this issue also seeing it with the AWS provider, and does killing that process when Terraform hangs also cause the crash?

ryankearney commented 2 years ago

@HarryTennent I haven't tested with the AWS provider, but it's happening 100% of the time for me with the Azure provider. It also happens when I simply open a .tf file in VS Code with the official Terraform extension installed. Whatever the extension is invoking behind the scenes is crashing the network stack as well.

BrainButcher101 commented 2 years ago

Same here on a MacBook Pro M2. Sometimes terraform init kills my network connection, and just now even my terminal, and in 100% of cases terraform plan kills my network.

crw commented 2 years ago

I am going to re-open this, even though it is highly unlikely that it is an issue with the code of Terraform itself. I tried searching go/golang for relevant issues, but my searching did not reveal anything relevant. If anyone happens across relevant-looking open golang networking issues, please let us know here so we can track it.

ryankearney commented 2 years ago

Thank you @crw

I have 2 M1 MacBook Pros and only one of them exhibits this behavior when connected to the same network.

The one having the issue is due for a clean wipe anyway, so I will use this opportunity to try removing Little Snitch, all the VPN clients, and other things that could interact with the network stack to figure out what terraform might be interacting with that causes this.

beardedsamwise commented 2 years ago

I have the exact same problem on M1 silicon! But it only happens when I tether from my iPhone, I don't have any problems on wireless or ethernet connection at home. As soon as I run terraform plan, destroy or apply my connection drops when tethered. It could be a coincidence but this problem only appeared after installing 12.5.1.

Update: Setting my tethered wireless connection to link-local for IPv6 works around the issue for me. Interestingly, trying to download the Terraform installer from the HashiCorp website would also hang prior to the workaround...

jweijers commented 2 years ago

I'm also having this issue on an Intel MacBook. terraform plan/apply freezes the network stack, and the MacBook becomes pretty much unusable. This is with the latest version of Terraform installed through Homebrew on macOS 12.5.1 (I haven't used Terraform before).

Edit: I disabled IPv6 in macOS and the problem is gone, so that works as a temporary workaround. I hope this will be fixed soon.

sadok-f commented 2 years ago

Disabling ipv6 on macos solved the problem, I hope this gets fixed soon. Thanks for the workaround. macOS Monterey v12.3.1 Intel.

kmoe commented 2 years ago

The closest relevant issue in the Go repository is https://github.com/golang/go/issues/52839.

Current hypothesis as to the conditions needed to produce this issue:

Please comment if you are able to reproduce this issue without all of the above conditions, e.g. if you are not using IPv6.

At this point I assume the architecture question (AMD vs ARM) is a red herring.

Request for reproductions

If you are able to reproduce this issue with any reliability, I'd be grateful if you can try the following:

  1. git clone git@github.com:hashicorp/terraform.git && cd terraform
  2. CGO_ENABLED=1 go build .
  3. Setting GODEBUG=netdns=cgo+2, use the terraform binary you just built to run the commands that produced the issue. For example: GODEBUG=netdns=cgo+2 terraform plan.

Please comment saying whether this fixes the issue or not and showing the output of Terraform.
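The three steps above could be wrapped in a small script along these lines. This is a sketch rather than an official procedure: the function names are made up for illustration, and the path in the usage comment is a placeholder.

```shell
#!/bin/sh
# Steps 1 and 2: clone Terraform and build it with cgo enabled.
# Runs in a subshell so the caller's working directory is unchanged.
build_terraform_cgo() {
  (
    git clone git@github.com:hashicorp/terraform.git &&
      cd terraform &&
      CGO_ENABLED=1 go build .
  )
}

# Step 3: run any command with Go's DNS debug output switched on.
run_with_dns_debug() {
  GODEBUG=netdns=cgo+2 "$@"
}

# Usage (not executed here), from the directory holding your configuration:
#   build_terraform_cgo
#   run_with_dns_debug /path/to/terraform/terraform plan
```

Keeping the `GODEBUG` setting inside a wrapper means it applies only to that one invocation and does not leak into the rest of the shell session.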

ryankearney commented 2 years ago

Great find @kmoe

I can confirm the existence of an IPv6 link-local DNS resolver. Additionally, if I manually configure my resolvers such that no link-local addresses are present (while still maintaining the rest of my machine's IPv6 configuration), then I do not see this behavior.
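One way to check for a link-local IPv6 resolver like the one mentioned here is to filter the resolver list that macOS prints with `scutil --dns`. The helper below is a sketch: `link_local_resolvers` is a hypothetical name, and it assumes resolver lines look like `nameserver[0] : fe80::1%en0`.

```shell
#!/bin/sh
# Read `scutil --dns`-style text on stdin and print each nameserver
# address that falls in fe80::/10, the IPv6 link-local range.
link_local_resolvers() {
  awk '/nameserver\[[0-9]+\]/ { print $NF }' |
    grep -iE '^fe[89ab][0-9a-f]:' |
    sort -u
}

# Usage on macOS (not executed here):
#   scutil --dns | link_local_resolvers
```

Empty output suggests no link-local resolver is configured, which is the condition under which this commenter no longer saw the problem.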

ryankearney commented 2 years ago

@kmoe I am able to reproduce the issue 100% of the time when I run terraform plan with link-local IPv6 resolvers.

I also followed your steps to build from source.

go version go1.19 darwin/arm64
Terraform v1.3.0-dev on darwin_arm64

I ran terraform plan with this new binary and experienced the same results.

Happy to compile with different flags and other versions of go at your request.

dannyfallon commented 2 years ago

After you build, you can try prefixing the terraform command with GODEBUG=netdns=cgo+2, saving the first few lines of debug output before rebooting, and then posting them. The app should explicitly state which DNS resolver it is using, and that is the only way to be sure you're not affected by the aforementioned Go bug.
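Since a reboot is needed afterwards, the debug output has to land in a file as it is produced. A minimal sketch of that, with hypothetical helper names and a placeholder log path:

```shell
#!/bin/sh
# Run a command with DNS debugging on, showing the output and also
# writing it to a log file so it is still there after a reboot.
capture_dns_debug() {
  logfile=$1
  shift
  GODEBUG=netdns=cgo+2 "$@" 2>&1 | tee "$logfile"
}

# Print the first few resolver-related lines from a saved log.
resolver_lines() {
  grep '^go package net:' "$1" | head -n 5
}

# Usage (not executed here):
#   capture_dns_debug "$HOME/tf-dns-debug.log" terraform plan
#   resolver_lines "$HOME/tf-dns-debug.log"
```

The lines starting with `go package net:` are the ones that state which resolver was chosen, so those are the ones worth posting.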

tpolekhin commented 2 years ago

I'm not sure if this is related, but I'm having network issues with terraform-ls https://github.com/hashicorp/terraform-ls/issues/1050 and it all started after upgrading my Intel Mac from Catalina to Monterey a couple of weeks ago.

I must point out that my issue is not that the network dies completely; rather, every operation with Terraform-related software is extremely slow or hangs until a timeout.

Today I noticed the same issues with Terraform itself. I was trying to deploy some simple code on GCP and it was extremely slow on plan and apply, eventually erroring out, unable to connect to IPv6 addresses when creating resources. (I lost the log itself, unfortunately.)

I tried to switch the IPv6 configuration to Link-local, manual, automatic - nothing helps. This issue is present both in the office and at home - 2 different networks.

I followed @kmoe's post and built Terraform myself with the CGO_ENABLED=1 flag, with the same behaviour.

I tried running commands with GODEBUG=netdns=cgo+2. terraform init works pretty fast with some debug output, while terraform plan hangs for a couple of minutes before outputting a plan, with no debug information present:

$ GODEBUG=netdns=cgo+2 terraform init
Initializing modules...

Initializing the backend...

Initializing provider plugins...
- Reusing previous version of hashicorp/google-beta from the dependency lock file
go package net: confVal.netCgo = true  netGo = false
go package net: using cgo DNS resolver
go package net: hostLookupOrder(registry.terraform.io) = cgo
go package net: hostLookupOrder(registry.terraform.io) = cgo

UPDATE: tried running terraform from a docker container hashicorp/terraform and init/plan works almost instantly

/tmp # GODEBUG=netdns=cgo+2 terraform init
Initializing modules...

Initializing the backend...

Initializing provider plugins...
- Reusing previous version of hashicorp/google from the dependency lock file
go package net: built with netgo build tag; using Go's DNS resolver
go package net: hostLookupOrder(registry.terraform.io) = files,dns
go package net: hostLookupOrder(registry.terraform.io) = files,dns
- Reusing previous version of hashicorp/google-beta from the dependency lock file
go package net: hostLookupOrder(registry.terraform.io) = files,dns
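The two runs above differ in which resolver Go reports: the locally built binary says it is using the cgo resolver, while the Docker image says it is using Go's own. A small sketch for telling the two apart in saved output (`resolver_in_use` is a hypothetical helper that matches the exact phrases shown in those logs):

```shell
#!/bin/sh
# Read GODEBUG=netdns output on stdin and report which DNS resolver
# the Go runtime said it was using: cgo, go, or unknown.
resolver_in_use() {
  log=$(cat)
  case "$log" in
    *"using cgo DNS resolver"*)  echo "cgo" ;;
    *"using Go's DNS resolver"*) echo "go" ;;
    *)                           echo "unknown" ;;
  esac
}

# Usage (not executed here):
#   GODEBUG=netdns=cgo+2 terraform init 2>&1 | resolver_in_use
```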

deanshanahan commented 2 years ago

I'm experiencing this issue too, disabling IPv6 as @jeff-auth0 suggested seems to have resolved it for now.

silvestriluca commented 2 years ago

I'm experiencing this issue too. Very very bad bug. All my network stack crashes and I'm forced to:

I lost some hours, and then switching from the office Wi-Fi network (IPv6) to a mobile hotspot (IPv4) solved the issue. I found this GitHub issue later, so I didn't try the IPv6-disabling workaround (@jeff-auth0), but I can confirm that the scenario is what @kmoe described:

macOS 12.5.1

I will try on my home network later (also IPv6) and see if I can replicate the issue. Please, @kmoe, this is really disruptive, and switching off IPv6 is not a solution for us (we are using IPv6 devices to test with). We'll use Terraform under a Linux environment until this gets fixed.

kmoe commented 1 year ago

Terraform v1.3.1, which will be released later today, may fix this issue. Please re-test when available.

@ryankearney I am particularly concerned with the behaviour you have seen because it seems to disprove the hypothesis in https://github.com/hashicorp/terraform/issues/31467#issuecomment-1236894023. I'd be grateful if you could re-run the command with GODEBUG=netdns=cgo+2 and Terraform v1.3.1, and post the output.

@tpolekhin, this looks to be a separate, though likely related, bug. If Terraform v1.3.1 does not resolve this then please open a separate GitHub issue.

Since none of the Terraform devs have been able to reproduce this, we're currently debugging this one blind, which is not ideal. Thanks for your patience and responses.

sazzer commented 1 year ago

I've been getting this as well, and upgrading to 1.3.1 doesn't seem to have fixed it ☹️ Going to try removing the homebrew version and install the amd64 one instead to see if that works better.

sazzer commented 1 year ago

The amd64 version seems to work fine. However, I've noticed that connecting using my home internet (1GBps, but supports IPv6) is significantly slower than via my phone (4G, but doesn't support IPv6)

pacorreia commented 1 year ago

I'm running terraform on WSL: Linux 5.10.102.1-microsoft-standard-WSL2 #1 SMP Wed Mar 2 00:30:59 UTC 2022 x86_64 GNU/Linux

Terraform version 1.3.1

I've set my DNS resolution preference to be IPv4 rather than IPv6 but without success.

So here is what happens for me:

[Image: tf-dns-issue]

AlexMikhalev commented 1 year ago

Bump. Brand-new MacBook with M2 and Terraform 1.3.2: terraform apply kills the network and the rest of the macOS system.

RackerSlam commented 1 year ago

I've been experiencing this for months now, over 2 different M1 Macbooks. Everything typically works fine for a few months then suddenly every time I run a terraform command the network dies, and any app using the network connection has to be force quit, and ultimately my laptop rebooted. Was driving me nuts until I found this post.

Update: Disabling IPv6 fixed the issue

sazzer commented 1 year ago

Terraform 1.3.3 still has this problem. If my WiFi has IPv6 enabled then running terraform plan or terraform apply completely kills the machine. If I disable IPv6 then it all works correctly.

d4rky-pl commented 1 year ago

Unfortunately disabling IPv6 does not work for me on Intel Mac. Running terraform plan eventually kills the network connection (DNS works but trying to open any website results in connection refused until terraform stops working).

crw commented 1 year ago

@d4rky-pl this is likely a different issue that will not be solved via this ticket; I am not aware of an issue open that fits your description. Specifically the impact on Intel macOS, vs Apple Silicon, and the fact that it does not seem to be connected to IPv6. If it persists, please consider opening a new issue. Thanks!

@pacorreia similarly, since this issue is targeting Apple Silicon macOS, I do not know that a fix from this issue will help you in WSL. That said if we do find a root cause and fix for this issue, it may be worth retesting. Thanks!

pacorreia commented 1 year ago

> @d4rky-pl this is likely a different issue that will not be solved via this ticket; I am not aware of an issue open that fits your description. Specifically the impact on Intel macOS, vs Apple Silicon, and the fact that it does not seem to be connected to IPv6. If it persists, please consider opening a new issue. Thanks!
>
> @pacorreia similarly, since this issue is targeting Apple Silicon macOS, I do not know that a fix from this issue will help you in WSL. That said if we do find a root cause and fix for this issue, it may be worth retesting. Thanks!

In my case, what I discovered was weird: if I point WSL at my home gateway IP for DNS resolution, Terraform/Go DNS somehow gives preference to the IPv6 record, which can't be resolved inside WSL.

But when I reverted it to use the host NAT IP address, it started behaving, resolving DNS using the IPv4 record.

Really weird.

imrehg commented 1 year ago

For what it's worth, I experienced this issue very strongly on a 2019 Intel Mac. Now I've upgraded to macOS Ventura 13.0.1, and I no longer have any issues whatsoever (using Terraform 1.3.2 currently).

d4rky-pl commented 1 year ago

@crw I'm not sure whether it's a different issue or not, but for me a workaround was to connect using Wi-Fi tethering to my Android phone, which seems to be the exact opposite of what someone here reported earlier. I'll try to debug this issue more in depth later and report a new issue if I figure out what combination of things may be causing it (it could still be somehow IPv6 related).

cking94 commented 1 year ago

M1 MacBook Pro (Ventura 13.0.1)

terraform init and targeted (small) terraform plans work fine

Large terraform plans kill my network connectivity, network connectivity is restored after the plan has errored out

Things I've tried:

All with the same outcome

nikie commented 1 year ago

Intel MacBook (Monterey 12.5), disabling IPv6 did not help.

This workaround of listing the GCP APIs in /etc/hosts has helped: https://github.com/hashicorp/terraform-provider-google/issues/6782#issuecomment-874574409. Here is a Mac version of the script (you need to add all the APIs used by your Terraform project):

# Hostnames to pin to an IPv4 address in /etc/hosts.
APIS="googleapis.com www.googleapis.com storage.googleapis.com iam.googleapis.com container.googleapis.com cloudresourcemanager.googleapis.com"
for name in $APIS
do
  # Take the first IPv4 address macOS resolves for this name.
  ipv4=$(dscacheutil -q host -a name "$name" | grep ip_address | head -n 1 | awk '{ print $2 }')
  # Append an entry unless this exact hostname is already listed (a plain grep would also match subdomains).
  awk -v n="$name" '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' /etc/hosts || ([ -n "$ipv4" ] && sudo sh -c "echo '$ipv4 $name' >> /etc/hosts")
done

mahela-aws commented 1 year ago

The same issue occurred for me on my 2021 MacBook Pro M1 Max. It only happens when I am tethered to my mobile hotspot. After configuring IPv6 to "Link-local only" on my mobile hotspot connection, this was fixed.

d4rky-pl commented 1 year ago

I've switched from an Intel to an M2 MacBook and the problem still persists. Any news?

crw commented 1 year ago

Checking the upstream issue per https://github.com/hashicorp/terraform/issues/31467#issuecomment-1236894023, I do not see any progress. I'd imagine that would need to be fixed for this issue to move forward.

https://github.com/golang/go/issues/52839

kmoe commented 1 year ago

There is something very odd going on in the Mac network stack that extends beyond Terraform here. Naturally I'd enjoy a clever hack within Terraform that can get around this, but it seems unlikely to surface while https://github.com/golang/go/issues/52839 remains open.

armenr commented 1 year ago

My terraform init/plan/apply operations on the m1 take FOREVER. FOREVER. I didn't even realize it wasn't supposed to be this slow until after I installed terraform into an x86_64 EC2 machine, and saw the difference.

Without exaggeration, I've probably lost WEEKS of my life and productivity over the last year, working with terraform in this kind of half-working, slow state. It's absolutely ABSURD.

None of the suggestions mentioned in this thread are helping, at all, in any way. Compared to the way it runs on my EC2, this is absolutely unusable and unproductive. Frankly, I'm floored.

Flowlance commented 1 year ago

Same issue on a brand new M2 MBP.

RGFuaWVs commented 1 year ago

I encountered the same issue using Terraform v1.2.3 on the darwin_arm64 platform (MacBook Pro M2). Interestingly, the setup first worked for 6 months without any problems, and from one day to another, all outgoing requests for Terraform were slowed down significantly. After a few hours of debugging, the only thing that helped was to switch to the darwin_amd64 version of Terraform, where the issue was instantly resolved.

zeppelinen commented 1 year ago

For me the workaround was to run a local DNS server (cloudflared) and point the system DNS settings to it.
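For anyone trying the same thing, a quick check that the system really is pointed at a local resolver might look like the sketch below. It assumes the local DNS server listens on 127.0.0.1 and that the network service is called "Wi-Fi"; `uses_only_local_resolver` is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed only when every DNS server listed on stdin (one per line)
# is a loopback address, i.e. a resolver running on this machine.
uses_only_local_resolver() {
  awk 'NF { n++; if ($1 != "127.0.0.1" && $1 != "::1") bad = 1 }
       END { exit (n == 0 || bad) }'
}

# Usage on macOS (not executed here), with "Wi-Fi" as an assumed service:
#   networksetup -setdnsservers "Wi-Fi" 127.0.0.1
#   networksetup -getdnsservers "Wi-Fi" | uses_only_local_resolver \
#     && echo "local resolver in use"
```

The check fails on empty input and on the message networksetup prints when no DNS servers are set, so a misconfigured service is not reported as local.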

olofspango commented 10 months ago

I'm experiencing this same issue on a Windows 10 PC, running terraform plan on WSL in an ubuntu-20.04 container.

terraform plan progresses very slowly, and the internet connection on my host also slows down, until my entire host loses its internet connection and the plan fails.

danieljarrett74 commented 7 months ago

Has there been any update on this one? I'm still getting this problem in Feb 2024.

macOS 14.2.1, M1 chip, AMD version of Terraform 1.7.3, IPv6 disabled (set to link-local).

None of this seemed to resolve the issue. Does anyone know where it's at, or a solid workaround other than using the AMD version or disabling IPv6 (which didn't work for me)?

To be clear, when this crashes my network, it crashes the whole network I'm on. So if anyone else is connected to the same router, their internet crashes too. I've also tried this by tethering to my phone and get the same result. The only solution I can think of at this point is switching to Windows or maybe using GitHub Codespaces.