nix-community / terraform-nixos

A set of Terraform modules that are designed to deploy NixOS [maintainer=@adrian-gierakowski]
Apache License 2.0

Random file provisioner error and SSH authentication failure with AWS EC2 #65

Open spearman opened 2 years ago

spearman commented 2 years ago

Describe the bug When provisioning a new instance, it will sometimes (usually, but not always) fail with a "file provisioner error" reporting that SSH authentication failed.

To Reproduce Run terraform init and terraform apply with the following configuration (main.tf is placed in terraform/main.tf, and the .nix files are in nixos/configuration.nix and nixos/git-server.nix): https://gist.github.com/spearman/58db5a31afd88c8962d9a5b3da78ac00

Expected behavior I would expect it to be reproducible and not fail randomly.

Environment

Additional context Here is the full output when running terraform apply:

https://gist.github.com/spearman/5f19ffb4c80791f0444c4a2a3b88afab

This was after it had been successfully deployed and I was trying to change the configuration. Usually when it occurs during creation I can log in as root with the generated .pem file, but the nixos configuration has not been applied.

I thought maybe it was a problem with the particular AMI I was using, but I have experienced the problem with 20.09, 21.05, and 21.11.

malte-christian commented 2 years ago

I experienced this error after deploying OpenSSH 8.8 to a remote instance. It turned out that OpenSSH 8.8 disables the ssh-rsa signature algorithm (RSA with SHA-1) by default for security reasons, and the Terraform provisioner does not yet work with the newer SHA-2 algorithms (https://github.com/hashicorp/terraform/issues/30134).

As a workaround you can add the following to your system configuration:

services.openssh.extraConfig = ''
  HostKeyAlgorithms +ssh-rsa
  PubkeyAcceptedAlgorithms +ssh-rsa
'';

blobcode commented 2 years ago

A better(?) solution should be available once hashicorp/terraform-provider-tls#150 is released: you could then switch your keys from RSA to ed25519 and avoid this issue altogether.
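
For illustration, switching the generated key to ed25519 might look roughly like the sketch below. It assumes a release of the hashicorp/tls provider that includes the ED25519 support from the pull request mentioned above, and that the key pair is registered with AWS through an aws_key_pair resource. The resource names ("deploy") and the key_name value are hypothetical and not taken from the original configuration.

```hcl
# Sketch only: assumes a hashicorp/tls provider release with ED25519 support.
# The resource names and key_name below are hypothetical placeholders.
resource "tls_private_key" "deploy" {
  # Use ed25519 instead of RSA so the connection does not depend on the
  # ssh-rsa (SHA-1) signature algorithm that OpenSSH 8.8 disables by default.
  algorithm = "ED25519"
}

resource "aws_key_pair" "deploy" {
  key_name   = "deploy-ed25519"
  public_key = tls_private_key.deploy.public_key_openssh
}
```

With a key like this, the extraConfig workaround that re-enables ssh-rsa should no longer be needed, although that has not been verified against the configuration in this issue.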

spearman commented 2 years ago

I'm not sure if this is related, but I have been trying to deploy using GitLab CI and I get an error on the same line. In this case the last error is an i/o timeout, not an SSH authentication error:

https://gist.github.com/spearman/6c44d4a354a3644d6e75f74c2d98fd91