hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

Terraform SSH not working with metadata http_endpoints disabled #32754

Closed salecharohit closed 1 year ago

salecharohit commented 1 year ago

Terraform Version

Terraform v1.3.7
on darwin_arm64
+ provider registry.terraform.io/hashicorp/aws v4.54.0
+ provider registry.terraform.io/hashicorp/local v2.1.0
+ provider registry.terraform.io/hashicorp/random v3.1.0
+ provider registry.terraform.io/hashicorp/tls v4.0.4

Terraform Configuration Files

variable "key_name" {
  description = "SSH Key Name For Authentication"
  type        = string
  default     = "ubuntu"
}

resource "tls_private_key" "ubuntu" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "aws_key_pair" "generated_key" {
  key_name   = var.key_name
  public_key = tls_private_key.ubuntu.public_key_openssh

}

data "aws_ami" "ubuntu" {
  most_recent = true

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  owners = ["099720109477"] # Canonical
}

resource "aws_instance" "ubuntu" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = var.instance_type
  key_name      = aws_key_pair.generated_key.key_name

  network_interface {
    network_interface_id = aws_network_interface.ubuntu.id
    device_index         = 0
  }

  metadata_options {
    http_endpoint = "disabled"
  }

  connection {
    user        = "ubuntu"
    type        = "ssh"
    host        = self.public_ip
    private_key = tls_private_key.ubuntu.private_key_pem
    timeout     = "1m"
  }

  provisioner "remote-exec" {
    inline = [
      # apt-get is the command name, and updating package lists needs root
      "sudo apt-get update -y"
    ]
  }

  depends_on = [
    aws_key_pair.generated_key
  ]
}

Debug Output

https://gist.github.com/salecharohit/c3c7dfb5d024bcdb950b2858c639e555

Expected Behavior

terraform apply should complete successfully, and the remote-exec provisioner should connect over SSH and run its commands.

Actual Behavior

terraform apply fails with the SSH authentication error shown below when the provisioner tries to connect.

╷
│ Error: file provisioner error
│
│   with aws_instance.web-template,
│   on ec2.tf line 51, in resource "aws_instance" "web-template":
│   51:   provisioner "file" {
│
│ timeout - last error: SSH authentication failed (ubuntu@35.175.205.156:22): ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported
│ methods remain
╵

However, if I remove the following lines, everything works: terraform apply succeeds, and remote-exec connects and executes the script.

  metadata_options {
    http_endpoint = "disabled"
  }

Additionally, connecting manually with the generated SSH key fails with the same error.
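
To repeat that manual check outside of Terraform, the generated key can be written to a local file. The following is a minimal sketch, not part of the original configuration: the resource name `debug_private_key`, the file name `ubuntu-debug.pem`, and the output name `ubuntu_public_ip` are all hypothetical, and it assumes the hashicorp/local provider already listed above.

```hcl
# Hypothetical debugging helpers, not part of the original configuration.
# Caution: this writes the private key to disk in plain text
# (the key is already stored in the Terraform state either way).
resource "local_file" "debug_private_key" {
  content         = tls_private_key.ubuntu.private_key_pem
  filename        = "${path.module}/ubuntu-debug.pem"
  file_permission = "0600" # ssh refuses private keys readable by other users
}

# Expose the address that the connection block uses as its host.
output "ubuntu_public_ip" {
  value = aws_instance.ubuntu.public_ip
}
```

Running `ssh -v -i ubuntu-debug.pem ubuntu@<public ip>` against the instance then shows which authentication methods the server offers and whether the public key is rejected, independently of the provisioner.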

Steps to Reproduce

terraform init
terraform apply

Additional Context

I need to build a bastion host with IMDS disabled by default as a security requirement, so I need the following metadata configuration in the aws_instance resource:

  metadata_options {
    http_endpoint = "disabled"
  }

What I fail to understand is why, or rather how, this setting interferes with SSH communication. Why does remote-exec need to contact the IMDS service when all it really needs is an SSH private key, which is being provided?

References

Other similar issues I looked at before filing this one:
https://github.com/hashicorp/terraform/issues/31146
https://github.com/hashicorp/terraform/issues/27768

jbardin commented 1 year ago

Hi @salecharohit,

The behavior you describe sounds like the instance is disabling external ssh access in conjunction with the http_endpoint being disabled. The output looks like Terraform cannot connect to public_ip; can you verify whether that is the case?

salecharohit commented 1 year ago

I guess so. I haven't done a root cause analysis of this. All I know is that when I disable the http_endpoint, the instance doesn't accept the connection, even when I supply a proper SSH key.

jbardin commented 1 year ago

It sounds, then, like the instance relies on this http_endpoint on the server side in order to use the key provided via key_name. Terraform makes no use of this information; it only attempts to connect via ssh with the given credentials, and we can see that the credentials are valid. If you are certain this should work and that it's a misconfiguration of the instance, I would raise the issue with the AWS provider. If you have more questions, it would be better to ask in the community forum, where more people are familiar with the provider and AWS services.

Thanks!

salecharohit commented 1 year ago

Thanks @jbardin. Can you help me identify where exactly I should file this ticket? Is there a specific repo for the AWS provider?

jbardin commented 1 year ago

@salecharohit, each provider's GitHub repo is linked from its registry page. The AWS provider's registry page is https://registry.terraform.io/providers/hashicorp/aws/latest, and the repo is https://github.com/hashicorp/terraform-provider-aws/. The linked forums may be more useful too, since the behavior is probably not defined by the provider but rather by the remote service. It seems possible that disabling the http metadata endpoint could prevent access to user metadata like the key you are attempting to log in with.
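
If that is the cause, one possible alternative is to leave the endpoint enabled and require IMDSv2 session tokens instead. The sketch below rests on two assumptions that are not confirmed in this thread: that the image's first-boot tooling (cloud-init on stock Ubuntu AMIs) fetches the key pair's public key from the metadata service, and that enforcing IMDSv2 would satisfy the stated security requirement. The other arguments are copied from the configuration in the report, with the network_interface, connection, and provisioner blocks omitted for brevity.

```hcl
resource "aws_instance" "ubuntu" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = var.instance_type
  key_name      = aws_key_pair.generated_key.key_name

  metadata_options {
    # Keep the metadata service reachable so the public key can be
    # retrieved at first boot (assumption: the AMI relies on this).
    http_endpoint = "enabled"

    # Reject IMDSv1 requests; only token-based (IMDSv2) requests are served.
    http_tokens = "required"

    # Limit how many network hops the token response may travel.
    http_put_response_hop_limit = 1
  }
}
```

Whether the endpoint can be fully disabled without breaking key injection depends on how the image installs the key, so this is a starting point to test rather than a confirmed fix.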

github-actions[bot] commented 1 year ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.