hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[Bug]: EC2 SSH Access not working if http_endpoints is disabled #29829

Open salecharohit opened 1 year ago

salecharohit commented 1 year ago

Terraform Core Version

v1.3.7

AWS Provider Version

4.54.0

Affected Resource(s)

Expected Behavior

terraform apply should complete successfully, and the remote-exec provisioner should connect and execute the script.

Actual Behavior

terraform apply fails with the SSH authentication error shown below when the provisioner tries to connect.

╷
│ Error: file provisioner error
│
│   with aws_instance.web-template,
│   on ec2.tf line 51, in resource "aws_instance" "web-template":
│   51:   provisioner "file" {
│
│ timeout - last error: SSH authentication failed (ubuntu@35.175.205.156:22): ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported
│ methods remain
╵

However, if I remove the following lines, everything works: terraform apply succeeds, and remote-exec connects and executes the script.

  metadata_options {
    http_endpoint = "disabled"
  }

Additionally, connecting manually with the generated SSH key fails with the same error.

Relevant Error/Panic Output Snippet

╷
│ Error: file provisioner error
│
│   with aws_instance.web-template,
│   on ec2.tf line 51, in resource "aws_instance" "web-template":
│   51:   provisioner "file" {
│
│ timeout - last error: SSH authentication failed (ubuntu@35.175.205.156:22): ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported
│ methods remain
╵

Terraform Configuration Files

variable "key_name" {
  description = "SSH Key Name For Authentication"
  type        = string
  default     = "ubuntu"
}

resource "tls_private_key" "ubuntu" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "aws_key_pair" "generated_key" {
  key_name   = var.key_name
  public_key = tls_private_key.ubuntu.public_key_openssh

}

data "aws_ami" "ubuntu" {
  most_recent = true

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  owners = ["099720109477"] # Canonical
}

resource "aws_instance" "ubuntu" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = var.instance_type
  key_name      = aws_key_pair.generated_key.key_name

  network_interface {
    network_interface_id = aws_network_interface.ubuntu.id
    device_index         = 0
  }

  metadata_options {
    http_endpoint = "disabled"
  }

  connection {
    user        = "ubuntu"
    type        = "ssh"
    host        = self.public_ip
    private_key = tls_private_key.ubuntu.private_key_pem
    timeout     = "1m"
  }

  provisioner "remote-exec" {
    inline = [
      # apt-get is the command name; sudo is needed because the connection user is ubuntu
      "sudo apt-get update -y"
    ]
  }

  depends_on = [
    aws_key_pair.generated_key
  ]
}

Steps to Reproduce

terraform init
terraform apply

Debug Output

https://gist.github.com/salecharohit/c3c7dfb5d024bcdb950b2858c639e555

Panic Output

No response

Important Factoids

I need to build a bastion host with IMDS disabled by default as a security requirement, so I need the following metadata configuration in the aws_instance resource:

  metadata_options {
    http_endpoint = "disabled"
  }

What I fail to understand is why, or rather how, this setting interferes with SSH communication. Why does remote-exec need to contact the IMDS service when all it really needs is an SSH private key, which is being provided?

References

Other similar issues I looked at before filing this one: https://github.com/hashicorp/terraform/issues/31146 and https://github.com/hashicorp/terraform/issues/27768

This issue was originally reported at https://github.com/hashicorp/terraform/issues/32754, and I was asked to redirect it here.

Would you like to implement a fix?

No

justinretzolk commented 1 year ago

Hey @salecharohit 👋 Thank you for taking the time to raise this! I took a look around to try to determine what was going on here and stumbled across this comment from a previous, similar issue. As mentioned in that comment, I notice that you're not supplying the security_groups, so perhaps that is the cause of your issue?

I did also note that in a later comment on the same thread, someone mentioned that http_endpoint being set to disabled may cause issues, though I've been unable to find any supporting documentation on the AWS side that would indicate that.

Regardless, this doesn't appear to be a bug with the provider, but rather a configuration issue. We try to keep the Issues section of this repository scoped to bugs and feature requests, and ask that questions be raised in one of the community resources, such as the AWS Provider forum. You may have better luck raising this there. I'll leave this open for now, in case you have any follow up questions before we close this out in favor of one of those resources.

salecharohit commented 1 year ago

Hi @justinretzolk, a security group has been provided; you can see it in this debug log: https://gist.github.com/salecharohit/c3c7dfb5d024bcdb950b2858c639e555 The security group is being created with port 22 open. As mentioned in the issue, if I comment out

  metadata_options {
    http_endpoint = "disabled"
  }

everything works absolutely fine. So the problem is with this specific configuration, which is somehow interfering with SSH authentication.
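For readers who cannot open the gist, here is a rough sketch of how a security group opening port 22 could be attached when the instance uses a network_interface block, as the configuration in this issue does. It is an illustration under assumptions, not the configuration from the gist: the resource name ssh, the three variables, and the egress rule are all hypothetical. The security group goes on aws_network_interface because aws_instance does not accept its own security group arguments alongside a network_interface block.

```hcl
# Hypothetical names and variables throughout; not copied from the gist.
variable "vpc_id" {
  description = "VPC that holds the subnet below (assumed to exist)"
  type        = string
}

variable "subnet_id" {
  description = "Subnet for the instance's network interface (assumed to exist)"
  type        = string
}

variable "ssh_ingress_cidr" {
  description = "CIDR range allowed to reach port 22 (assumption)"
  type        = string
}

resource "aws_security_group" "ssh" {
  name        = "allow-ssh"
  description = "Allow inbound SSH"
  vpc_id      = var.vpc_id

  # Inbound SSH only
  ingress {
    description = "SSH"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.ssh_ingress_cidr]
  }

  # Allow all outbound traffic so apt-get can reach package mirrors
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# The interface referenced by network_interface_id in aws_instance.ubuntu
resource "aws_network_interface" "ubuntu" {
  subnet_id       = var.subnet_id
  security_groups = [aws_security_group.ssh.id]
}
```

With a setup like this, a failure at the SSH handshake with "unable to authenticate" would point at key installation on the instance rather than at network reachability, since the connection got as far as attempting publickey authentication.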

salecharohit commented 1 year ago

Hey @justinretzolk, any update on this? Do you need more information?

salecharohit commented 1 year ago

Hey @justinretzolk, you can replicate this issue by running this project I created, https://github.com/salecharohit/my-cloud-desktop, after uncommenting these lines: https://github.com/salecharohit/my-cloud-desktop/blob/aba1e2c950961d3b022e1d99a04ed2b5700dc234/ec2.tf#L53 What I fail to understand is how disabling http_endpoint interferes with SSH access.

salecharohit commented 1 year ago

There is also a Stack Overflow question on this same issue: https://stackoverflow.com/questions/65035324/unable-to-ssh-into-aws-ec2-instance-with-instance-metadata-turned-off

justinretzolk commented 1 year ago

Hey @salecharohit 👋 Thank you for the additional information! At this point, I believe that we have all of the information that we'll need in order to look into this. Unfortunately I can't provide an ETA on when this will be looked into due to the potential of shifting priorities. We prioritize by count of :+1: reactions and a few other things (more information on our prioritization guide if you're interested).

salecharohit commented 6 months ago

Hi @justinretzolk, do have a look at https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-IMDS-new-instances.html#configure-IMDS-new-instances--turn-off-instance-metadata, which one of my friends shared. It seems this is not something you can fix.
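If the security requirement allows it, one alternative to turning IMDS off entirely is to keep the endpoint on but require IMDSv2 session tokens. The fragment below is a minimal sketch, not a confirmed fix for this issue. It assumes that cloud-init on the Ubuntu image reads the key pair's public key from instance metadata at first boot, which would explain why key-based SSH fails when the endpoint is disabled, and that the image's cloud-init version supports IMDSv2. The hop limit of 1 is an illustrative choice.

```hcl
  # Goes inside the aws_instance resource, replacing the block that
  # sets http_endpoint = "disabled".
  metadata_options {
    http_endpoint               = "enabled"  # keep IMDS reachable from the instance
    http_tokens                 = "required" # reject IMDSv1 requests; IMDSv2 tokens only
    http_put_response_hop_limit = 1          # token responses may not travel past the instance
  }
```

Whether IMDSv2-only access satisfies a policy written as "IMDS disabled" is a question for whoever owns that requirement; if it does not, the key would have to reach the instance by some route that does not depend on metadata, such as an image with the key already baked in.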