hashicorp / packer-plugin-amazon

Packer plugin for Amazon AMI Builder
https://www.packer.io/docs/builders/amazon
Mozilla Public License 2.0

SSH communicator broken #498

Closed: hahuang65 closed this issue 3 months ago

hahuang65 commented 3 months ago

Overview of the Issue

Unable to build AMIs with amazon-ebs.

When using temporary_iam_instance_profile_policy_document, it complains with: Retryable error: InvalidParameterValue: Value (packer-66b24927-f1bf-3659-0653-2b0e2181a066) for parameter iamInstanceProfile.name is invalid. Invalid IAM Instance Profile name

It then fails to connect over SSH: [DEBUG] TCP connection to SSH ip/port failed: dial tcp: lookup localhost on [::1]:53: no such host

When using iam_instance_profile, it doesn't seem to have an error with setting the instance up, but it still tries to connect to localhost or 0.0.0.0 for SSH.

Reproduction Steps

Just run packer build -var "name=grafana" ami.pkr.hcl with my build file. Change the temporary_iam_instance_profile_policy_document block to iam_instance_profile to test for the other case.

Plugin and Packer version

Packer v1.11.2
packer-plugin-amazon_v1.3.2_x5.0_linux_amd64

Simplified Packer Buildfile

https://gist.github.com/hahuang65/a35654fbf7261a9044e0e98a226ccbe6

Operating system and Environment details

Arch Linux

Log Fragments and crash.log files

https://gist.github.com/hahuang65/6b96e920c0c10b548664370fa0e799f2 (both scenarios are here)

lbajolet-hashicorp commented 3 months ago

Hi @hahuang65,

The connection to localhost is because you specified ssh_interface = "session_manager", this will in turn start aws ssm to open a tunnel to your machine.

The following lines from your logs hint at it:

2024/08/06 11:07:43 packer-plugin-amazon_v1.3.2_x5.0_linux_amd64 plugin: 2024/08/06 11:07:43 Found available port: 8169 on IP: 0.0.0.0
2024/08/06 11:07:43 packer-plugin-amazon_v1.3.2_x5.0_linux_amd64 plugin: 2024/08/06 11:07:43 ssm: Starting PortForwarding session to instance i-02693071c9fb626cb

In this case, you have one aws ssm StartSession process running in the background, relaying ssh connections made on localhost:8169 to your machine's SSH port.

In the logs however, this fails with a lookup issue on [::1]:53, which indicates a DNS issue. Is your local DNS resolver (typically bind or systemd-resolved on Linux) running? Would you be able to check what a command like dig localhost returns?
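The resolver check suggested above can also be sketched in Python. This is only an approximation: `socket.getaddrinfo` goes through the system's libc resolver, which is not necessarily the lookup path the plugin itself uses, and the helper name `resolve` is hypothetical. Unlike `dig`, which only queries DNS, this call normally consults /etc/hosts as well, so the two can disagree.

```python
import socket


def resolve(host, port=22):
    """Return the sorted addresses the system resolver gives for host.

    An empty list means the lookup failed, which is the situation the
    "no such host" error in the logs points at.
    """
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    addrs = resolve("localhost")
    if addrs:
        print("localhost resolves to:", ", ".join(addrs))
    else:
        print("localhost does not resolve; check /etc/hosts and the local resolver")
```

If this prints no addresses, a tunnel listening on localhost cannot be reached by name, regardless of whether the SSM session itself started.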

hahuang65 commented 3 months ago

> The connection to localhost is because you specified ssh_interface = "session_manager", this will in turn start aws ssm to open a tunnel to your machine.

Yup, I understand this.

So dig localhost is giving me an NXDomain. systemd-resolved is running... but (and maybe this is a silly question) shouldn't the fact that it's hitting [::1] mean that it's resolved localhost to ::1?

Also, I rebooted my computer and now the error is different. Instead of no such host, I'm getting

[DEBUG] TCP connection to SSH ip/port failed: dial tcp [::1]:8065: connect: connection refused

hahuang65 commented 3 months ago

Ah, yes, that was because I edited my /etc/hosts file. I noticed it was empty, so I added

127.0.0.1       localhost
255.255.255.255 broadcasthost
::1             localhost

which changed my error from no such host to connection refused.

sshd does look like it's running.
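As a rough illustration of the /etc/hosts fix described above, the following Python sketch parses hosts-file text and reports which addresses each name maps to. The function name `hosts_entries` and the `SAMPLE` constant are hypothetical; the sample text is the three lines quoted in this comment.

```python
def hosts_entries(text):
    """Map each hostname in hosts-file text to the addresses assigned to it."""
    table = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # First field is the address, the rest are names for it.
        addr, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), []).append(addr)
    return table


# The entries added in the comment above.
SAMPLE = """\
127.0.0.1       localhost
255.255.255.255 broadcasthost
::1             localhost
"""

if __name__ == "__main__":
    entries = hosts_entries(SAMPLE)
    print("localhost ->", entries.get("localhost", "missing"))
```

An empty file yields an empty table, which matches the original state of the machine: with no localhost entry, name resolution falls through to DNS.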

hahuang65 commented 3 months ago

Right, so temporary_iam_instance_profile_policy_document still gives me the Retryable error: InvalidParameterValue: Value (packer-66b24927-f1bf-3659-0653-2b0e2181a066) for parameter iamInstanceProfile.name is invalid. Invalid IAM Instance Profile name error.

I am getting

2024/08/13 11:14:17 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:17 [INFO] Waiting for SSH, up to timeout: 5m0s
==> grafana.amazon-ebs.ami: Waiting for SSH to become available...
2024/08/13 11:14:17 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:17 [DEBUG] TCP connection to SSH ip/port failed: dial tcp [::1]:8503: connect: connection refused
2024/08/13 11:14:17 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:17 Retryable error: TargetNotConnected: i-027c2e698ea22d1fa is not connected.
2024/08/13 11:14:18 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:18 Retryable error: TargetNotConnected: i-027c2e698ea22d1fa is not connected.
2024/08/13 11:14:18 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:18 Retryable error: TargetNotConnected: i-027c2e698ea22d1fa is not connected.
2024/08/13 11:14:19 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:19 Retryable error: TargetNotConnected: i-027c2e698ea22d1fa is not connected.
2024/08/13 11:14:22 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:22 Retryable error: TargetNotConnected: i-027c2e698ea22d1fa is not connected.
2024/08/13 11:14:22 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:22 [DEBUG] TCP connection to SSH ip/port failed: dial tcp [::1]:8503: connect: connection refused
2024/08/13 11:14:25 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:25 Retryable error: TargetNotConnected: i-027c2e698ea22d1fa is not connected.
2024/08/13 11:14:27 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:27 [DEBUG] TCP connection to SSH ip/port failed: dial tcp [::1]:8503: connect: connection refused
2024/08/13 11:14:32 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:32 [DEBUG] TCP connection to SSH ip/port failed: dial tcp [::1]:8503: connect: connection refused

regardless of whether I use temporary_iam_instance_profile_policy_document or plain iam_instance_profile.
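The log excerpt above mixes two kinds of failure, and tallying them separately can make the pattern easier to see. The sketch below is a hypothetical helper, not part of Packer: the labels and regular expressions are assumptions written to match the messages quoted in this thread. One plausible reading, offered with caution, is that while SSM reports TargetNotConnected no port-forwarding session exists, so nothing is listening on the local port and the SSH dial is refused.

```python
import re
from collections import Counter

# Labels and patterns are illustrative, derived from the messages in this thread.
PATTERNS = {
    "ssm_target_not_connected": re.compile(
        r"TargetNotConnected: i-[0-9a-f]+ is not connected"
    ),
    "tunnel_connection_refused": re.compile(
        r"dial tcp \S+: connect: connection refused"
    ),
    "dns_no_such_host": re.compile(r"lookup \S+ on \S+: no such host"),
}


def classify(log_text):
    """Count how many log lines match each known failure pattern."""
    counts = Counter()
    for line in log_text.splitlines():
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


# Three message bodies taken from the log excerpt above (prefixes trimmed).
SAMPLE_LOG = """\
[DEBUG] TCP connection to SSH ip/port failed: dial tcp [::1]:8503: connect: connection refused
Retryable error: TargetNotConnected: i-027c2e698ea22d1fa is not connected.
Retryable error: TargetNotConnected: i-027c2e698ea22d1fa is not connected.
"""

if __name__ == "__main__":
    for label, count in classify(SAMPLE_LOG).items():
        print(f"{label}: {count}")
```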

hahuang65 commented 3 months ago

Looks like this is all user error. Layers of problems:

  1. /etc/hosts didn't have entries for localhost
  2. My NAT instance wasn't running

Sorry for the trouble and wasted time/attention.