hashicorp / packer

Packer is a tool for creating identical machine images for multiple platforms from a single source configuration.
http://www.packer.io

Session Manager SSH hanging on shell provisioner #10584

Closed artis3n closed 3 years ago

artis3n commented 3 years ago

Overview of the Issue

I am attempting to create an AMI using the amazon-ebs builder. My provisioners are an Ansible playbook, a shell provisioner that reboots the server, and a final shell provisioner that verifies everything is OK after the reboot. When using ssh_interface: session_manager, Packer hangs while trying to open a new SSH session for the final shell provisioner (the first two work fine). While Packer hangs locally, I can still start a new Session Manager session to the Packer builder instance through the AWS console.

If I change ssh_interface to public_ip, the AMI build completes in ~31 minutes. With session_manager, the build consistently hangs at the same place.
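
The workaround above amounts to a one-attribute change in the source block. A minimal sketch, showing only the changed line; every other attribute is assumed to stay as in the buildfile included in this issue:

```hcl
source "amazon-ebs" "wiki" {
  # ... other attributes unchanged ...

  # Connect over the instance's public IP instead of through Session Manager.
  ssh_interface = "public_ip"
  ssh_username  = var.ec2_username
}
```

With public_ip, Packer connects to the instance directly over SSH, so the instance must be reachable on its public address.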

This seems materially different from these existing issues with similar-sounding titles: #10424, #10508

Reproduction Steps

  1. Create Packer file. If it matters, I am using the new HCL2 format.
  2. Run PACKER_LOG=1 packer build wiki.pkr.hcl
  3. Observe the build hangs at the final shell provisioner
==> Personal Wiki.amazon-ebs.wiki: Pausing 1m0s before the next provisioner...
==> Personal Wiki.amazon-ebs.wiki: Provisioning with shell script: /tmp/packer-shell694354012
2021/02/06 20:00:40 packer-provisioner-shell plugin: Opening /tmp/packer-shell694354012 for reading
2021/02/06 20:00:40 packer-provisioner-shell plugin: [INFO] 66 bytes written for 'uploadData'
2021/02/06 20:00:40 [INFO] 66 bytes written for 'uploadData'
2021/02/06 20:00:40 packer-builder-amazon-ebs plugin: [DEBUG] Opening new ssh session
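
To check whether the hang depends on the Ansible step or only on the reboot, the build could be reduced to the two final provisioners. The following is a sketch that reuses the source block and provisioner settings from the buildfile included in this issue; the build name is hypothetical, and it has not been verified that this reduced build still hangs.

```hcl
# Hypothetical reduced build: only the reboot and the post-reboot check.
build {
  sources = ["source.amazon-ebs.wiki"]
  name    = "Reboot Repro" # assumed name, not from the original file

  # Reboot the instance; the SSH connection is expected to drop.
  provisioner "shell" {
    inline            = ["sudo reboot"]
    expect_disconnect = true
  }

  # Runs after the reboot; this is the step that opens a new SSH session.
  provisioner "shell" {
    inline       = ["echo ${build.ID} rebooted, done provisioning"]
    pause_before = "1m"
  }
}
```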

Packer version

➜ packer version              
Packer v1.6.6

Simplified Packer Buildfile

The latest version of the file can be found here. The file included below is the version that produces the error, in case I materially change the linked file after creating this issue.

Packer HCL2 setup

```hcl
source "amazon-ebs" "wiki" {
  access_key              = var.aws_access_key
  secret_key              = var.aws_secret_key
  ami_description         = "Gollum wiki hosted on AWS"
  ami_name                = "${var.ami_name}-${local.timestamp}"
  ami_virtualization_type = "hvm"
  iam_instance_profile    = var.iam_instance_profile
  instance_type           = var.instance_type[var.architecture]
  region                  = var.aws_region
  ssh_interface           = "session_manager"
  ssh_username            = var.ec2_username

  launch_block_device_mappings {
    delete_on_termination = true
    device_name           = "/dev/xvda"
    encrypted             = true
    kms_key_id            = var.kms_key_id_or_alias
    volume_size           = var.disk_size
    volume_type           = var.disk_type
    throughput            = var.disk_throughput
    iops                  = var.disk_iops
  }

  source_ami_filter {
    filters = {
      architecture        = var.architecture
      name                = "amzn2-ami-hvm*"
      root-device-type    = "ebs"
      virtualization-type = "hvm"
    }
    most_recent = true
    owners      = ["amazon"]
  }

  tags = {
    Base_AMI      = "{{ .SourceAMI }}"
    Base_AMI_Name = "{{ .SourceAMIName }}"
  }
}

build {
  sources = ["source.amazon-ebs.wiki"]
  name    = "Personal Wiki"

  provisioner "shell" {
    inline = [
      "while [ ! -f /var/lib/cloud/instance/boot-finished ]; do echo 'Waiting for cloud-init...'; sleep 1; done",
      "echo Beginning to build ${build.ID}",
      "echo Connected via SSM at '${build.User}@${build.Host}:${build.Port}'"
    ]
  }

  provisioner "shell" {
    inline = [
      "sudo yum update -y",
      "sudo yum install -y python3 python3-pip python3-wheel python3-setuptools coreutils shadow-utils yum-utils"
    ]
  }

  provisioner "ansible" {
    galaxy_file      = "packer/ansible/requirements.yml"
    host_alias       = "wiki"
    playbook_file    = "packer/ansible/main.yml"
    user             = var.ec2_username
    ansible_env_vars = ["ANSIBLE_VAULT_PASSWORD_FILE=${var.ansible_vault_pwd_file}"]
  }

  provisioner "shell" {
    inline            = ["sudo reboot"]
    expect_disconnect = true
  }

  provisioner "shell" {
    inline       = ["echo ${build.ID} rebooted, done provisioning"]
    pause_before = "1m"
  }
}

# "timestamp" template function replacement
locals {
  timestamp = regex_replace(timestamp(), "[- TZ:]", "")
}

variable "ec2_username" {
  type        = string
  description = "The username of the default user on the EC2 instance."
  default     = "ec2-user"
}

variable "ami_name" {
  type        = string
  description = "The name of the AMI that gets generated."
  default     = "packer-gollum-wiki"
}

variable "architecture" {
  type        = string
  description = "The type of source AMI architecture: either x86_64 or arm64."
  default     = "arm64"
}

variable "aws_access_key" {
  type        = string
  description = "AWS_ACCESS_KEY_ID env var."
  default     = env("AWS_ACCESS_KEY_ID")
}

variable "aws_region" {
  type        = string
  description = "The AWS region to create the image in. Defaults to us-east-2."
  default     = "us-east-2"
}

variable "aws_secret_key" {
  type        = string
  description = "AWS_SECRET_ACCESS_KEY env var."
  default     = env("AWS_SECRET_ACCESS_KEY")
  sensitive   = true
}

variable "disk_size" {
  type        = number
  description = "The size of the EBS volume to create."
  default     = 15
}

variable "disk_type" {
  type        = string
  description = "The type of EBS volume to create. Defaults to gp3."
  default     = "gp3"
}

variable "disk_throughput" {
  type        = number
  description = "The MB/s of throughput for the EBS volume. For GP3 volumes, this defaults to 125."
  default     = 125
}

variable "disk_iops" {
  type        = number
  description = "The IOPS for the EBS volume. For GP3 volumes, this defaults to 3000."
  default     = 3000
}

variable "iam_instance_profile" {
  type        = string
  default     = "AmazonSSMRoleForInstancesQuickSetup"
  description = "IAM instance profile configured for AWS Session Manager. Defaults to the default AWS role for Session Manager."
}

variable "instance_type" {
  type        = map(string)
  description = "The type of EC2 instance to create. Defaults are set for x86_64 and arm64 architectures. Overwrite the one that you want by architecture."
  default = {
    "x86_64" : "t3.micro",
    "arm64" : "t4g.micro"
  }
}

variable "kms_key_id_or_alias" {
  type        = string
  description = "The KMS key ID or alias to encrypt the AMI with. Defaults to the default EBS key alias."
  default     = "alias/aws/ebs"
}

variable "ansible_vault_pwd_file" {
  type        = string
  description = "The relative or absolute path to the Ansible Vault password file."
  default     = env("ANSIBLE_VAULT_PASSWORD_FILE")
}
```

Operating system and Environment details

➜ cat /etc/lsb-release                 
DISTRIB_ID=Pop
DISTRIB_RELEASE=20.10
DISTRIB_CODENAME=groovy
DISTRIB_DESCRIPTION="Pop!_OS 20.10"

Pop!_OS is based on Ubuntu, so it should behave equivalently.

Log Fragments and crash.log files

2021/02/06 19:59:40 [INFO] (telemetry) Starting provisioner shell
==> Personal Wiki.amazon-ebs.wiki: Pausing 1m0s before the next provisioner...
==> Personal Wiki.amazon-ebs.wiki: Provisioning with shell script: /tmp/packer-shell694354012
2021/02/06 20:00:40 packer-provisioner-shell plugin: Opening /tmp/packer-shell694354012 for reading
2021/02/06 20:00:40 packer-provisioner-shell plugin: [INFO] 66 bytes written for 'uploadData'
2021/02/06 20:00:40 [INFO] 66 bytes written for 'uploadData'
2021/02/06 20:00:40 packer-builder-amazon-ebs plugin: [DEBUG] Opening new ssh session

After I pressed Ctrl+C to end the command execution, I got the following logs. I'm not sure whether they are expected after a user interruption or whether they contain anything useful.

2021/02/06 20:54:30 packer-provisioner-shell plugin: Received interrupt signal (count: 1). Ignoring.
Cancelling build after receiving interrupt
2021/02/06 20:54:30 packer-provisioner-shell-local plugin: Received interrupt signal (count: 1). Ignoring.
2021/02/06 20:54:30 Cancelling builder after context cancellation context canceled
    Personal Wiki.amazon-ebs.wiki: Terminate signal received, exiting.
2021/02/06 20:54:30 packer-provisioner-shell plugin: Received interrupt signal (count: 1). Ignoring.
2021/02/06 20:54:30 packer-builder-amazon-ebs plugin: Received interrupt signal (count: 1). Ignoring.
2021/02/06 20:54:30 packer-provisioner-ansible plugin: Received interrupt signal (count: 1). Ignoring.
2021/02/06 20:54:30 packer-provisioner-shell plugin: Received interrupt signal (count: 1). Ignoring.
==> Personal Wiki.amazon-ebs.wiki: Terminating the source AWS instance...
2021/02/06 20:54:30 packer-builder-amazon-ebs plugin: [ERROR] ssh session open error: 'ssh: unexpected packet in response to channel open: <nil>', attempting reconnect
2021/02/06 20:54:30 packer-builder-amazon-ebs plugin: [DEBUG] reconnecting to TCP connection for SSH
2021/02/06 20:54:30 packer-builder-amazon-ebs plugin: Cancelling provisioning due to context cancellation: context canceled
2021/02/06 20:54:30 packer-builder-amazon-ebs plugin: Cancelling hook after context cancellation context canceled
    Personal Wiki.amazon-ebs.wiki: Exiting session with sessionId: terraform-0a5c79bb8713db77e.
2021/02/06 20:54:30 Cancelling provisioner after context cancellation context canceled
2021/02/06 20:54:30 packer-builder-amazon-ebs plugin: [DEBUG] handshaking with SSH
    Personal Wiki.amazon-ebs.wiki: Cannot perform start session: write tcp 192.168.1.162:59110->52.95.19.43:443: write: broken pipe
2021/02/06 20:54:30 packer-provisioner-shell plugin: Retryable error: Error uploading script: ssh: handshake failed: read tcp 127.0.0.1:36674->127.0.0.1:8772: read: connection reset by peer
2021/02/06 20:54:30 [INFO] (telemetry) ending shell
==> Personal Wiki.amazon-ebs.wiki: Cleaning up any extra volumes...
==> Personal Wiki.amazon-ebs.wiki: No volumes to clean up, skipping
==> Personal Wiki.amazon-ebs.wiki: Deleting temporary security group...
==> Personal Wiki.amazon-ebs.wiki: Deleting temporary keypair...
2021/02/06 20:55:17 [INFO] (telemetry) ending 
==> Wait completed after 1 hour 22 minutes
2021/02/06 20:55:17 [INFO] (telemetry) Finalizing.

ghost commented 3 years ago

This issue has been automatically migrated to hashicorp/packer-plugin-amazon#28 because it looks like an issue with that plugin. If you believe this is not an issue with the plugin, please reply to hashicorp/packer-plugin-amazon#28.
