hashicorp / packer

Packer is a tool for creating identical machine images for multiple platforms from a single source configuration.
http://www.packer.io

ssh timeout `Remote command exited without exit status or exit signal.` #12263

Open spencer-cdw opened 1 year ago

spencer-cdw commented 1 year ago

Overview of the Issue

When building Ubuntu 20.04 images on GCP, Packer always fails with the following error:

2023/02/10 20:07:25 packer-plugin-googlecompute_v1.0.16_x5.0_linux_amd64 plugin: 2023/02/10 20:07:25 [DEBUG] starting remote command: echo 'ubuntu' | sudo -S sh -c 'PACKER_BUILDER_TYPE='googlecompute' PACKER_BUILD_NAME='ubuntu-focal'  /tmp/script_5166.sh'
2023/02/10 20:07:25 packer-plugin-googlecompute_v1.0.16_x5.0_linux_amd64 plugin: 2023/02/10 20:07:25 [ERROR] Remote command exited without exit status or exit signal.
2023/02/10 20:07:25 packer-plugin-googlecompute_v1.0.16_x5.0_linux_amd64 plugin: 2023/02/10 20:07:25 [INFO] RPC endpoint: Communicator ended with: 2300218
2023/02/10 20:07:25 [INFO] 0 bytes written for 'stdout'
2023/02/10 20:07:25 [INFO] 0 bytes written for 'stderr'
2023/02/10 20:07:25 [INFO] RPC client: Communicator ended with: 2300218
2023/02/10 20:07:25 [INFO] RPC endpoint: Communicator ended with: 2300218
2023/02/10 20:07:25 packer-provisioner-shell plugin: [INFO] 0 bytes written for 'stdout'
2023/02/10 20:07:25 packer-provisioner-shell plugin: [INFO] 0 bytes written for 'stderr'
2023/02/10 20:07:25 packer-provisioner-shell plugin: [INFO] RPC client: Communicator ended with: 2300218

Note that this does not happen with Ubuntu 18.04 or Ubuntu 22.04 images, nor does it happen with Ubuntu 20.04 images on AWS or Azure.

Reproduction Steps

The error messages have been brought up before in the following issues:

Those issues were closed because they could not be reproduced. The following Packer configuration reproduces the error 100% of the time.

packer {
  required_plugins {
    googlecompute = {
      version = "v1.0.16"
      source  = "github.com/hashicorp/googlecompute"
    }
  }
}

variable "project_id" {
  type    = string
  default = "golden-image-management"
}

source "googlecompute" "ubuntu-focal" {
  project_id             = var.project_id
  source_image           = "ubuntu-2004-focal-v20230125"
  ssh_username           = "packer"
  ssh_read_write_timeout = "5m"
  zone                   = "us-central1-a"
  image_name             = "foobar-ubuntu-20-04-cis1-20230101-01"
}

build {
  name = "ubuntu-focal"
  sources = [
    "source.googlecompute.ubuntu-focal"
  ]

  provisioner "shell" {
    inline          = ["ip a"]
    execute_command = "echo 'ubuntu' | sudo -S sh -c '{{ .Vars }} {{ .Path }}'"
  }

  provisioner "shell" {
    inline            = ["reboot"]
    execute_command   = "echo 'ubuntu' | sudo -S sh -c '{{ .Vars }} {{ .Path }}'"
    expect_disconnect = true
    pause_after       = "120s"
  }

  provisioner "shell" {
    inline          = ["ip a"]
    execute_command = "echo 'ubuntu' | sudo -S sh -c '{{ .Vars }} {{ .Path }}'"
  }
}

Packer version

1.8.5

Log Fragments and crash.log files

https://gist.github.com/spencer-cdw/af6d980d91b2d156f96d1a9c9b1485ac

spencer-cdw commented 1 year ago

Workaround:

Force-kill sshd before rebooting.

https://github.com/hashicorp/packer/issues/354#issuecomment-23615630

  provisioner "shell" {
    # https://github.com/hashicorp/packer/issues/354#issuecomment-23615630
    inline            = ["ps aux | grep sshd | grep -v grep | awk '{print $2}' | xargs kill && reboot"]
    execute_command   = "echo 'ubuntu' | sudo -S sh -c '{{ .Vars }} {{ .Path }}'"
    expect_disconnect = true
    pause_after       = "120s"
  }
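
A variation on the same idea, offered only as an untested sketch: instead of killing sshd, schedule the reboot a minute out so the provisioning script can return a normal exit status before the connection drops. This assumes the image's `shutdown` accepts a `+1` delay (as systemd-based Ubuntu does) and that `pause_after` is long enough to cover both the delay and the boot time. It has not been verified against this bug.

```hcl
  provisioner "shell" {
    # Assumption: `shutdown -r +1` returns immediately and reboots about one
    # minute later, so the SSH session ends cleanly before sshd goes down.
    inline            = ["shutdown -r +1"]
    execute_command   = "echo 'ubuntu' | sudo -S sh -c '{{ .Vars }} {{ .Path }}'"
    expect_disconnect = true
    # Must cover the one-minute delay plus the time the instance needs to boot.
    pause_after       = "120s"
  }
```

If the instance takes longer to come back up, `pause_after` would need to be raised accordingly.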

BadJukeBox commented 1 year ago

For what it's worth, I have the same issue, and the workaround does not work for me. It happens while attempting to reboot on AL2.

cthulhuplus commented 1 year ago

I am experiencing this issue on AWS as well. The provisioning process hangs at random, and after ~20 minutes I get the same error:

2023/04/11 15:06:15 packer-builder-amazon-ebs plugin: [ERROR] Remote command exited without exit status or exit signal.
2023/04/11 15:06:15 packer-builder-amazon-ebs plugin: [INFO] RPC endpoint: Communicator ended with: 2300218
2023/04/11 15:06:15 [INFO] 15064 bytes written for 'stdout'
2023/04/11 15:06:15 [INFO] 48650 bytes written for 'stderr'
2023/04/11 15:06:15 [INFO] RPC client: Communicator ended with: 2300218
2023/04/11 15:06:15 [INFO] RPC endpoint: Communicator ended with: 2300218
2023/04/11 15:06:15 packer-provisioner-shell plugin: [INFO] 15064 bytes written for 'stdout'
2023/04/11 15:06:15 packer-provisioner-shell plugin: [INFO] 48650 bytes written for 'stderr'
2023/04/11 15:06:15 packer-provisioner-shell plugin: [INFO] RPC client: Communicator ended with: 2300218
2023/04/11 15:06:15 [INFO] (telemetry) ending shell