ChristianLempa / boilerplates

This is my personal template collection. Here you'll find templates and configurations for various tools and technologies.
MIT License

VM stuck at cloud-init running "init-local" #47

Closed EsharkyTheGreat closed 1 year ago

EsharkyTheGreat commented 2 years ago

I was following your video on creating an Ubuntu template with cloud-init on Proxmox using Packer, but ran into this issue:

image

The VM doesn't proceed further than this, and the script is still waiting for SSH to become accessible.

image

Have you faced this issue, or do you know of a fix? Any help is appreciated.

davosian commented 2 years ago

I am facing the same issue for ubuntu focal with exactly the same error messages. Unfortunately, I have not found a solution myself yet. Mainly because I have no idea how to debug this.

davosian commented 2 years ago

If I try to use 22.04 (Jammy) instead, I get stuck at the following screen:

image

In both cases, Packer stops at "Waiting for SSH to become available..." until it eventually times out.

ChristianLempa commented 2 years ago

Can you guys share the logs of your machines and user-data conf?

davosian commented 2 years ago

Absolutely! For user-data, I only changed the user to ubuntu and set the password to ubuntu:

#cloud-config
autoinstall:
  version: 1
  locale: en_US
  keyboard:
    layout: de
  ssh:
    install-server: true
    allow-pw: true
    disable_root: true
    ssh_quiet_keygen: true
    allow_public_ssh_keys: true
  packages:
    - qemu-guest-agent
    - sudo
  storage:
    layout:
      name: direct
    swap:
      size: 0
  user-data:
    package_upgrade: false
    timezone: Europe/Berlin
    users:
      - name: ubuntu
        groups: [adm, sudo]
        lock-passwd: false
        sudo: ALL=(ALL) NOPASSWD:ALL
        shell: /bin/bash
        passwd: $6$exDY1mhS4KUYCE/2$zmn9ToZwTKLhCw.b4/b.ZRTIZM30JZ4QrOQ2aOXJ8yk96xpcCof0kxKwuX1kqLG/ygbJ1f8wxED22bTL4F46P0

How can I send you the server logs? Since I do not get any further, there is no SSH connection set up yet.

EsharkyTheGreat commented 2 years ago

@davosian have you found a fix for it?

davosian commented 2 years ago

Unfortunately not yet. As a workaround, I ditched Packer and went with cloud-based images created with qm instead: https://austinsnerdythings.com/2021/08/30/how-to-create-a-proxmox-ubuntu-cloud-init-image/

I prefer the Packer approach, but I spent way too much time on it without success. I tried it with Ubuntu 20.04, Ubuntu 22.04, and Debian 11. None of them worked for me, so it looks like the issue is a more general one (or I have more than one issue with this approach).

Curious to hear if someone has found a solution.

thegostisdead commented 2 years ago

I got the same error when trying SSH username and password auth with Ubuntu Jammy. I switched to SSH key auth and it works. In user-data:

#cloud-config
autoinstall:
  version: 1
  locale: en_US
  keyboard:
    layout: en
  ssh:
    install-server: true
    allow-pw: true
    disable_root: false
    ssh_quiet_keygen: true
    allow_public_ssh_keys: true
  packages:
    - qemu-guest-agent
    - sudo
  storage:
    layout:
      name: direct
    swap:
      size: 0
  user-data:
    package_upgrade: false
    timezone: Europe/Paris
    users:
      - name: ubuntu
        groups: [adm, sudo]
        lock-passwd: false
        sudo: ALL=(ALL) NOPASSWD:ALL
        shell: /bin/bash
        #passwd: test
        # - or -
        ssh_authorized_keys:
         - ssh-rsa AAAAB3Nz.......

In ubuntu-server-jammy-docker.pkr

    ssh_username = "ubuntu"

    # (Option 1) Add your Password here
    #ssh_password = "${var.vm_ssh_password}"
    # - or -
    # (Option 2) Add your Private SSH KEY file here
    ssh_private_key_file = "~/.ssh/id_rsa"

    # Raise the timeout, when installation takes longer
    ssh_timeout = "30m"

I hope that it will fix your issue.

dortlii commented 1 year ago

Hi @EsharkyTheGreat & @davosian

I had similar issues, and setting the following options helped me:

https://github.com/xcad2k/boilerplates/blob/de7670a9068201b2eb5316ee539ca4446ca10cff/packer/proxmox/ubuntu-server-jammy/ubuntu-server-jammy.pkr.hcl#L90-L93

After setting these options, it turned out that a firewall in my home lab was blocking the connection. Maybe this helps in finding the issue.
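For readers who cannot follow the link, here is a minimal sketch of what enabling such options might look like. It assumes the linked lines are the optional HTTP bind settings that appear commented out in the template quoted further down this thread; the values are the template's own placeholders, not verified settings for any particular network.

```hcl
# Fragment for the source "proxmox" block; not a complete template.
# Assumption: these are the options the link above points at.

# Listen on all interfaces of the machine running Packer.
http_bind_address = "0.0.0.0"

# Pin the HTTP server to a single, predictable port
# (min == max) instead of a randomly chosen one.
http_port_min = 8802
http_port_max = 8802
```

With a fixed port, a firewall rule on the build machine can allow inbound traffic to exactly that port, which may be what resolved the blocking described above.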

davosian commented 1 year ago

Thanks a lot for the tip @dortlii . Will have to give it another shot 👍

andrei-matei commented 1 year ago

Check whether the VM can get a DHCP lease. Also, this used to work for me from my Mac, but since I upgraded to Ventura, the VM does not seem to reach the HTTP server even though it starts. I installed an Ubuntu machine in Proxmox and now build the images from there.

ChristianLempa commented 1 year ago

Most likely these types of issues are network issues. @EsharkyTheGreat, can you comment on whether this is still an issue?

ChristianLempa commented 1 year ago

Issue gone cold.

cindrmon commented 1 year ago

This is also an issue for me. I don't know how to show the debug logs, but this is all I got from it:

ubuntu-server-focal-test-1.proxmox.ubuntu-server-focal-test-1: output will be in this color.

==> ubuntu-server-focal-test-1.proxmox.ubuntu-server-focal-test-1: Creating VM
==> ubuntu-server-focal-test-1.proxmox.ubuntu-server-focal-test-1: Starting VM
==> ubuntu-server-focal-test-1.proxmox.ubuntu-server-focal-test-1: Starting HTTP server on port 8366
==> ubuntu-server-focal-test-1.proxmox.ubuntu-server-focal-test-1: Waiting 5s for boot
==> ubuntu-server-focal-test-1.proxmox.ubuntu-server-focal-test-1: Typing the boot command
==> ubuntu-server-focal-test-1.proxmox.ubuntu-server-focal-test-1: Waiting for SSH to become available...
==> ubuntu-server-focal-test-1.proxmox.ubuntu-server-focal-test-1: Timeout waiting for SSH.
==> ubuntu-server-focal-test-1.proxmox.ubuntu-server-focal-test-1: Stopping VM
==> ubuntu-server-focal-test-1.proxmox.ubuntu-server-focal-test-1: Deleting VM
Build 'ubuntu-server-focal-test-1.proxmox.ubuntu-server-focal-test-1' errored after 20 minutes 21 seconds: Timeout waiting for SSH.

==> Wait completed after 20 minutes 21 seconds

==> Some builds didn't complete successfully and had errors:
--> ubuntu-server-focal-test-1.proxmox.ubuntu-server-focal-test-1: Timeout waiting for SSH.

==> Builds finished but no artifacts were created.

This is my current user-data file:

#cloud-config
autoinstall:
  version: 1
  locale: en_US
  keyboard:
    layout: us
  ssh:
    install-server: true
    allow-pw: true
    disable_root: true
    ssh_quiet_keygen: true
    allow_public_ssh_keys: true
  packages:
    - qemu-guest-agent
    - sudo
  storage:
    layout:
      name: direct
    swap:
      size: 0
  user-data:
    package_upgrade: false
    timezone: Asia/Manila
    users:
      - name: cindrmon
        groups: [adm, sudo]
        lock-passwd: false
        sudo: ALL=(ALL) NOPASSWD:ALL
        shell: /bin/bash
        # passwd: your-password
        # - or -
        ssh_authorized_keys:
          - <my_public_key>

And this is my ubuntu-server-focal.pkr.hcl file:

# Ubuntu Server Focal
# ---
# Packer Template to create an Ubuntu Server (Focal) on Proxmox
# From Christian Lempa: https://github.com/ChristianLempa/boilerplates/blob/main/packer/proxmox/ubuntu-server-focal/ubuntu-server-focal.pkr.hcl

# Variable Definitions
variable "proxmox_api_url" {
    type = string
}

variable "proxmox_api_token_id" {
    type = string
}

variable "proxmox_api_token_secret" {
    type = string
    sensitive = true
}

# Resource Definition for the VM Template
source "proxmox" "ubuntu-server-focal-test-1" {

    # Proxmox Connection Settings
    proxmox_url = var.proxmox_api_url
    username = var.proxmox_api_token_id
    token = var.proxmox_api_token_secret
    insecure_skip_tls_verify = true

    # VM General Settings
    node = "proxmox"
    vm_id = "998"
    vm_name = "ubuntu-server-focal-test-1"
    template_description = "Ubuntu Server Focal Image"

    # VM OS Settings
    # (Local ISO)
    iso_file = "local:iso/ubuntu-20.04.5-live-server-amd64.iso"
    iso_storage_pool = "local"
    # or (Download ISO)
    // iso_file = "https://www.releases.ubuntu.com/focal/ubuntu-20.04.5-live-server-amd64.iso"
    // iso_checksum = "5035be37a7e9abbdc09f0d257f3e33416c1a0fb322ba860d42d74aa75c3468d4"
    unmount_iso = true

    # VM System Settings
    qemu_agent = true

    # VM Hard Disk Settings
    scsi_controller = "virtio-scsi-pci"

    disks {
        disk_size = "10G"
        // format = "qcow2"
        storage_pool = "local-lvm"
        storage_pool_type = "lvm"
        type = "scsi"
    }

    # VM CPU Settings
    cores = "1"

    # VM Memory Settings
    memory = "1024"

    # VM Network Settings
    network_adapters {
        model = "virtio"
        bridge = "vmbr0"
        firewall = false
    }

    # VM Cloud-Init Settings
    cloud_init = true
    cloud_init_storage_pool = "local-lvm"

    # PACKER Boot Commands
    boot_command = [
        "<esc><wait><esc><wait>",
        "<f6><wait><esc><wait>",
        "<bs><bs><bs><bs>",
        "autoinstall ds=nocloud-net;s=http://{{ .HTTPIP }}:{{ .HTTPPort }}/ ",
        "--- <enter>"
    ]
    boot = "c"
    boot_wait = "5s"

    # PACKER AutoInstall Settings
    http_directory = "http"

    ## (Optional) Bind IP Address to Port
    // http_bind_address = "0.0.0.0"
    // http_bind_address = "192.168.122.200"
    // http_port_min = 8802
    // http_port_max = 8802

    ssh_username = "cindrmon"

    ## Option 1) Add Your Own Password
    # ssh_password = ""

    ## or Option 2) Add Private SSH Key
    ssh_private_key_file = "~/.ssh/ubuntuserver_lzh"

    # Raise the timeout, when installation takes longer
    ssh_timeout = "20m"
}

build {

    name = "ubuntu-server-focal-test-1"
    sources = ["source.proxmox.ubuntu-server-focal-test-1"]

    # Provisioning Cloud-Init Integration 
    ## 1) Post-Install
    provisioner "shell" {
        inline = [
            "while [ ! -f /var/lib/cloud/instance/boot-finished ]; do echo 'Waiting for cloud-init...'; sleep 1; done",
            "sudo rm /etc/ssh/ssh_host_*",
            "sudo truncate -s 0 /etc/machine-id",
            "sudo apt -y autoremove --purge",
            "sudo apt -y clean",
            "sudo apt -y autoclean",
            "sudo cloud-init clean",
            "sudo rm -f /etc/cloud/cloud.cfg.d/subiquity-disable-cloudinit-networking.cfg",
            "sudo sync"
        ]
    }

    ## 2) Cleanup
    provisioner "file" {
        source = "files/99-pve.cfg"
        destination = "/tmp/99-pve.cfg"
    }

    ## 3) Preparation
    provisioner "shell" {
        inline = [ "sudo cp /tmp/99-pve.cfg /etc/cloud/cloud.cfg.d/99-pve.cfg" ]
    }
}

spenceradolph commented 1 year ago

I just troubleshot this issue, and for me it was probably networking. I initially thought I could run the script from my Windows desktop and target the Proxmox server running in my homelab on the same network. However, I was using WSL, and I think it NATs to a different IP space than the one the host uses. I ended up running an LXC Debian container on Proxmox and installing Packer there, along with the scripts needed. It seems to be working slightly better, at least.
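If the address Packer substitutes for {{ .HTTPIP }} is not one the VM can reach, as suspected here with WSL, one possible workaround is to hard-code the URL in the boot command. The sketch below reuses the focal boot command posted earlier in this thread. The address 192.0.2.10 is a hypothetical placeholder and port 8802 is taken from the template's commented-out example; neither is confirmed by anyone in the thread.

```hcl
# Fragment for the source "proxmox" block; not a complete template.
# 192.0.2.10 is a hypothetical placeholder: substitute an address of the
# build machine that the Proxmox VM can actually reach.

# Pin the HTTP server port so the hard-coded URL below stays valid.
http_port_min = 8802
http_port_max = 8802

boot_command = [
    "<esc><wait><esc><wait>",
    "<f6><wait><esc><wait>",
    "<bs><bs><bs><bs>",
    # Fixed address and port instead of {{ .HTTPIP }} and {{ .HTTPPort }}.
    "autoinstall ds=nocloud-net;s=http://192.0.2.10:8802/ ",
    "--- <enter>"
]
```

This only helps if traffic from the VM can actually reach that address and port. Behind NAT, that may additionally require port forwarding on the host, in which case running Packer from a machine on the same network as Proxmox, as described above, is likely the simpler route.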