joshzarrabi / workstation

setup.sh fails to install things on VMs #2

Open fg-j opened 4 years ago

fg-j commented 4 years ago

Here's the log output for the weirdness I'm seeing:

➜  workstation git:(master) ✗ ./setup.sh -n ridgewood -p cf-buildpacks -s /tmp/deleteme-acct-key

Initializing the backend...

Initializing provider plugins...
- Using previously-installed hashicorp/google v2.20.3
- Using previously-installed hashicorp/tls v2.2.0

The following providers do not have any version constraints in configuration,
so the latest version was installed.

To prevent automatic upgrades to new major versions that may contain breaking
changes, we recommend adding version constraints in a required_providers block
in your configuration, with the constraint strings suggested below.

* hashicorp/tls: version = "~> 2.2.0"

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
tls_private_key.my-key: Refreshing state... [id=3621403c148ab596d4324d03668960103f71223a]

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # google_compute_address.ip_address will be created
  + resource "google_compute_address" "ip_address" {
      + address            = (known after apply)
      + address_type       = "EXTERNAL"
      + creation_timestamp = (known after apply)
      + id                 = (known after apply)
      + name               = "ridgewood-address"
      + network_tier       = (known after apply)
      + project            = (known after apply)
      + purpose            = (known after apply)
      + region             = (known after apply)
      + self_link          = (known after apply)
      + subnetwork         = (known after apply)
      + users              = (known after apply)
    }

  # google_compute_attached_disk.default will be created
  + resource "google_compute_attached_disk" "default" {
      + device_name = (known after apply)
      + disk        = (known after apply)
      + id          = (known after apply)
      + instance    = (known after apply)
      + mode        = "READ_WRITE"
      + project     = (known after apply)
      + zone        = "us-central1-a"
    }

  # google_compute_disk.default will be created
  + resource "google_compute_disk" "default" {
      + creation_timestamp         = (known after apply)
      + disk_encryption_key_sha256 = (known after apply)
      + id                         = (known after apply)
      + image                      = "ubuntu-os-cloud/ubuntu-1804-lts"
      + label_fingerprint          = (known after apply)
      + last_attach_timestamp      = (known after apply)
      + last_detach_timestamp      = (known after apply)
      + name                       = "ridgewood-disk"
      + physical_block_size_bytes  = 4096
      + project                    = (known after apply)
      + self_link                  = (known after apply)
      + size                       = 1000
      + source_image_id            = (known after apply)
      + source_snapshot_id         = (known after apply)
      + type                       = "pd-ssd"
      + users                      = (known after apply)
      + zone                       = "us-central1-a"
    }

  # google_compute_firewall.external will be created
  + resource "google_compute_firewall" "external" {
      + creation_timestamp = (known after apply)
      + destination_ranges = (known after apply)
      + direction          = (known after apply)
      + id                 = (known after apply)
      + name               = "ridgewood-external"
      + network            = "default"
      + priority           = 1000
      + project            = (known after apply)
      + self_link          = (known after apply)
      + source_ranges      = (known after apply)
      + target_tags        = [
          + "ridgewood",
        ]

      + allow {
          + ports    = [
              + "22",
            ]
          + protocol = "tcp"
        }
      + allow {
          + ports    = []
          + protocol = "icmp"
        }
    }

  # google_compute_instance.default will be created
  + resource "google_compute_instance" "default" {
      + can_ip_forward       = false
      + cpu_platform         = (known after apply)
      + deletion_protection  = false
      + guest_accelerator    = (known after apply)
      + id                   = (known after apply)
      + instance_id          = (known after apply)
      + label_fingerprint    = (known after apply)
      + machine_type         = "n1-standard-8"
      + metadata             = {
          + "ssh-keys" = <<~EOT
                ubuntu:ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQCf9z5vuhIMME8clXSYxPcAr0k/fUdo8KicyUMFDl/E7bXjfpbJeyVcS6hdCwAj4bBoERIkJdULIX/y6lWixAY8DgKVTgt+G5OoWQbDWCfAshF1Y+10CY9C7GZcQXEOQCg1ejDqLA1PVN0+JxXlyl8GJE3HYNTuRa0TE08hssfERjVI3wNUENisXOWDqEjWJ/yWdjdQusNKVq6m7B6xXHIlMs82Z6iR1klQk/p7YnwgmdT6dQsSWqIIVt4sKoULV2vufMeffPFEEpRsdnsreRyFvkSuz0QedWhBAdFDKUWsxtE3F7IDAD5ruJ3cE7jXAAgCiqB/KbHiOvAJwvLTQbjM8AxSMwTTnLFuoxyMdrC3MsII5jyuX8N8PMnm0IN9U+RRl6DGaHFLbAoT2AT/9ep3/aslVi+naJHCV011swZp5VvbyxVpg9C5wXteFK7cw82r42Ehq2iestV8aH1cu8WD6X/f7VTnIwXUmkwGdYRuTf6/eo7dm9DZqUfnYc0PuuPZhEPlRN0uVz4YcxLbz2lTGx5U0z2U8TRMg8mNGsMeR9WNqLOx0PmdjOa1ea9FNB/DmVRllNSLhuDjoC4qpecE5WYnWcWrFQ1dmucZDWmIrEK+t2KQrntbje2M8WUk39da0CfhPOWkdFZg29Ypa37/PAVXP4oAn/SVpoMoYXU/vw==
            EOT
        }
      + metadata_fingerprint = (known after apply)
      + name                 = "ridgewood"
      + project              = (known after apply)
      + self_link            = (known after apply)
      + tags                 = [
          + "ridgewood",
        ]
      + tags_fingerprint     = (known after apply)
      + zone                 = "us-central1-a"

      + boot_disk {
          + auto_delete                = true
          + device_name                = (known after apply)
          + disk_encryption_key_sha256 = (known after apply)
          + kms_key_self_link          = (known after apply)
          + mode                       = "READ_WRITE"
          + source                     = (known after apply)

          + initialize_params {
              + image  = "ubuntu-os-cloud/ubuntu-1804-lts"
              + labels = (known after apply)
              + size   = (known after apply)
              + type   = (known after apply)
            }
        }

      + network_interface {
          + address            = (known after apply)
          + name               = (known after apply)
          + network            = "default"
          + network_ip         = (known after apply)
          + subnetwork         = (known after apply)
          + subnetwork_project = (known after apply)

          + access_config {
              + assigned_nat_ip = (known after apply)
              + nat_ip          = (known after apply)
              + network_tier    = (known after apply)
            }
        }

      + scheduling {
          + automatic_restart   = (known after apply)
          + on_host_maintenance = (known after apply)
          + preemptible         = (known after apply)

          + node_affinities {
              + key      = (known after apply)
              + operator = (known after apply)
              + values   = (known after apply)
            }
        }

      + scratch_disk {
          + interface = "SCSI"
        }
    }

Plan: 5 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + vm_ip = (known after apply)

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

google_compute_address.ip_address: Creating...
google_compute_firewall.external: Creating...
google_compute_disk.default: Creating...
google_compute_address.ip_address: Creation complete after 4s [id=cf-buildpacks/us-central1/ridgewood-address]
google_compute_instance.default: Creating...
google_compute_disk.default: Creation complete after 4s [id=ridgewood-disk]
google_compute_firewall.external: Creation complete after 8s [id=ridgewood-external]
google_compute_instance.default: Still creating... [10s elapsed]
google_compute_instance.default: Still creating... [20s elapsed]
google_compute_instance.default: Creation complete after 28s [id=ridgewood]
google_compute_attached_disk.default: Creating...
google_compute_attached_disk.default: Creation complete after 8s [id=ridgewood:ridgewood-disk]

Apply complete! Resources: 5 added, 0 changed, 0 destroyed.

Outputs:

ssh_private_key = <sensitive>
vm_ip = 34.69.74.100
# Host 34.69.74.100 found: line 25
/Users/pivotal/.ssh/known_hosts updated.
Original contents retained as /Users/pivotal/.ssh/known_hosts.old
waiting for vm
.Pseudo-terminal will not be allocated because stdin is not a terminal.
The authenticity of host '34.69.74.100 (34.69.74.100)' can't be established.
ECDSA key fingerprint is SHA256:unrOqyMhGdfp+bncKyB01HepoqDY6bJDXJm+2F3nVJA.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '34.69.74.100' (ECDSA) to the list of known hosts.
Connection closed by 34.69.74.100 port 22
➜  workstation git:(master) ✗ ./ssh.sh -n ridgewood -s /tmp/deleteme-acct-key
Activated service account credentials for: [bal-bbl-service-account@cf-buildpacks.iam.gserviceaccount.com]
Copying gs://cf-buildpacks-workstations/ridgewood/default.tfstate...
/ [1 files][ 18.9 KiB/ 18.9 KiB]
Operation completed over 1 objects/18.9 KiB.
Welcome to Ubuntu 18.04.5 LTS (GNU/Linux 5.4.0-1024-gcp x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

  System information as of Wed Sep  9 12:49:25 UTC 2020

  System load:  0.12               Processes:           180
  Usage of /:   0.1% of 969.00GB   Users logged in:     0
  Memory usage: 1%                 IP address for ens5: 10.128.0.43
  Swap usage:   0%

0 packages can be updated.
0 updates are security updates.

The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.

To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.

ubuntu@ridgewood:~$ ll
total 32
drwxr-xr-x  5 ubuntu ubuntu 4096 Sep  9 12:49 ./
drwxr-xr-x 13 root   root   4096 Sep  9 12:46 ../
-rw-r--r--  1 ubuntu ubuntu  220 Apr  4  2018 .bash_logout
-rw-r--r--  1 ubuntu ubuntu 3771 Apr  4  2018 .bashrc
drwx------  2 ubuntu ubuntu 4096 Sep  9 12:49 .cache/
drwx------  3 ubuntu ubuntu 4096 Sep  9 12:49 .gnupg/
-rw-r--r--  1 ubuntu ubuntu  807 Apr  4  2018 .profile
drwx------  2 ubuntu ubuntu 4096 Sep  9 12:46 .ssh/
ubuntu@ridgewood:~$ exit

joshzarrabi commented 4 years ago

@floragj This looks like a flake to me. I would try running the same command again. It should pick up where it left off and hopefully succeed on the second run.

joshzarrabi commented 4 years ago

This is a problem with the script: the check at https://github.com/joshzarrabi/workstation/blob/master/setup.sh#L84-L89 does not properly verify that the VM is ready before the install step runs. We need either a more reliable readiness check or retry logic.

I tried adding a sleep 10 after the VM reports that it's ready, but that doesn't seem to have made a difference, which is weird...
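
For reference, here is a rough sketch of what the retry logic could look like. It is not what setup.sh currently does. `retry` and `vm_ready` are hypothetical helper names, and the probe assumes the VM's host key is already in known_hosts and that the login user is ubuntu (as in the log above). The idea is to treat the VM as ready only once a real command runs over SSH, and to retry with a delay instead of sleeping once.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from setup.sh): run a command until it succeeds
# or the attempt budget is exhausted.
# usage: retry <max_attempts> <delay_seconds> <command> [args...]
retry() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "gave up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "attempt ${attempt} failed, retrying in ${delay}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Hypothetical readiness probe: succeeds only when a command actually runs
# on the VM. BatchMode=yes makes ssh fail instead of prompting, so an
# unreachable or not-yet-trusted host counts as a failed attempt.
# Assumes the host key is already in known_hosts.
vm_ready() {
  ssh -o BatchMode=yes -o ConnectTimeout=5 "ubuntu@$1" true
}
```

Something like `retry 30 10 vm_ready "$vm_ip"` would then wait up to roughly five minutes before giving up. The same wrapper could go around the install step itself, so that a dropped connection (like the "Connection closed by ... port 22" above) triggers another attempt instead of silently ending the run.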