
Vagrant destroy hangs #11608

Closed johnsonw closed 4 years ago

johnsonw commented 4 years ago

Vagrant version

Vagrant 2.2.8

Host operating system

CentOS Linux release 7.7.1908 (Core) 3.10.0-1062.9.1.el7.x86_64

Guest operating system

CentOS Linux release 7.7.1908 (Core) 3.10.0-1062.9.1.el7.x86_64

Vagrantfile

# -*- mode: ruby -*-
# vi: set ft=ruby :

# Mem in MiB allocated to each server VM
NODE_MEM = (ENV['NODE_MEM'] || 6144).freeze
# Num CPUs allocated to each server VM
NODE_CPU = (ENV['NODE_CPU'] || 8).freeze

# Mem in MiB allocated to the iSCSI server VM
ISCSI_MEM = (ENV['ISCSI_MEM'] || 1024).freeze
# Num CPUs allocated to the iSCSI server VM
ISCSI_CPU = (ENV['ISCSI_CPU'] || 4).freeze

# User is required (default to root)
# need either password or sshkey (or will assume sshkey)
VBOX_USER   = (ENV['VBOX_USER']   || "root").freeze
VBOX_PASSWD = (ENV['VBOX_PASSWD'] || "").freeze
VBOX_SSHKEY = (ENV['VBOX_SSHKEY'] || "").freeze

# Lustre version
LUSTRE = (ENV['LUSTRE'] || "2.12.4").freeze

REPO_URI = (ENV['REPO_URI'] || '').freeze

require 'open3'
require 'fileutils'

# Create a set of /24 networks under a single /16 subnet range
SUBNET_PREFIX = '10.73'.freeze

# Management network for admin comms
MGMT_NET_PFX = "#{SUBNET_PREFIX}.10".freeze

# Lustre / HPC network
LNET_PFX = "#{SUBNET_PREFIX}.20".freeze

ISCI_IP = "#{SUBNET_PREFIX}.40.10".freeze

ISCI_IP2 = "#{SUBNET_PREFIX}.50.10".freeze

Vagrant.configure('2') do |config|
  config.vm.box = 'centos/7.7'
  config.vm.box_url = 'http://cloud.centos.org/centos/7/vagrant/x86_64/images/CentOS-7-x86_64-Vagrant-2001_01.VirtualBox.box'
  config.vm.box_download_checksum = 'e1a26038fb036ab8e76a6a4dfcd49856'
  config.vm.box_download_checksum_type = 'md5'

  config.vm.provider 'virtualbox' do |vbx|
    vbx.linked_clone = true
    vbx.memory = NODE_MEM
    vbx.cpus = NODE_CPU
    vbx.customize ['modifyvm', :id, '--audio', 'none']
  end

  # Create a basic hosts file for the VMs.
  open('hosts', 'w') do |f|
    f.puts <<-__EOF
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

#{MGMT_NET_PFX}.9 b.local b
#{MGMT_NET_PFX}.10 adm.local adm
#{MGMT_NET_PFX}.11 mds1.local mds1
#{MGMT_NET_PFX}.12 mds2.local mds2
#{MGMT_NET_PFX}.21 oss1.local oss1
#{MGMT_NET_PFX}.22 oss2.local oss2
    __EOF
    (1..8).each do |cidx|
      f.puts "#{MGMT_NET_PFX}.3#{cidx} c#{cidx}.local c#{cidx}\n"
    end
  end

  provision_yum_updates config

  use_vault_7_6_1810 config

  config.vm.provision 'shell', inline: 'cp -f /vagrant/hosts /etc/hosts'

  config.vm.provision 'shell', path: './scripts/disable_selinux.sh'

  system("ssh-keygen -t rsa -m PEM -N '' -f id_rsa") unless File.exist?('id_rsa')

  config.vm.provision 'ssh', type: 'shell', path: './scripts/key_config.sh'

  config.vm.provision 'deps', type: 'shell', inline: <<-SHELL
    yum install -y epel-release
    yum install -y jq htop vim
  SHELL

  config.vm.define 'iscsi' do |iscsi|
    iscsi.vm.hostname = 'iscsi.local'

    iscsi.vm.provider 'virtualbox' do |vbx|
      vbx.memory = ISCSI_MEM
      vbx.cpus = ISCSI_CPU
    end

    iscsi.vm.provision "file",
                       source: "./99-external-storage.rules",
                       destination: "/tmp/99-external-storage.rules"

    iscsi.vm.provision 'udev-trigger', type: 'shell', inline: <<-SHELL
      mv /tmp/99-external-storage.rules /etc/udev/rules.d/ 
      udevadm trigger --subsystem-match=block
    SHELL

    provision_iscsi_net iscsi, '10'

    iscsi.vm.provider 'virtualbox' do |vbx|
      name = get_vm_name('iscsi')
      create_iscsi_disks(vbx, name)
    end

    iscsi.vm.provision 'bootstrap',
                       type: 'shell',
                       path: './scripts/bootstrap_iscsi.sh',
                       args: [ISCI_IP, ISCI_IP2]
  end

  #
  # Create an admin server for the cluster
  #
  config.vm.define 'adm', primary: true do |adm|
    adm.vm.hostname = 'adm.local'

    adm.vm.network 'forwarded_port', guest: 443, host: 8443

    # Admin / management network
    provision_mgmt_net adm, '10'

    provision_fence_agents adm

    create_iml_diagnostics adm

      adm.vm.synced_folder '../',
                           '/integrated-manager-for-lustre/',
                           type: 'rsync',
                           rsync__exclude: [
                              '_topdir',
                              '.cargo/',
                              '.git/',
                              'iml-gui/crate/.cargo/',
                              'iml-gui/crate/target/',
                              'iml-gui/node_modules/',
                              'target/',
                              'vagrant/'
                           ]

    # Install IML onto the admin node
    # Using a given repouri
    adm.vm.provision 'install-iml-repouri',
                     type: 'shell',
                     run: 'never',
                     path: 'scripts/install_iml_repouri.sh',
                     env: {"REPO_URI" => REPO_URI}

    # Install IML onto the admin node
    # Using the mfl devel repo
    adm.vm.provision 'install-iml-devel',
                     type: 'shell',
                     run: 'never',
                     path: 'scripts/install_iml.sh',
                     args: 'https://github.com/whamcloud/integrated-manager-for-lustre/releases/download/6.next/chroma_support.repo'

    # Install IML 5.0 onto the admin node
    # Using the mfl 5.0 repo
    adm.vm.provision 'install-iml-5',
                     type: 'shell',
                     run: 'never',
                     path: 'scripts/install_iml.sh',
                     args: 'https://raw.githubusercontent.com/whamcloud/integrated-manager-for-lustre/v5.0.0/chroma_support.repo'

    # Install IML 5.1 onto the admin node
    # Using the mfl 5.1 repo
    adm.vm.provision 'install-iml-5.1',
                     type: 'shell',
                     run: 'never',
                     path: 'scripts/install_iml.sh',
                     args: 'https://github.com/whamcloud/integrated-manager-for-lustre/releases/download/v5.1.0/chroma_support.repo'

    # Install IML 6.0 onto the admin node
    # Using the mfl 6.0 repo
    adm.vm.provision 'install-iml-6.0',
                     type: 'shell',
                     run: 'never',
                     path: 'scripts/install_iml.sh',
                     args: 'https://github.com/whamcloud/integrated-manager-for-lustre/releases/download/v6.0.0/chroma_support.repo'

    # Install IML 4.0.10.x onto the admin node
    # Using the mfl 4.0.10 copr repo
    adm.vm.provision 'install-iml-4.0.10',
                     type: 'shell',
                     run: 'never',
                     path: 'scripts/install_iml_tar.sh',
                     args: '4.0.10.2'

    # Install IML onto the admin node
    # This requires you have the IML source tree available at
    # /integrated-manager-for-lustre
    adm.vm.provision 'install-iml-local',
                     type: 'shell',
                     run: 'never',
                     path: 'scripts/install_iml_local.sh'

    adm.vm.provision 'deploy-managed-hosts',
                     type: 'shell',
                     run: 'never',
                     path: 'scripts/deploy_hosts.sh',
                     args: 'base_managed_patchless'

    adm.vm.provision 'load-diagnostics-db',
                     type: 'shell',
                     run: 'never',
                     path: 'scripts/load-diagnostics-db.sh'
  end

  #
  # Create the metadata servers (HA pair)
  #
  (1..2).each do |i|
    config.vm.define "mds#{i}" do |mds|
      mds.vm.hostname = "mds#{i}.local"

      mds.vm.provider 'virtualbox' do |vbx|
        vbx.name = "mds#{i}"
      end

      create_iml_diagnostics mds

      provision_lnet_net mds, "1#{i}"

      # Admin / management network
      provision_mgmt_net mds, "1#{i}"

      provision_iscsi_net mds, "1#{i}"

      # Private network to simulate crossover.
      # Used exclusively as additional cluster network
      mds.vm.network 'private_network',
                     ip: "#{SUBNET_PREFIX}.230.1#{i}",
                     netmask: '255.255.255.0',
                     auto_config: false,
                     virtualbox__intnet: 'crossover-net-mds'

      provision_iscsi_client mds, 'mds', i

      provision_mpath mds

      provision_fence_agents mds

      cleanup_storage_server mds

      mds.vm.provision 'install-iml-local',
                       type: 'shell',
                       run: 'never',
                       path: './scripts/install_iml_local_agent.sh'

      install_lustre_zfs mds

      install_lustre_ldiskfs mds

      install_zfs_no_iml mds

      install_ldiskfs_no_iml mds

      configure_lustre_network mds

      configure_docker_network mds

      configure_ntp mds

      configure_ntp_docker mds

      wait_for_ntp mds

      wait_for_ntp_docker mds

      if i == 1
        mds.vm.provision 'create-pools',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                           genhostid
                           zpool create mgt -o multihost=on /dev/mapper/mpatha
                           zpool create mdt0 -o multihost=on /dev/mapper/mpathb
                         SHELL

        mds.vm.provision 'import-pools',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                           zpool import mgt
                           zpool import mdt0
                         SHELL

        mds.vm.provision 'zfs-params',
                         type: 'shell',
                         run: 'never',
                         path: './scripts/zfs_params.sh'

        mds.vm.provision 'create-zfs-fs',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                           mkfs.lustre --servicenode 10.73.20.11@tcp:10.73.20.12@tcp --mgs --backfstype=zfs mgt/mgt
                           mkdir -p /lustre/zfsmo/mgs
                           mount -t lustre mgt/mgt /lustre/zfsmo/mgs

                           mkfs.lustre --reformat --failover 10.73.20.12@tcp --mdt --backfstype=zfs --fsname=zfsmo --index=0 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp mdt0/mdt0
                           mkdir -p /lustre/zfsmo/mdt0
                           mount -t lustre mdt0/mdt0 /lustre/zfsmo/mdt0
                         SHELL

        mds.vm.provision 'create-ldiskfs-fs',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                              mkfs.lustre --mgs --reformat --servicenode=10.73.20.11@tcp --servicenode=10.73.20.12@tcp /dev/mapper/mpatha
                              mkfs.lustre --mdt --reformat --servicenode=10.73.20.11@tcp --servicenode=10.73.20.12@tcp --index=0 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp --fsname=fs /dev/mapper/mpathb
                         SHELL

        mds.vm.provision 'mount-ldiskfs-fs',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                           mkdir -p /mnt/mgs
                           mkdir -p /mnt/mdt0
                           mount -t lustre /dev/mapper/mpatha /mnt/mgs
                           mount -t lustre /dev/mapper/mpathb /mnt/mdt0
                         SHELL

        mds.vm.provision 'create-ldiskfs-lvm-fs',
                         type: 'shell',
                         run: 'never',
                         path: 'scripts/create_ldiskfs_lvm_fs.sh',
                         args: ['fs', '/dev/mapper/mpathb', 0, '/dev/mapper/mpatha']

        mds.vm.provision 'mount-ldiskfs-lvm-fs',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                           mkdir -p /mnt/mgs
                           mkdir -p /mnt/mdt0
                           mount -t lustre /dev/mapper/mgt_vg-mgt /mnt/mgs
                           mount -t lustre /dev/mapper/mdt0_vg-mdt /mnt/mdt0
                         SHELL

         mds.vm.provision 'ha-ldiskfs-lvm-fs-setup',
                         type: 'shell',
                         run: 'never',
                         path: 'scripts/create_ldiskfs_lvm_mds_ha_setup.sh',
                         args: [ VBOX_USER, VBOX_PASSWD, VBOX_SSHKEY ]
      else
        mds.vm.provision 'create-pools',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                           genhostid
                           zpool create mdt1 -o multihost=on /dev/mapper/mpathc
                         SHELL

        mds.vm.provision 'import-pools',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                           zpool import mdt1
                         SHELL

        mds.vm.provision 'zfs-params',
                         type: 'shell',
                         run: 'never',
                         path: './scripts/zfs_params.sh'

        mds.vm.provision 'create-zfs-fs',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                           mkfs.lustre --failover 10.73.20.11@tcp --mdt --backfstype=zfs --fsname=zfsmo --index=1 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp mdt1/mdt1
                           mkdir -p /lustre/zfsmo/mdt1
                           mount -t lustre mdt1/mdt1 /lustre/zfsmo/mdt1
                         SHELL

        mds.vm.provision 'create-ldiskfs-fs',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                            mkfs.lustre --mdt --reformat --servicenode=10.73.20.11@tcp --servicenode=10.73.20.12@tcp --index=1 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp --fsname=fs /dev/mapper/mpathc
                         SHELL

        mds.vm.provision 'create-ldiskfs-fs2',
                            type: 'shell',
                            run: 'never',
                            inline: <<-SHELL
                              mkfs.lustre --mdt --reformat --servicenode=10.73.20.11@tcp --servicenode=10.73.20.12@tcp --index=0 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp --fsname=fs2 /dev/mapper/mpathd
                            SHELL

        mds.vm.provision 'mount-ldiskfs-fs',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                           mkdir -p /mnt/mdt1
                           mount -t lustre /dev/mapper/mpathc /mnt/mdt1
                         SHELL

        mds.vm.provision 'mount-ldiskfs-fs2',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                           mkdir -p /mnt/mdt2
                           mount -t lustre /dev/mapper/mpathd /mnt/mdt2
                         SHELL

        mds.vm.provision 'create-ldiskfs-lvm-fs',
                         type: 'shell',
                         run: 'never',
                         path: 'scripts/create_ldiskfs_lvm_fs.sh',
                         args: ['fs', '/dev/mapper/mpathc', 1, '']

        mds.vm.provision 'mount-ldiskfs-lvm-fs',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                           mkdir -p /mnt/mdt1
                           mount -t lustre /dev/mapper/mdt1_vg-mdt /mnt/mdt1
                         SHELL

      end

      mds.vm.provision 'ha-ldiskfs-lvm-fs-prep',
                       type: 'shell',
                       run: 'never',
                       inline: <<-SHELL
                         yum -y  --nogpgcheck install pcs lustre-resource-agents
                         echo -n lustre | passwd --stdin hacluster
                         systemctl enable --now pcsd
                         mkdir -p /mnt/mgs
                         mkdir -p /mnt/mdt{0,1}
                       SHELL

    end
  end

  #
  # Create the object storage servers (OSS)
  # Servers are configured in HA pairs
  #
  (1..2).each do |i|
    config.vm.define "oss#{i}",
                     autostart: i <= 2 do |oss|

      oss.vm.hostname = "oss#{i}.local"

      oss.vm.provider 'virtualbox' do |vbx|
        vbx.name = "oss#{i}"
      end

      create_iml_diagnostics oss

      # Lustre / application network
      provision_lnet_net oss, "2#{i}"

      # Admin / management network
      provision_mgmt_net oss, "2#{i}"

      provision_iscsi_net oss, "2#{i}"

      # Private network to simulate crossover.
      # Used exclusively as additional cluster network
      oss.vm.network 'private_network',
                     ip: "#{SUBNET_PREFIX}.231.2#{i}",
                     netmask: '255.255.255.0',
                     auto_config: false,
                     virtualbox__intnet: 'crossover-net-oss'

      provision_iscsi_client oss, 'oss', i

      provision_mpath oss

      provision_fence_agents oss

      cleanup_storage_server oss

      oss.vm.provision 'install-iml-local',
            type: 'shell',
            run: 'never',
            path: './scripts/install_iml_local_agent.sh'

      install_lustre_zfs oss

      install_lustre_ldiskfs oss

      install_ldiskfs_no_iml oss

      install_zfs_no_iml oss

      configure_lustre_network oss

      configure_docker_network oss

      configure_ntp oss

      configure_ntp_docker oss

      wait_for_ntp oss

      wait_for_ntp_docker oss

      if i == 1
        oss.vm.provision 'create-pools',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                           genhostid
                           zpool create ost0 -o multihost=on /dev/mapper/mpatha
                           zpool create ost1 -o multihost=on /dev/mapper/mpathb
                           zpool create ost2 -o multihost=on /dev/mapper/mpathc
                           zpool create ost3 -o multihost=on /dev/mapper/mpathd
                           zpool create ost4 -o multihost=on /dev/mapper/mpathe
                           zpool create ost5 -o multihost=on /dev/mapper/mpathf
                           zpool create ost6 -o multihost=on /dev/mapper/mpathg
                           zpool create ost7 -o multihost=on /dev/mapper/mpathh
                           zpool create ost8 -o multihost=on /dev/mapper/mpathi
                           zpool create ost9 -o multihost=on /dev/mapper/mpathj
                         SHELL

        oss.vm.provision 'import-pools',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                           zpool import ost0
                           zpool import ost1
                           zpool import ost2
                           zpool import ost3
                           zpool import ost4
                           zpool import ost5
                           zpool import ost6
                           zpool import ost7
                           zpool import ost8
                           zpool import ost9
                         SHELL

        oss.vm.provision 'zfs-params',
                         type: 'shell',
                         run: 'never',
                         path: './scripts/zfs_params.sh'

        oss.vm.provision 'create-zfs-fs',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                              mkfs.lustre --failover 10.73.20.22@tcp --ost --backfstype=zfs --fsname=zfsmo --index=0 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp ost0/ost0
                              mkfs.lustre --failover 10.73.20.22@tcp --ost --backfstype=zfs --fsname=zfsmo --index=1 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp ost1/ost1
                              mkfs.lustre --failover 10.73.20.22@tcp --ost --backfstype=zfs --fsname=zfsmo --index=2 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp ost2/ost2
                              mkfs.lustre --failover 10.73.20.22@tcp --ost --backfstype=zfs --fsname=zfsmo --index=3 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp ost3/ost3
                              mkfs.lustre --failover 10.73.20.22@tcp --ost --backfstype=zfs --fsname=zfsmo --index=4 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp ost4/ost4
                              mkfs.lustre --failover 10.73.20.22@tcp --ost --backfstype=zfs --fsname=zfsmo --index=5 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp ost5/ost5
                              mkfs.lustre --failover 10.73.20.22@tcp --ost --backfstype=zfs --fsname=zfsmo --index=6 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp ost6/ost6
                              mkfs.lustre --failover 10.73.20.22@tcp --ost --backfstype=zfs --fsname=zfsmo --index=7 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp ost7/ost7
                              mkfs.lustre --failover 10.73.20.22@tcp --ost --backfstype=zfs --fsname=zfsmo --index=8 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp ost8/ost8
                              mkfs.lustre --failover 10.73.20.22@tcp --ost --backfstype=zfs --fsname=zfsmo --index=9 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp ost9/ost9
                              mkdir -p /lustre/zfsmo/ost{0..9}
                              mount -t lustre ost0/ost0 /lustre/zfsmo/ost0
                              mount -t lustre ost1/ost1 /lustre/zfsmo/ost1
                              mount -t lustre ost2/ost2 /lustre/zfsmo/ost2
                              mount -t lustre ost3/ost3 /lustre/zfsmo/ost3
                              mount -t lustre ost4/ost4 /lustre/zfsmo/ost4
                              mount -t lustre ost5/ost5 /lustre/zfsmo/ost5
                              mount -t lustre ost6/ost6 /lustre/zfsmo/ost6
                              mount -t lustre ost7/ost7 /lustre/zfsmo/ost7
                              mount -t lustre ost8/ost8 /lustre/zfsmo/ost8
                              mount -t lustre ost9/ost9 /lustre/zfsmo/ost9
                         SHELL

        oss.vm.provision 'create-ldiskfs-fs',
                         type: 'shell',
                         run: 'never',
                         path: 'scripts/create_ldiskfs_fs_osts.sh',
                         args: ['a', 'e', 0, 'fs']

        oss.vm.provision 'mount-ldiskfs-fs',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                          mkdir -p /mnt/ost{0,1,2,3,4}
                          mount -t lustre /dev/mapper/mpatha /mnt/ost0
                          mount -t lustre /dev/mapper/mpathb /mnt/ost1
                          mount -t lustre /dev/mapper/mpathc /mnt/ost2
                          mount -t lustre /dev/mapper/mpathd /mnt/ost3
                          mount -t lustre /dev/mapper/mpathe /mnt/ost4
                         SHELL

        oss.vm.provision 'create-ldiskfs-fs2',
                         type: 'shell',
                         run: 'never',
                         path: 'scripts/create_ldiskfs_fs_osts.sh',
                         args: ['f', 'j', 0, 'fs2']

        oss.vm.provision 'mount-ldiskfs-fs2',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                          mkdir -p /mnt/ost2-{0,1,2,3,4}
                          mount -t lustre /dev/mapper/mpathf /mnt/ost2-0
                          mount -t lustre /dev/mapper/mpathg /mnt/ost2-1
                          mount -t lustre /dev/mapper/mpathh /mnt/ost2-2
                          mount -t lustre /dev/mapper/mpathi /mnt/ost2-3
                          mount -t lustre /dev/mapper/mpathj /mnt/ost2-4
                         SHELL

        oss.vm.provision 'create-ldiskfs-lvm-fs',
                         type: 'shell',
                         run: 'never',
                         path: 'scripts/create_ldiskfs_fs_osts.sh',
                         args: ['a', 'e', 0, 'fs']

        oss.vm.provision 'mount-ldiskfs-lvm-fs',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                          mkdir -p /mnt/ost{0,1,2,3,4}
                          mount -t lustre /dev/mapper/mpatha /mnt/ost0
                          mount -t lustre /dev/mapper/mpathb /mnt/ost1
                          mount -t lustre /dev/mapper/mpathc /mnt/ost2
                          mount -t lustre /dev/mapper/mpathd /mnt/ost3
                          mount -t lustre /dev/mapper/mpathe /mnt/ost4
                         SHELL

        oss.vm.provision 'ha-ldiskfs-lvm-fs-setup',
                         type: 'shell',
                         run: 'never',
                         path: 'scripts/create_ldiskfs_lvm_oss_ha_setup.sh',
                         args: [ "{a..e} {k..o}", 0, VBOX_USER, VBOX_PASSWD, VBOX_SSHKEY ]

      else
        oss.vm.provision 'create-pools',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                           genhostid
                           zpool create ost10 -o multihost=on /dev/mapper/mpathk
                           zpool create ost11 -o multihost=on /dev/mapper/mpathl
                           zpool create ost12 -o multihost=on /dev/mapper/mpathm
                           zpool create ost13 -o multihost=on /dev/mapper/mpathn
                           zpool create ost14 -o multihost=on /dev/mapper/mpatho
                           zpool create ost15 -o multihost=on /dev/mapper/mpathp
                           zpool create ost16 -o multihost=on /dev/mapper/mpathq
                           zpool create ost17 -o multihost=on /dev/mapper/mpathr
                           zpool create ost18 -o multihost=on /dev/mapper/mpaths
                           zpool create ost19 -o multihost=on /dev/mapper/mpatht
                         SHELL

        oss.vm.provision 'import-pools',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                           zpool import ost10
                           zpool import ost11
                           zpool import ost12
                           zpool import ost13
                           zpool import ost14
                           zpool import ost15
                           zpool import ost16
                           zpool import ost17
                           zpool import ost18
                           zpool import ost19
                         SHELL

        oss.vm.provision 'zfs-params',
                         type: 'shell',
                         run: 'never',
                         path: './scripts/zfs_params.sh'

        oss.vm.provision 'create-zfs-fs',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                              mkfs.lustre --failover 10.73.20.21@tcp --ost --backfstype=zfs --fsname=zfsmo --index=10 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp ost10/ost10
                              mkfs.lustre --failover 10.73.20.21@tcp --ost --backfstype=zfs --fsname=zfsmo --index=11 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp ost11/ost11
                              mkfs.lustre --failover 10.73.20.21@tcp --ost --backfstype=zfs --fsname=zfsmo --index=12 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp ost12/ost12
                              mkfs.lustre --failover 10.73.20.21@tcp --ost --backfstype=zfs --fsname=zfsmo --index=13 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp ost13/ost13
                              mkfs.lustre --failover 10.73.20.21@tcp --ost --backfstype=zfs --fsname=zfsmo --index=14 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp ost14/ost14
                              mkfs.lustre --failover 10.73.20.21@tcp --ost --backfstype=zfs --fsname=zfsmo --index=15 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp ost15/ost15
                              mkfs.lustre --failover 10.73.20.21@tcp --ost --backfstype=zfs --fsname=zfsmo --index=16 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp ost16/ost16
                              mkfs.lustre --failover 10.73.20.21@tcp --ost --backfstype=zfs --fsname=zfsmo --index=17 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp ost17/ost17
                              mkfs.lustre --failover 10.73.20.21@tcp --ost --backfstype=zfs --fsname=zfsmo --index=18 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp ost18/ost18
                              mkfs.lustre --failover 10.73.20.21@tcp --ost --backfstype=zfs --fsname=zfsmo --index=19 --mgsnode=10.73.20.11@tcp:10.73.20.12@tcp ost19/ost19
                              mkdir -p /lustre/zfsmo/ost{10..19}
                              mount -t lustre ost10/ost10 /lustre/zfsmo/ost10
                              mount -t lustre ost11/ost11 /lustre/zfsmo/ost11
                              mount -t lustre ost12/ost12 /lustre/zfsmo/ost12
                              mount -t lustre ost13/ost13 /lustre/zfsmo/ost13
                              mount -t lustre ost14/ost14 /lustre/zfsmo/ost14
                              mount -t lustre ost15/ost15 /lustre/zfsmo/ost15
                              mount -t lustre ost16/ost16 /lustre/zfsmo/ost16
                              mount -t lustre ost17/ost17 /lustre/zfsmo/ost17
                              mount -t lustre ost18/ost18 /lustre/zfsmo/ost18
                              mount -t lustre ost19/ost19 /lustre/zfsmo/ost19
                         SHELL

        oss.vm.provision 'create-ldiskfs-fs',
                         type: 'shell',
                         run: 'never',
                         path: 'scripts/create_ldiskfs_fs_osts.sh',
                         args: ['k', 'o', 5, 'fs']

        oss.vm.provision 'mount-ldiskfs-fs',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                             mkdir -p /mnt/ost{5,6,7,8,9}
                             mount -t lustre /dev/mapper/mpathk /mnt/ost5
                             mount -t lustre /dev/mapper/mpathl /mnt/ost6
                             mount -t lustre /dev/mapper/mpathm /mnt/ost7
                             mount -t lustre /dev/mapper/mpathn /mnt/ost8
                             mount -t lustre /dev/mapper/mpatho /mnt/ost9
                         SHELL

        oss.vm.provision 'create-ldiskfs-fs2',
                         type: 'shell',
                         run: 'never',
                         path: 'scripts/create_ldiskfs_fs_osts.sh',
                         args: ['p', 't', 5, 'fs2']

        oss.vm.provision 'mount-ldiskfs-fs2',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                             mkdir -p /mnt/ost2-{5,6,7,8,9}
                             mount -t lustre /dev/mapper/mpathp /mnt/ost2-5
                             mount -t lustre /dev/mapper/mpathq /mnt/ost2-6
                             mount -t lustre /dev/mapper/mpathr /mnt/ost2-7
                             mount -t lustre /dev/mapper/mpaths /mnt/ost2-8
                             mount -t lustre /dev/mapper/mpatht /mnt/ost2-9
                         SHELL

        oss.vm.provision 'create-ldiskfs-lvm-fs',
                         type: 'shell',
                         run: 'never',
                         path: 'scripts/create_ldiskfs_fs_osts.sh',
                         args: ['k', 'o', 5, 'fs']

        oss.vm.provision 'mount-ldiskfs-lvm-fs',
                         type: 'shell',
                         run: 'never',
                         inline: <<-SHELL
                             mkdir -p /mnt/ost{5,6,7,8,9}
                             mount -t lustre /dev/mapper/mpathk /mnt/ost5
                             mount -t lustre /dev/mapper/mpathl /mnt/ost6
                             mount -t lustre /dev/mapper/mpathm /mnt/ost7
                             mount -t lustre /dev/mapper/mpathn /mnt/ost8
                             mount -t lustre /dev/mapper/mpatho /mnt/ost9
                         SHELL

      end

      oss.vm.provision 'ha-ldiskfs-lvm-fs-prep',
                       type: 'shell',
                       run: 'never',
                       inline: <<-SHELL
                         yum -y --nogpgcheck install pcs lustre-resource-agents
                         echo -n lustre | passwd --stdin hacluster
                         systemctl enable --now pcsd
                         mkdir -p /mnt/ost{0..9}
                       SHELL

    end
  end

  # Create a set of compute nodes.
  # By default, only 2 compute nodes are created.
  # The configuration supports a maximum of 8 compute nodes.
  (1..8).each do |i|
    config.vm.define "c#{i}",
                     autostart: i <= 2 do |c|
      c.vm.hostname = "c#{i}.local"

      # Admin / management network
      provision_mgmt_net c, "3#{i}"

      # Lustre / application network
      provision_lnet_net c, "3#{i}"

      configure_docker_network c

      c.vm.provision 'install-iml-local',
            type: 'shell',
            run: 'never',
            path: './scripts/install_iml_local_agent.sh'

      c.vm.provision 'install-lustre-client',
                     type: 'shell',
                     run: 'never',
                     inline: <<-SHELL
                            yum-config-manager --add-repo https://downloads.whamcloud.com/public/lustre/lustre-#{LUSTRE}/el7/client/
                            yum install -y --nogpgcheck lustre-client
                     SHELL
    end
  end
end

def provision_iscsi_net(config, num)
  config.vm.network 'private_network',
                    ip: "#{SUBNET_PREFIX}.40.#{num}",
                    netmask: '255.255.255.0',
                    virtualbox__intnet: 'iscsi-net'

  config.vm.network 'private_network',
                    ip: "#{SUBNET_PREFIX}.50.#{num}",
                    netmask: '255.255.255.0',
                    virtualbox__intnet: 'iscsi-net'
end

def provision_lnet_net(config, num)
  config.vm.network 'private_network',
                    ip: "#{LNET_PFX}.#{num}",
                    netmask: '255.255.255.0',
                    virtualbox__intnet: 'lnet-net'
end

module OS
  def OS.windows?
    (/cygwin|mswin|mingw|bccwin|wince|emx/ =~ RbConfig::CONFIG["host_os"]) != nil
  end

  def OS.mac?
    (/darwin/ =~ RbConfig::CONFIG["host_os"]) != nil
  end

  def OS.unix?
    !OS.windows?
  end

  def OS.linux?
    OS.unix? and not OS.mac?
  end
end

def provision_mgmt_net(config, num)
  interface_name = if OS.windows? then 'VirtualBox Host-Only Ethernet Adapter' else 'vboxnet0' end

  config.vm.network 'private_network',
                    ip: "#{MGMT_NET_PFX}.#{num}",
                    netmask: '255.255.255.0',
                    name: interface_name
end

def provision_mpath(config)
  config.vm.provision 'mpath', type: 'shell', inline: <<-SHELL
    yum -y install device-mapper-multipath
    cp /usr/share/doc/device-mapper-multipath-*/multipath.conf /etc/multipath.conf
    systemctl start multipathd.service
    systemctl enable multipathd.service
  SHELL
end

def provision_fence_agents(config)
  config.vm.provision 'fence-agents', type: 'shell', inline: <<-SHELL
    yum install -y epel-release
    yum install -y yum-plugin-copr
    yum -y copr enable managerforlustre/manager-for-lustre-devel
    yum install -y fence-agents-vbox
    yum -y copr disable managerforlustre/manager-for-lustre-devel
  SHELL
end

def cleanup_storage_server(config)
  config.vm.provision 'cleanup', type: 'shell', run: 'never', inline: <<-SHELL
    yum autoremove -y chroma-agent
    rm -rf /etc/iml
    rm -rf /var/lib/{chroma,iml}
    rm -rf /etc/yum.repos.d/Intel-Lustre-Agent.repo
  SHELL
end

def provision_iscsi_client(config, name, idx)
  config.vm.provision 'iscsi-client', type: 'shell', inline: <<-SHELL
    yum -y install iscsi-initiator-utils lsscsi
    echo "InitiatorName=iqn.2015-01.com.whamcloud:#{name}#{idx}" > /etc/iscsi/initiatorname.iscsi
    iscsiadm --mode discoverydb --type sendtargets --portal #{ISCI_IP}:3260 --discover
    iscsiadm --mode node --targetname iqn.2015-01.com.whamcloud.lu:#{name} --portal #{ISCI_IP}:3260 -o update -n node.startup -v automatic
    iscsiadm --mode node --targetname iqn.2015-01.com.whamcloud.lu:#{name} --portal #{ISCI_IP}:3260 -o update -n node.conn[0].startup -v automatic
    iscsiadm --mode node --targetname iqn.2015-01.com.whamcloud.lu:#{name} --portal #{ISCI_IP2}:3260 -o update -n node.startup -v automatic
    iscsiadm --mode node --targetname iqn.2015-01.com.whamcloud.lu:#{name} --portal #{ISCI_IP2}:3260 -o update -n node.conn[0].startup -v automatic
    systemctl start iscsi
  SHELL
end

def configure_lustre_network(config)
  config.vm.provision 'configure-lustre-network',
                      type: 'shell',
                      run: 'never',
                      path: './scripts/configure_lustre_network.sh'
end

def install_lustre_zfs(config)
  config.vm.provision 'install-lustre-zfs', type: 'shell', run: 'never', inline: <<-SHELL
    yum clean all
    yum install -y --nogpgcheck lustre-zfs
    genhostid
  SHELL
end

def install_lustre_ldiskfs(config)
  config.vm.provision 'install-lustre-ldiskfs',
                      type: 'shell',
                      run: 'never',
                      inline: 'yum install -y lustre-ldiskfs'
end

def install_ldiskfs_no_iml(config)
  config.vm.provision 'install-ldiskfs-no-iml',
                      type: 'shell',
                      run: 'never',
                      reboot: true,
                      path: './scripts/install_ldiskfs_no_iml.sh',
                      args: "#{LUSTRE}"
end

def install_zfs_no_iml(config)
  config.vm.provision 'install-zfs-no-iml',
                      type: 'shell',
                      run: 'never',
                      reboot: true,
                      path: './scripts/install_zfs_no_iml.sh',
                      args: "#{LUSTRE}"
end

def use_vault_7_6_1810(config)
  config.vm.provision 'use-vault-7-6-1810',
                      type: 'shell',
                      run: 'never',
                      path: './scripts/use_vault.sh',
                      args: '7.6.1810'
end

def provision_yum_updates(config)
  config.vm.provision 'yum-update',
                     type: 'shell',
                     run: 'never',
                     inline: 'yum clean metadata; yum update -y'
end

def get_machine_folder()
  out, err = Open3.capture2e('VBoxManage list systemproperties')
  raise out unless err.exitstatus.zero?

  out.split(/\n/)
      .select { |x| x.start_with? 'Default machine folder:' }
      .map { |x| x.split('Default machine folder:')[1].strip }
      .first
end

def get_vm_name(id)
  out, err = Open3.capture2e('VBoxManage list vms')
  raise out unless err.exitstatus.zero?

  path = File.dirname(__FILE__).split('/').last
  name = out.split(/\n/)
            .select { |x| x.start_with? "\"#{path}_#{id}" }
            .map { |x| x.tr('"', '') }
            .map { |x| x.split(' ')[0].strip }
            .first

  name
end

# Checks if a storage controller with the given name exists.
# This is used as a predicate to create controllers,
# as vagrant does not provide this
# functionality by default.
def controller_exists(name, controller_name)
  return false if name.nil?

  out, err = Open3.capture2e("VBoxManage showvminfo #{name}")
  raise out unless err.exitstatus.zero?

  out.split(/\n/)
     .select { |x| x.start_with? 'Storage Controller Name' }
     .map { |x| x.split(':')[1].strip }
     .any? { |x| x == controller_name }
end

# Creates a SATA Controller and attaches 24 disks to it (MGT, 3 MDTs, 20 OSTs)
def create_iscsi_disks(vbox, name)
  unless controller_exists(name, 'SATA Controller')
    vbox.customize ['storagectl', :id,
                    '--name', 'SATA Controller',
                    '--add', 'sata']
  end

  dir = "#{get_machine_folder()}/vdisks"
  FileUtils.mkdir_p dir unless File.directory?(dir)

  osts = (1..20).map { |x| ["OST#{x}_", '5120'] }

  [
    %w[mgt_ 512],
    %w[mdt1_ 5120],
    %w[mdt2_ 5120],
    %w[mdt3_ 5120],
  ].concat(osts).each_with_index do |(name, size), i|
    file_to_disk = "#{dir}/#{name}.vdi"
    port = (i + 1).to_s

    unless File.exist?(file_to_disk)
      vbox.customize ['createmedium',
                      'disk',
                      '--filename',
                      file_to_disk,
                      '--size',
                      size,
                      '--format',
                      'VDI',
                      '--variant',
                      'standard']
    end

    vbox.customize ['storageattach', :id,
                    '--storagectl', 'SATA Controller',
                    '--port', port,
                    '--type', 'hdd',
                    '--medium', file_to_disk,
                    '--device', '0']

    vbox.customize ['setextradata', :id,
                    "VBoxInternal/Devices/ahci/0/Config/Port#{port}/SerialNumber",
                    name.ljust(20, '0')]
  end
end

def configure_docker_network(config)
  config.trigger.before :provision, name: 'configure-docker-network-trigger' do |t|
    t.ruby do |_, machine|
      if ARGV[3] == 'configure-docker-network'
        puts 'Copying identify file to job scheduler container.'
        puts `docker ps --format '{{.Names}}' | grep job-scheduler | xargs -I {} docker exec {} sh -c 'mkdir -p /root/.ssh'`
        puts `docker ps --format '{{.Names}}' | grep job-scheduler | xargs -I {} docker cp id_rsa {}:/root/.ssh`
        puts "Writing authorized keys to #{machine.name}"
        puts `cat ~/.ssh/id_rsa.pub | ssh -i ./.vagrant/machines/#{machine.name}/virtualbox/private_key vagrant@#{machine.name} \
             "cat > /tmp/id_rsa.pub && sudo su - -c 'mkdir -p /root/.ssh && touch /root/.ssh/authorized_keys && cat /tmp/id_rsa.pub \
             >> /root/.ssh/authorized_keys && rm -f /tmp/id_rsa.pub'"`
      end
    end
  end

  config.vm.provision 'configure-docker-network', type: 'shell', run: 'never', inline: <<-SHELL
    echo "10.73.10.1 nginx" >> /etc/hosts
  SHELL
end

def configure_ntp(config)
  config.vm.provision 'configure-ntp',
                         type: 'shell',
                         run: 'never',
                         path: 'scripts/configure_ntp.sh',
                         args: ["adm.local"]
end

def configure_ntp_docker(config)
  config.vm.provision 'configure-ntp-docker',
                         type: 'shell',
                         run: 'never',
                         path: 'scripts/configure_ntp.sh',
                         args: ["10.73.10.1"]
end

def wait_for_ntp(config)
  config.vm.provision 'wait-for-ntp',
                          type: 'shell',
                          run: 'never',
                          path: 'scripts/wait_for_ntp.sh',
                          args: ["adm.local"]
end

def wait_for_ntp_docker(config)
  config.vm.provision 'wait-for-ntp-docker',
                          type: 'shell',
                          run: 'never',
                          path: 'scripts/wait_for_ntp.sh',
                          args: ["10.73.10.1"]
end

def create_iml_diagnostics(config)
  config.vm.provision 'create-iml-diagnostics',
                          type: 'shell',
                          run: 'never',
                          path: 'scripts/create_iml_diagnostics.sh',
                          args: ["10.73.10.1"]
end

Debug output

I'm attaching the immediately relevant information regarding the error and will follow up with a link to a full VBox log file.

 AIOMgr: Flush failed with VERR_INVALID_PARAMETER, disabling async flushes
...
00:15:09.141920 Console: Machine state changed to 'Stopping'
00:15:09.142240 Console::powerDown(): A request to power off the VM has been issued (mMachineState=Stopping, InUninit=0)
00:15:09.142526 Changing the VM state from 'RUNNING' to 'POWERING_OFF'
...
00:15:09.146297 PDMR3PowerOff: after     3 ms, 1 loops: 1 async tasks - ahci/0
63:55:21.294952 ERROR [COM]: aRC=E_ACCESSDENIED (0x80070005) aIID={872da645-4a9b-1727-bee2-5585105b9eed} aComponent={ConsoleWrap} aText={The virtual machine is being powered down}, preserve=false aResultDetail=0

Expected behavior

Running vagrant destroy -f iscsi should clean up all VM resources (in my case, SATA controllers and HDDs) and unregister the iscsi VM.

Actual behavior

The state changes from RUNNING to POWERING OFF, but the VM sits in that state indefinitely. The SATA storage controllers remain attached and the hard disks are not removed.

Steps to reproduce

This is an intermittent problem. To be clear, I have not yet been able to narrow the issue down to Vagrant or VirtualBox, but the end result is that, in some cases, destroying the iscsi VM hangs at powering off. It appears that the media have not been cleaned up and the VM hangs indefinitely. To reproduce, we do the following:

  1. vagrant up iscsi adm mds1 mds2 oss1 oss2
  2. use the vms for testing
  3. vagrant destroy -f <--- iscsi may change from the RUNNING state to the POWERING OFF state and hang at this step.

Between each test run we take the following additional steps after destroying all Vagrant nodes (a rough set of equivalent shell commands follows the list):

  1. vagrant global-status --prune
  2. Use vboxmanage to see if there are any running VMs. If so, use vboxmanage to power them off.
  3. If there were any running VMs in step 2, use vboxmanage to unregister them.
  4. Remove anything left over in the VirtualBox machine folder.
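
A rough sketch of those between-run cleanup steps as shell commands (the VM names and machine-folder path are placeholders, not taken from this report):

# 1. Prune stale entries from Vagrant's machine index
vagrant global-status --prune

# 2. List anything VirtualBox still considers running, then force it off
vboxmanage list runningvms
vboxmanage controlvm <vm-name-or-uuid> poweroff

# 3. Unregister whatever was still running (optionally deleting its files)
vboxmanage unregistervm <vm-name-or-uuid> --delete

# 4. Clear out anything left over in the default machine folder
rm -rf "<default-machine-folder>/<leftover-vm-directory>"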

I suspect that when the destroy command is issued to the iscsi VM, it is holding onto a resource (we are using this VM as an iSCSI server and there are many HDDs attached to a SATA storage controller on it). I just pushed a patch that will run all vagrant commands with VAGRANT_LOG=debug, so I will try to get more information soon. Do you recommend adding any other debugging that would help identify the cause?
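
For reference, capturing that debug output looks something like the following (Vagrant writes its debug log to stderr; the log file name is just a placeholder):

VAGRANT_LOG=debug vagrant destroy -f iscsi 2> vagrant-destroy-debug.log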


jbonhag commented 4 years ago

Hi @johnsonw,

Thanks for opening an issue with Vagrant! I tried bringing up the iscsi machine from your Vagrantfile but unfortunately was not able to replicate the issue when destroying the VM. It seems like something is preventing the shutdown from completing before the machine can be deregistered.

Would you be willing to try powering off the VM manually, either by issuing a shutdown command from inside the guest or through the VirtualBox GUI?
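
For example, something along these lines (a sketch; either from inside the guest over SSH, or from the host via an ACPI power-button event, since the host is headless):

# From inside the guest, over the existing Vagrant SSH connection
vagrant ssh iscsi -c 'sudo shutdown -h now'

# Or from the host, via VirtualBox directly
vboxmanage controlvm <iscsi-vm-name-or-uuid> acpipowerbutton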

Also, if you can provide us with that debug log, that would be extremely helpful in investigating the issue further. If you can create a minimal Vagrantfile that demonstrates the issue, that would be even better.

Thanks!

johnsonw commented 4 years ago

Hi @jbonhag,

Thank you for taking the time to look into this.

It seems like something is preventing the shutdown from completing before the machine can be deregistered.

Yes, it seems to be getting hung up on a resource for some reason. The iscsi VM sets up multiple SATA controllers along with 20+ virtual drives (most around 5 GB). In most cases, destroying the VM works just fine; in fact, I ran 15 tests today and none of them encountered this issue. To give a little background, we are using Vagrant to provision multiple VMs to run integration tests for the Lustre filesystem.

Would you be willing to try powering off the VM manually, either by issuing a shutdown command from inside the guest or through the VirtualBox GUI?

Absolutely. We are running Vagrant and VirtualBox on CentOS 7.7 without a GUI, so all non-Vagrant commands are issued via vboxmanage. It seems that I need to do the following (a rough sketch of the equivalent vboxmanage commands follows the list):

  1. Detach all SATA storage controllers manually
  2. Remove all media manually
  3. Power off the iscsi VM
  4. Unregister the iscsi VM
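
A minimal sketch of those steps, assuming the controller name "SATA Controller" from the Vagrantfile and placeholder VM/disk identifiers (depending on VM state, the storage changes in steps 1-2 may only apply once the VM is off):

# 1. Detach the SATA storage controller (this also detaches its disks)
vboxmanage storagectl <iscsi-vm> --name "SATA Controller" --remove

# 2. Remove each backing medium from the registry (optionally deleting the file)
vboxmanage closemedium disk <path-to-vdi> --delete

# 3. Power off the iscsi VM
vboxmanage controlvm <iscsi-vm> poweroff

# 4. Unregister the iscsi VM
vboxmanage unregistervm <iscsi-vm>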

Also, if you can provide us with that debug log, that would be extremely helpful in investigating the issue further. If you can create a minimal Vagrantfile that demonstrates the issue, that would be even better.

I will definitely be able to provide you with a log file as soon as I'm able to re-create the issue. Regarding a minimal Vagrantfile, I will see what I can do; I understand that will make it much easier to diagnose. Thanks again for taking a look. I'll post more information as soon as I get it.

Regards,

Will

johnsonw commented 4 years ago

Hello @jbonhag,

I just wanted to follow up with some more information as I encountered the issue again this morning.

Current state:

Current machine states:

iscsi                     stopping (virtualbox)
adm                       poweroff (virtualbox)
mds1                      running (virtualbox)
mds2                      running (virtualbox)
oss1                      running (virtualbox)
oss2                      running (virtualbox)
c1                        not created (virtualbox)
c2                        not created (virtualbox)
c3                        not created (virtualbox)
c4                        not created (virtualbox)
c5                        not created (virtualbox)
c6                        not created (virtualbox)
c7                        not created (virtualbox)
c8                        not created (virtualbox)

All registered VMs:

 vboxmanage list vms
"centos-7-1-1.x86_64_1589450766625_52868" {d6370541-2eb4-4197-8cb8-d8d241478def}
"vagrant_adm_1589450777273_85076" {1963cde6-8543-4149-9e3b-5d0040276680}
"vagrant_iscsi_1589451087612_56233" {f0e37319-5340-4f54-84ea-1ea899bfa725}
"mds1" {549239d8-eada-48fc-ac63-120ddf30e61a}
"mds2" {0e538a4a-49d9-4418-ad30-7e6bc6dd71d6}
"oss1" {a3d701a8-c58a-41e8-832d-beb9eebbde35}
"oss2" {cff353d3-d977-4595-85ff-d85bbefd8bc8}

ISCSI VM Info:

# vboxmanage showvminfo f0e37319-5340-4f54-84ea-1ea899bfa725
Name:                        vagrant_iscsi_1589451087612_56233
Groups:                      /
Guest OS:                    Red Hat (64-bit)
UUID:                        f0e37319-5340-4f54-84ea-1ea899bfa725
Config file:                 /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/vagrant_iscsi_1589451087612_56233.vbox
Snapshot folder:             /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots
Log folder:                  /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Logs
Hardware UUID:               f0e37319-5340-4f54-84ea-1ea899bfa725
Memory size:                 1024MB
Page Fusion:                 disabled
VRAM size:                   16MB
CPU exec cap:                100%
HPET:                        disabled
CPUProfile:                  host
Chipset:                     piix3
Firmware:                    BIOS
Number of CPUs:              4
PAE:                         enabled
Long Mode:                   enabled
Triple Fault Reset:          disabled
APIC:                        enabled
X2APIC:                      enabled
Nested VT-x/AMD-V:           disabled
CPUID Portability Level:     0
CPUID overrides:             None
Boot menu mode:              message and menu
Boot Device 1:               Floppy
Boot Device 2:               DVD
Boot Device 3:               HardDisk
Boot Device 4:               Not Assigned
ACPI:                        enabled
IOAPIC:                      enabled
BIOS APIC mode:              APIC
Time offset:                 0ms
RTC:                         UTC
Hardware Virtualization:     enabled
Nested Paging:               enabled
Large Pages:                 disabled
VT-x VPID:                   enabled
VT-x Unrestricted Exec.:     enabled
Paravirt. Provider:          Default
Effective Paravirt. Prov.:   KVM
State:                       stopping (since 2020-05-14T11:49:07.619000000)
Graphics Controller:         VBoxVGA
Monitor count:               1
3D Acceleration:             disabled
2D Video Acceleration:       disabled
Teleporter Enabled:          disabled
Teleporter Port:             0
Teleporter Address:
Teleporter Password:
Tracing Enabled:             disabled
Allow Tracing to Access VM:  disabled
Tracing Configuration:
Autostart Enabled:           disabled
Autostart Delay:             0
Default Frontend:
VM process priority:         default
Storage Controller Name (0):            IDE
Storage Controller Type (0):            PIIX4
Storage Controller Instance Number (0): 0
Storage Controller Max Port Count (0):  2
Storage Controller Port Count (0):      2
Storage Controller Bootable (0):        on
Storage Controller Name (1):            SATA Controller
Storage Controller Type (1):            IntelAhci
Storage Controller Instance Number (1): 0
Storage Controller Max Port Count (1):  30
Storage Controller Port Count (1):      30
Storage Controller Bootable (1):        on
IDE (0, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{d3c2ab24-f96a-47d1-ae89-a76483c2637d}.vmdk (UUID: d3c2ab24-f96a-47d1-ae89-a76483c2637d)
SATA Controller (1, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{c3e871d5-eac0-44c1-b2e2-ba8344476243}.vdi (UUID: c3e871d5-eac0-44c1-b2e2-ba8344476243)
SATA Controller (2, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{4b54e623-0786-4245-b002-8d83fdc687b7}.vdi (UUID: 4b54e623-0786-4245-b002-8d83fdc687b7)
SATA Controller (3, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{37d82aa8-ad45-492e-ae6e-b2d35975ac8e}.vdi (UUID: 37d82aa8-ad45-492e-ae6e-b2d35975ac8e)
SATA Controller (4, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{d38fe751-6cb9-4451-9b8b-3f3dc5d9f97c}.vdi (UUID: d38fe751-6cb9-4451-9b8b-3f3dc5d9f97c)
SATA Controller (5, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{bc438163-7211-4b34-b8a3-c4eb40cabf4b}.vdi (UUID: bc438163-7211-4b34-b8a3-c4eb40cabf4b)
SATA Controller (6, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{ea778a88-6737-4a28-a108-b1223e134612}.vdi (UUID: ea778a88-6737-4a28-a108-b1223e134612)
SATA Controller (7, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{e2e2d982-69e7-4c93-9479-6108bd216a9e}.vdi (UUID: e2e2d982-69e7-4c93-9479-6108bd216a9e)
SATA Controller (8, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{6879a5a4-c0c3-4959-ac6e-46c4cae92dd2}.vdi (UUID: 6879a5a4-c0c3-4959-ac6e-46c4cae92dd2)
SATA Controller (9, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{752abca8-5b7c-4dfe-8e68-007898f26f8d}.vdi (UUID: 752abca8-5b7c-4dfe-8e68-007898f26f8d)
SATA Controller (10, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{44284e5b-a1d7-45e9-ba89-38762678be2f}.vdi (UUID: 44284e5b-a1d7-45e9-ba89-38762678be2f)
SATA Controller (11, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{621f2941-df3f-4f75-8d02-46df470be8c2}.vdi (UUID: 621f2941-df3f-4f75-8d02-46df470be8c2)
SATA Controller (12, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{a25d1fba-d820-4cb3-9ee2-d3ead9b563e2}.vdi (UUID: a25d1fba-d820-4cb3-9ee2-d3ead9b563e2)
SATA Controller (13, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{894f4abd-ce35-494a-a732-62dfc00a2f9f}.vdi (UUID: 894f4abd-ce35-494a-a732-62dfc00a2f9f)
SATA Controller (14, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{08220246-b944-42e0-b276-17a479f47bf9}.vdi (UUID: 08220246-b944-42e0-b276-17a479f47bf9)
SATA Controller (15, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{07afea7a-37a7-471b-b6d4-cfbd8c8633d1}.vdi (UUID: 07afea7a-37a7-471b-b6d4-cfbd8c8633d1)
SATA Controller (16, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{add5fc9f-8cb1-4685-8254-bff8aac0ea9d}.vdi (UUID: add5fc9f-8cb1-4685-8254-bff8aac0ea9d)
SATA Controller (17, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{a6ceb742-7b8d-4911-b5aa-275e20ecfdf1}.vdi (UUID: a6ceb742-7b8d-4911-b5aa-275e20ecfdf1)
SATA Controller (18, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{e9bd8017-62d2-4d3d-bed8-6cca940b4b78}.vdi (UUID: e9bd8017-62d2-4d3d-bed8-6cca940b4b78)
SATA Controller (19, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{514d8dc1-8afb-4269-8157-9127e84a59b1}.vdi (UUID: 514d8dc1-8afb-4269-8157-9127e84a59b1)
SATA Controller (20, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{a8ca16b9-9c1e-4cfa-9694-3ff4fa2b8de1}.vdi (UUID: a8ca16b9-9c1e-4cfa-9694-3ff4fa2b8de1)
SATA Controller (21, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{b086124c-3d81-4460-9022-24673a3df295}.vdi (UUID: b086124c-3d81-4460-9022-24673a3df295)
SATA Controller (22, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{cfb032f9-a0f3-4a48-bfdf-509976375441}.vdi (UUID: cfb032f9-a0f3-4a48-bfdf-509976375441)
SATA Controller (23, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{bf11010b-5a4d-46be-b710-7c8851402b9e}.vdi (UUID: bf11010b-5a4d-46be-b710-7c8851402b9e)
SATA Controller (24, 0): /root/VirtualBox VMs/vagrant_iscsi_1589451087612_56233/Snapshots/{249cd07f-44bb-4ff0-b87c-2b91de3d7df4}.vdi (UUID: 249cd07f-44bb-4ff0-b87c-2b91de3d7df4)
NIC 1:                       MAC: 525400FD3221, Attachment: NAT, Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none
NIC 1 Settings:  MTU: 0, Socket (send: 64, receive: 64), TCP Window (send:64, receive: 64)
NIC 1 Rule(0):   name = ssh, protocol = tcp, host ip = 127.0.0.1, host port = 2222, guest ip = , guest port = 22
NIC 2:                       MAC: 080027F2ED14, Attachment: Internal Network 'iscsi-net', Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none
NIC 3:                       MAC: 080027CF2089, Attachment: Internal Network 'iscsi-net', Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none
NIC 4:                       disabled
NIC 5:                       disabled
NIC 6:                       disabled
NIC 7:                       disabled
NIC 8:                       disabled
Pointing Device:             PS/2 Mouse
Keyboard Device:             PS/2 Keyboard
UART 1:                      disabled
UART 2:                      disabled
UART 3:                      disabled
UART 4:                      disabled
LPT 1:                       disabled
LPT 2:                       disabled
Audio:                       disabled
Audio playback:              disabled
Audio capture:               disabled
Clipboard Mode:              disabled
Drag and drop Mode:          disabled
Session name:                headless
Video mode:                  720x400x32 at 0,0 enabled
VRDE:                        disabled
OHCI USB:                    disabled
EHCI USB:                    disabled
xHCI USB:                    disabled

USB Device Filters:

<none>

Available remote USB devices:

<none>

Currently Attached USB Devices:

<none>

Bandwidth groups:  <none>

Shared folders:<none>

VRDE Connection:             not active
Clients so far:              0

Capturing:                   not active
Capture audio:               not active
Capture screens:
Capture file:                /root/VirtualBox VMs/temp_clone_1589451086236_4949/temp_clone_1589451086236_4949.webm
Capture dimensions:          1024x768
Capture rate:                512kbps
Capture FPS:                 25kbps
Capture options:

Guest:

Configured memory balloon size: 0MB
OS type:                     RedHat_64
Additions run level:         0

Guest Facilities:

No active facilities.

Snapshots:

   Name: bare (UUID: 0af561a0-bd5f-4f1e-8fbe-7b1d0e2c8d72)
      Name: iml-installed (UUID: acf3acb4-9869-4faf-9be0-14a4e5813a84)
         Name: servers-deployed (UUID: c6f8e61d-27e6-44a4-9103-8dc01e51f976) *

Notice that the state of the VM is "Stopping" (it's been this way for over an hour). I generated a VirtualBox bug report to see what I could find; as far as the iscsi node is concerned, nothing stood out. However, the VBoxSVC.log (see attachment) had some very interesting information:

05:32:26.351475 nspr-3   ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND (0x80bb0001) aIID={85632c68-b5bb-4316-a900-5eb28d3413df} aComponent={SessionMachine} aText={No storage device attached to device slot 1 on port 1 of controller 'IDE'}, preserve=false aResultDetail=0

There are many errors in the log regarding 85632c68-b5bb-4316-a900-5eb28d3413df. If I look at the current list of virtual HDDs, I can't find this ID referenced anywhere. If I understand correctly, it looks like this is the resource the service is getting hung up on; I'm just not able to determine why it's getting hung up here. Since I'm seeing this when running vagrant destroy, I'm wondering if the command is not cleaning up all of the resources. Please see the attached VBox bug report:

2020-05-14-15-54-24-bugreport.zip
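
For reference, checking whether an ID like that corresponds to any registered medium can be done with stock vboxmanage queries (a sketch, not taken from the report):

# Search the registered hard disks for the UUID
vboxmanage list hdds | grep -i -B4 85632c68-b5bb-4316-a900-5eb28d3413df

# Or query it directly; an unknown UUID produces an error
vboxmanage showmediuminfo disk 85632c68-b5bb-4316-a900-5eb28d3413df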

Please let me know what you think.

Regards,

Will

johnsonw commented 4 years ago

Good afternoon! Just to follow up, I am running vagrant destroy -f so destruction is forced on each node. Do you feel that it would be safer to do a vagrant destroy as opposed to forcing the destroy?

jbonhag commented 4 years ago

Hi @johnsonw,

Thanks for the additional information. I took a look at those logs but I couldn't find the source of that unknown ID either. Unfortunately we still haven't been able to replicate the issue.

However, all is not lost. You might be able to automate the disk cleanup by using a typed trigger. A trigger is a small bit of code that runs before or after a Vagrant action. In this case, you can add an action trigger to run just before the machine is deregistered:

iscsi.trigger.before :"VagrantPlugins::ProviderVirtualBox::Action::Destroy" do |trigger|
  trigger.run = {path: "./scripts/cleanup_disks.sh"}
end

If you save these steps as scripts/cleanup_disks.sh:

  1. Detach all SATA storage controllers manually
  2. Remove all media manually

then Vagrant will run them on the iscsi machine automatically when you perform a vagrant destroy. One caveat: typed triggers are experimental, so you will have to set the environment variable VAGRANT_EXPERIMENTAL="typed_triggers" in order to use them.
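
A minimal sketch of what scripts/cleanup_disks.sh might contain, assuming it runs on the host and that the VM-name pattern, controller name, and vdisks location all follow the Vagrantfile above (each of those is an assumption, not something verified against this setup):

#!/bin/bash
# Hypothetical host-side cleanup; names and paths are assumptions, not verified.

# VirtualBox name of the Vagrant-created iscsi VM
VM=$(vboxmanage list vms | grep _iscsi_ | tr -d '"' | awk '{print $1}')

# Detach every disk hanging off the SATA controller (ports 1-24 in this setup)
for port in $(seq 1 24); do
  vboxmanage storageattach "$VM" --storagectl "SATA Controller" \
    --port "$port" --device 0 --medium none || true
done

# Remove the backing media from the registry and delete the files
MACHINE_FOLDER=$(vboxmanage list systemproperties | sed -n 's/^Default machine folder: *//p')
for vdi in "$MACHINE_FOLDER"/vdisks/*.vdi; do
  vboxmanage closemedium disk "$vdi" --delete || true
done

With the experimental flag set, the destroy would then be run as, for example, VAGRANT_EXPERIMENTAL="typed_triggers" vagrant destroy -f iscsi.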

The only difference between vagrant destroy -f and vagrant destroy is that adding the -f option bypasses the [y/N] confirmation. It shouldn't have any effect on the success of the destruction.

jbonhag commented 4 years ago

Hey there,

I am going to close this issue as it doesn't seem to be caused by Vagrant. If you're able to provide a reproducible Vagrantfile, we'd be happy to take another look at the issue. Thanks! 😄

johnsonw commented 4 years ago

Hi @jbonhag,

Sure, that makes sense. In case it helps others, we are now suspending all nodes before destroying them. I can't say why this makes a difference, but we haven't seen the issue since making this change. Thanks again for your time.
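
For reference, that workaround amounts to something like:

vagrant suspend
vagrant destroy -f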

Regards,

Will

jbonhag commented 4 years ago

Thanks for understanding, Will. Happy to hear you found a workaround to the problem. Cheers!

ghost commented 4 years ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.