josenk / vagrant-vmware-esxi

A Vagrant plugin that adds VMware ESXi provider support.
GNU General Public License v3.0

Unable to get list of Disk Stores #41

Closed haiderim closed 6 years ago

haiderim commented 6 years ago

vagrant version

Installed Version: 2.1.1
Latest Version: 2.1.1

You're running an up-to-date version of Vagrant!

lsb_release -a

LSB Version:    :core-4.1-amd64:core-4.1-noarch
Distributor ID: Fedora
Description:    Fedora release 27 (Twenty Seven)
Release:    27
Codename:   TwentySeven

My Vagrantfile

#
#  Fully documented Vagrantfile available
#  in the wiki:  https://github.com/josenk/vagrant-vmware-esxi/wiki
Vagrant.configure('2') do |config|

  #  Box, Select any box created for VMware that is compatible with
  #    the ovftool.  To get maximum compatibility You should download
  #    and install the latest version of ovftool for your OS.
  #    https://www.vmware.com/support/developer/ovf/
  #
  #    If your box is stuck at 'Powered On', then most likely
  #    the box/vm doesn't have the vmware tools installed.
  #
  # Here are some of the MANY examples....
  config.vm.box = 'centos/7'
  #config.vm.box = 'generic/centos6'
  #config.vm.box = 'generic/fedora27'
  #config.vm.box = 'generic/freebsd11'
  #config.vm.box = 'generic/ubuntu1710'
  #config.vm.box = 'generic/debian9'
  #config.vm.box = 'hashicorp/precise64'
  #config.vm.box = 'steveant/CentOS-7.0-1406-Minimal-x64'
  #config.vm.box = 'geerlingguy/centos7'
  #config.vm.box = 'geerlingguy/ubuntu1604'
  #config.vm.box = 'laravel/homestead'
  #config.vm.box = 'puphpet/debian75-x64'

  #  Use rsync and NFS synced folders. (or use the option to disable them)
  #    https://www.vagrantup.com/docs/synced-folders/
  #config.vm.synced_folder('.', '/vagrant', type: 'rsync')
  config.vm.synced_folder('.', '/vagrant', type: 'nfs', disabled: true)

  #  Vagrant can configure additional network interfaces using a static IP or
  #  DHCP. Use public_network or private_network to manually set a static IP and
  #  optionally netmask.  ESXi doesn't use the concept of public or private
  #  networks so both are valid here.  The primary network interface is considered the
  #  "vagrant management" interface and cannot be changed and this plugin
  #  supports 4 NICS, so you can specify 3 entries here!
  #
  #  https://www.vagrantup.com/docs/networking/public_network.html
  #  https://www.vagrantup.com/docs/networking/private_network.html
  #
  #    *** Invalid settings could cause 'vagrant up' to fail ***
  #config.vm.network 'private_network', ip: '192.168.10.170', netmask: '255.255.255.0'
  #config.vm.network 'private_network', ip: '192.168.11.170'
  #config.vm.network 'public_network', ip: '192.168.12.170'

  #
  #  Provider (esxi) settings
  #
  config.vm.provider :vmware_esxi do |esxi|

    #  REQUIRED!  ESXi hostname/IP
    esxi.esxi_hostname = '192.168.1.110'

    #  ESXi username
    esxi.esxi_username = 'root'

    #  IMPORTANT!  Set ESXi password.
    #    1) 'prompt:'
    #    2) 'file:'  or  'file:my_secret_file'
    #    3) 'env:'  or 'env:my_secret_env_var'
    #    4) 'key:'  or  'key:~/.ssh/some_ssh_private_key'
    #    5) or esxi.esxi_password = 'my_esxi_password'
    #
    esxi.esxi_password = 'esxi@admin'

    #  SSH port.
    #esxi.esxi_hostport = 22

    #  HIGHLY RECOMMENDED!  ESXi Virtual Network
    #    You should specify an ESXi Virtual Network!  If it's not specified, the
    #    default is to use the first found.  You can specify up to 4 virtual
    #    networks using an array format.
    #esxi.esxi_virtual_network = ['VM Network','VM Network2','VM Network3','VM Network4']

    #  OPTIONAL.  Specify a Disk Store
    esxi.esxi_disk_store = 'DS-1'

    #  OPTIONAL.  Resource Pool
    #     Vagrant will NOT create a Resource pool for you.
    #esxi.esxi_resource_pool = '/Vagrant'

    #  Optional. Specify a VM to clone instead of uploading a box.
    #    Vagrant can use any stopped VM as the source 'box'.   The VM must be
    #    registered, stopped and must have the vagrant insecure ssh key installed.
    #    If the VM is stored in a resource pool, it must be specified.
    #    See wiki: https://github.com/josenk/vagrant-vmware-esxi/wiki/How-to-clone_from_vm
    #esxi.clone_from_vm = 'resource_pool/source_vm'

    #  OPTIONAL.  Guest VM name to use.
    #    The Default will be automatically generated.
    esxi.guest_name = 'IN01-V-Cent7'

    #  OPTIONAL.  When automatically naming VMs, use this prefix.
    #esxi.guest_name_prefix = 'V-'

    #  OPTIONAL.  Set the guest username login.  The default is 'vagrant'.
    esxi.guest_username = 'dragon'

    #  OPTIONAL.  Memory size override
    esxi.guest_memsize = '4096'

    #  OPTIONAL.  Virtual CPUs override
    esxi.guest_numvcpus = '2'

    #  OPTIONAL & RISKY.  Specify up to 4 MAC addresses
    #    The default is for ovftool to automatically generate a MAC address.
    #    You can specify an array of MAC addresses using upper or lower case,
    #    separated by colons ':'.
    #esxi.guest_mac_address = ['00:50:56:aa:bb:cc', '00:50:56:01:01:01','00:50:56:02:02:02','00:50:56:BE:AF:01' ]

    #   OPTIONAL & RISKY.  Specify a guest_nic_type
    #     The validated list of guest_nic_types are 'e1000', 'e1000e', 'vmxnet',
    #     'vmxnet2', 'vmxnet3', 'Vlance', and 'Flexible'.
    esxi.guest_nic_type = 'vmxnet3'

    #  OPTIONAL. Specify a disk type.
    #    If unspecified, it will be set to 'thin'.  Otherwise, you can set to
    #    'thin', 'thick', or 'eagerzeroedthick'
    esxi.guest_disk_type = 'thin'

    #  OPTIONAL. Boot disk size.
    #    If unspecified, the boot disk size will be the same as the original
    #    box.  You can specify a larger boot disk size in GB.  The extra disk space
    #    will NOT automatically be available to your OS.  You will need to
    #    create or modify partitions, LVM and/or filesystems.
    esxi.guest_boot_disk_size = 40

    #  OPTIONAL.  Create additional storage for guests.
    #    You can specify an array of up to 13 virtual disk sizes (in GB) that you
    #    would like the provider to create once the guest has been created.
    #esxi.guest_storage = [10,20]

    #  OPTIONAL. specify snapshot options.
    #esxi.guest_snapshot_includememory = 'true'
    #esxi.guest_snapshot_quiesced = 'true'

    #  RISKY. guest_guestos
    #    https://github.com/josenk/vagrant-vmware-esxi/ESXi_guest_guestos_types.md
    #esxi.guest_guestos = 'centos-64'

    #  OPTIONAL. guest_virtualhw_version
    #    ESXi 6.5 supports these versions. 4,7,8,9,10,11,12 & 13.
    esxi.guest_virtualhw_version = '13'

    #  RISKY. guest_custom_vmx_settings
    #esxi.guest_custom_vmx_settings = [['vhv.enable','TRUE'], ['floppy0.present','TRUE']]

    #  OPTIONAL. local_lax
    #esxi.local_lax = 'true'

    #  OPTIONAL. Guest IP Caching
    #esxi.local_use_ip_cache = 'True'

    #  DANGEROUS!  Allow Overwrite
    #    If unspecified, the default is to produce an error if overwriting
    #    vm's and packages.
    #esxi.local_allow_overwrite = 'True'

    #  Plugin debug output.
    #    Please send any bug reports with this debug output...
    esxi.debug = 'true ip vmx'

  end
end

vagrant up --provider=vmware_esxi


RUBY_PLATFORM: x86_64-linux
Testing esxi connectivity
==> default: --- ESXi version    : VMware ESXi 6.5.0 build-7967591
Bringing machine 'default' up with 'vmware_esxi' provider...
==> default: Virtual Machine will be built.
VMware ovftool 4.3.0 (build-7948156)
==> default: --- Avail DS vols   : []
There was an error talking to ESXi.
  Unable to get list of Disk Stores:

vagrant ssh


RUBY_PLATFORM: x86_64-linux
Testing esxi connectivity
==> default: --- ESXi version    : VMware ESXi 6.5.0 build-7967591
The provider for this Vagrant-managed machine is reporting that it
is not yet ready for SSH. Depending on your provider this can carry
different meanings. Make sure your machine is created and running and
try again. Additionally, check the output of `vagrant status` to verify
that the machine is in the state that you expect. If you continue to
get this error message, please view the documentation for the provider
you're using.

vagrant status

RUBY_PLATFORM: x86_64-linux
Testing esxi connectivity
==> default: --- ESXi version    : VMware ESXi 6.5.0 build-7967591
Current machine states:

default                   not created (vmware_esxi)

No Virtual Machines are created.

I don't know what's wrong here as I can ssh into the host just fine.

josenk commented 6 years ago

The plugin is unable to find any Disk Stores... Do you have any configured?

ssh to your esxi box and run the following commands. Update here with the results.

# esxcli storage filesystem list
# df

haiderim commented 6 years ago

It's indeed weird: I can't query the datastore using either esxcli or df, but I can browse it in the web view. I'll try another host and report back.

haiderim commented 6 years ago

So I rebooted the host and it worked. Thanks for pointing it out; I never would've guessed, as everything looked OK in the web view.

josenk commented 6 years ago

Thanks for confirming your workaround (the reboot). I've seen this problem a few times now, and a reboot has always fixed it. Right now I'm guessing there were disk I/O errors on one or more of your Disk Stores that caused this, but I'm not 100% sure.

I'm interested to see if a rescan fixes this issue. If you or anyone else has a similar issue (one or more Disk Stores are not seen by the esxcli and/or df commands), please run the rescan command and update this thread with the details.

esxcli storage filesystem rescan

johnoooo commented 6 years ago

I observed the same behavior on one of my ESXi hosts.

For me, it happened randomly, even in a multi-machine setup: for example, it affected 2 out of 5 machines, each with a similar configuration. Re-running the deployment a few times then led to success. So the behavior is not deterministically reproducible.

When it fails, the command esxcli storage filesystem list prints this message: "Error on command storage filesystem list. Error was Invalid name specified for MetaStructure: 'FilesystemVolume'"

I got that message by intercepting esxcli storage filesystem list in the createvm.rb action with a tee command, as follows:

            #  Figure out DataStore
            r = ssh.exec!(
                    'esxcli storage filesystem list | tee "$(mktemp -p /tmp)" | grep "/vmfs/volumes/.*[VMFS|NFS]" | '\
                    "sort -nk7 | awk '{print $2}'")

I never saw the error when SSHing directly to the ESXi host and running the command as @josenk proposed above.
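A minimal sketch, in Ruby, of what the shell pipeline above appears to be doing, with one change: it raises when esxcli prints the error instead of silently returning nothing. In the pipeline, grep drops the error line, which would explain the empty "Avail DS vols : []" in the original report. Also note that in grep, [VMFS|NFS] is a bracket expression that matches any single one of those characters, not the alternation "VMFS or NFS"; the sketch assumes the alternation was intended. The names datastore_names and EsxcliError are hypothetical and not part of the plugin, and the column layout (mount point, volume name, UUID, mounted, type, size, free) is an assumption that only holds for volume names without spaces.

```ruby
# Sketch only: EsxcliError and datastore_names are hypothetical names,
# not part of vagrant-vmware-esxi.
class EsxcliError < StandardError; end

# Takes the raw text printed by `esxcli storage filesystem list` and returns
# the volume names of VMFS/NFS datastores, ordered by free space (ascending),
# which mirrors `sort -nk7 | awk '{print $2}'` in the pipeline above.
def datastore_names(output)
  # Surface the failure instead of letting it be filtered out.
  if output.include?('Error on command')
    raise EsxcliError, "esxcli failed: #{output.strip}"
  end

  rows = output.each_line.map(&:split).select do |fields|
    # Assumed columns: 0 mount point, 1 volume name, 2 UUID, 3 mounted,
    # 4 type, 5 size, 6 free. Rows with fewer fields (headers, volumes
    # without a name) are skipped.
    fields.length >= 7 &&
      fields[0].start_with?('/vmfs/volumes/') &&
      fields[4].match?(/\A(?:VMFS|NFS)/)
  end

  rows.sort_by { |fields| fields[6].to_i }.map { |fields| fields[1] }
end
```

With this shape, a caller would see the MetaStructure message directly rather than an empty list of Disk Stores.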

Googling the error message turns up one direct hint from VMware (here), which leads to a patch. In the meantime, I installed all patches released since ESXi 5.5u1, and the host now seems to behave well. I ran 5 deployments of that multi-machine setup, with two reboots of the host, and the failure did not show up again. I will keep you updated. Keep your fingers crossed :-)

johnoooo commented 6 years ago

p.s. I observed this behavior only on one of my ESXi hosts, which runs ESXi 5.5 on an HP ProLiant DL580 G5. I cannot remember having seen it on the other host, which now runs ESXi 6.7 and previously ran ESXi 6.5.0a. I can see from the ESXi build number in the case above that it occurred on ESXi 6.5 U1g, which is in between my 6.5.0a and the 6.7. So I am not sure whether the patches and updates really solve the problem.

josenk commented 6 years ago

Great to know a patch could fix the problem...

haiderim commented 6 years ago

@josenk esxcli storage filesystem rescan did fix the problem for me. Sorry for the delayed update.
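For anyone who wants to script this workaround, here is a minimal Ruby sketch of "list, and if nothing comes back, rescan and list again". It is an illustration only, not how the plugin behaves. The run argument is a hypothetical callable that executes a command on the ESXi host (for example over SSH) and returns its stdout as a String, and list_with_rescan is a made-up name. Whether a rescan always clears the condition is not established in this thread; it worked in one reported case.

```ruby
# Sketch only: list_with_rescan and the `run` callable are hypothetical.
LIST_CMD   = 'esxcli storage filesystem list'
RESCAN_CMD = 'esxcli storage filesystem rescan'

# Returns the output lines that describe mounted volumes. If none are found,
# issues a rescan and tries again, up to `attempts` listings in total.
def list_with_rescan(run, attempts: 2)
  attempts.times do |i|
    output  = run.call(LIST_CMD)
    volumes = output.lines.select { |line| line.start_with?('/vmfs/volumes/') }
    return volumes unless volumes.empty?

    # Only rescan if another listing attempt will follow.
    run.call(RESCAN_CMD) if i < attempts - 1
  end
  []
end
```

An empty result after the final attempt would then be the point to fall back to the reboot mentioned earlier in the thread.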