saltstack / salt

Software to automate the management and configuration of any infrastructure or application at scale. Get access to the Salt software package repository here:
https://repo.saltproject.io/
Apache License 2.0

[BUG] salt-cloud show_instance in cloud_created reactor cannot locate newly-created VM #58869

Open ggiesen opened 3 years ago

ggiesen commented 3 years ago

Description

When creating a new VM using the vmware driver in salt-cloud, I use a reactor Python script, triggered by the salt/cloud/*/created event, to fetch VM details. I've tried querying with both the Python client API:

vmware_data = cclient.action(fun="show_instance", instance=data["name"])

as well as the internal salt function:

vmware_data = __salt__["cloud.action"](fun="show_instance", instance=data["name"])
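
For reference, the same query can be run from a standalone master-side script; a minimal sketch (the cloud config path and VM name are taken from this report):

    # Minimal sketch: query a VM through Salt's cloud client API
    # from a script running on the salt-master.
    import pprint

    from salt.cloud import CloudClient

    cclient = CloudClient(path="/etc/salt/cloud")
    vmware_data = cclient.action(fun="show_instance", instance="dc1newhost01")
    pprint.pprint(vmware_data)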

I have four VMware providers defined (4 different DCs), and when running the reactor script, I get:

[DEBUG   ] dc2somehost01 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc2somehost02 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc2somehost03 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc2somehost04 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc2somehost05 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc2somehost06 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc2somehost07 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc2somehost08 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc2somehost09 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc2somehost10 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc3somehost01 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc3somehost02 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc3somehost03 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc3somehost04 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc3somehost05 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc3somehost06 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc3somehost07 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc3somehost08 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc3somehost09 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc3somehost10 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc4somehost01 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc4somehost02 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc4somehost03 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc4somehost04 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc4somehost05 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc4somehost06 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc4somehost07 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc4somehost08 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc4somehost09 in provider:vmware is not in name list:'{'dc1newhost01'}'
[DEBUG   ] dc4somehost10 in provider:vmware is not in name list:'{'dc1newhost01'}'
VMware Data:
{'Not Actioned/Not Running': ['dc1newhost01'],
 'Not Found': ['dc1newhost01']}
[DEBUG   ] This salt-master instance has accepted 61 minion keys.
[DEBUG   ] Could not LazyLoad parallels.avail_sizes: 'parallels' __virtual__ returned False
[DEBUG   ] LazyLoaded parallels.avail_locations
[DEBUG   ] LazyLoaded proxmox.avail_sizes
[DEBUG   ] Could not LazyLoad saltify.optimize_providers: 'saltify.optimize_providers' is not available.
[DEBUG   ] The 'saltify' cloud driver is unable to be optimized.
[DEBUG   ] Could not LazyLoad vmware.optimize_providers: 'vmware.optimize_providers' is not available.
[DEBUG   ] The 'vmware' cloud driver is unable to be optimized.
[DEBUG   ] Could not LazyLoad saltify.list_nodes_min: 'saltify.list_nodes_min' is not available.
[DEBUG   ] Could not LazyLoad saltify.list_nodes_min: 'saltify.list_nodes_min' is not available.
[DEBUG   ] Could not LazyLoad parallels.avail_sizes: 'parallels' __virtual__ returned False
[DEBUG   ] LazyLoaded parallels.avail_locations
[DEBUG   ] LazyLoaded proxmox.avail_sizes
[DEBUG   ] Could not LazyLoad parallels.avail_sizes: 'parallels' __virtual__ returned False
[DEBUG   ] Could not LazyLoad parallels.avail_sizes: 'parallels' __virtual__ returned False
[DEBUG   ] LazyLoaded parallels.avail_locations
[DEBUG   ] LazyLoaded parallels.avail_locations
[DEBUG   ] LazyLoaded proxmox.avail_sizes
[DEBUG   ] LazyLoaded proxmox.avail_sizes
[DEBUG   ] Reading configuration from /etc/salt/master
[DEBUG   ] Could not LazyLoad parallels.avail_sizes: 'parallels' __virtual__ returned False
[DEBUG   ] LazyLoaded parallels.avail_locations
[DEBUG   ] Reading configuration from /etc/salt/master
[DEBUG   ] Could not LazyLoad parallels.avail_sizes: 'parallels' __virtual__ returned False
[DEBUG   ] LazyLoaded proxmox.avail_sizes
[DEBUG   ] Including configuration from '/etc/salt/master.d/clmon.conf'
[DEBUG   ] Reading configuration from /etc/salt/master.d/clmon.conf
[DEBUG   ] LazyLoaded parallels.avail_locations
[DEBUG   ] LazyLoaded proxmox.avail_sizes
[DEBUG   ] Including configuration from '/etc/salt/master.d/git_pillar.conf'
[DEBUG   ] Reading configuration from /etc/salt/master.d/git_pillar.conf
[DEBUG   ] Including configuration from '/etc/salt/master.d/gitfs.conf'
[DEBUG   ] Reading configuration from /etc/salt/master.d/gitfs.conf
[DEBUG   ] Including configuration from '/etc/salt/master.d/mon.conf'
[DEBUG   ] Reading configuration from /etc/salt/master.d/mon.conf
[DEBUG   ] Could not LazyLoad parallels.avail_sizes: 'parallels' __virtual__ returned False
[DEBUG   ] LazyLoaded parallels.avail_locations
[DEBUG   ] Including configuration from '/etc/salt/master.d/git_pillar.conf'
[DEBUG   ] LazyLoaded proxmox.avail_sizes
[DEBUG   ] Reading configuration from /etc/salt/master.d/git_pillar.conf
[DEBUG   ] Failed to execute 'vmware.list_nodes_min()' while querying for running nodes: [Errno 32] Broken pipe
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/salt/cloud/__init__.py", line 2367, in run_parallel_map_providers_query
    salt.utils.data.simple_types_filter(cloud.clouds[data["fun"]]()),
  File "/usr/lib/python3.6/site-packages/salt/cloud/clouds/vmware.py", line 1953, in list_nodes_min
    _get_si(), vim.VirtualMachine, vm_properties
  File "/usr/lib/python3.6/site-packages/salt/cloud/clouds/vmware.py", line 265, in _get_si
    url, username, password, protocol=protocol, port=port
  File "/usr/lib/python3.6/site-packages/salt/utils/vmware.py", line 464, in get_service_instance
    service_instance.CurrentTime()
  File "/usr/lib/python3.6/site-packages/pyVmomi/VmomiSupport.py", line 706, in <lambda>
    self.f(*(self.args + (obj,) + args), **kwargs)
  File "/usr/lib/python3.6/site-packages/pyVmomi/VmomiSupport.py", line 512, in _InvokeMethod
    return self._stub.InvokeMethod(self, info, args)
  File "/usr/lib/python3.6/site-packages/pyVmomi/SoapAdapter.py", line 1350, in InvokeMethod
    conn.request('POST', self.path, req, headers)
  File "/usr/lib64/python3.6/http/client.py", line 1254, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib64/python3.6/http/client.py", line 1300, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib64/python3.6/http/client.py", line 1249, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib64/python3.6/http/client.py", line 1036, in _send_output
    self.send(msg)
  File "/usr/lib64/python3.6/http/client.py", line 996, in send
    self.sock.sendall(data)
  File "/usr/lib64/python3.6/ssl.py", line 934, in sendall
    v = self.send(byte_view[count:])
  File "/usr/lib64/python3.6/ssl.py", line 903, in send
    return self._sslobj.write(data)
  File "/usr/lib64/python3.6/ssl.py", line 601, in write
    return self._sslobj.write(data)
BrokenPipeError: [Errno 32] Broken pipe

Regardless of which DC I provision the new VM in, I get messages that the host cannot be located in the other three DCs.
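
A quick way to see what each configured provider currently reports is to dump the inexpensive node list for all of them; a sketch, assuming a CloudClient as in the reactor snippet below, and that its query() wrapper behaves like the CLI's --query path:

    # Sketch: ask every configured provider for its current node list,
    # to see which DC has actually registered the new VM.
    nodes = cclient.query()  # default query_type is "list_nodes"
    pprint.pprint(nodes)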

I can, however, run salt-cloud to query the instance (while the reactor is still looping in debug trying to find the VM) and it works perfectly:

salt-cloud -a show_instance dc1newhost01
The following virtual machines are set to be actioned with "show_instance":
  dc1newhost01

Proceed? [N/y] y
... proceeding
YOR:
    ----------
    vmware:
        ----------
        dc1newhost01:
            ----------
            devices:
                ----------
                CD/DVD drive 1:
                    ----------
                    allowGuestControl:
                        True
                    connected:
                        False
                    controllerKey:
                        15000
                    key:
                        16000
                    label:
                        CD/DVD drive 1
                    startConnected:
                        False
                    status:
                        ok
                    summary:
                        Remote ATAPI
                    type:
                        VirtualCdrom
                Hard disk 1:
                    ----------
                    capacityInKB:
                        31457280
                    controllerKey:
                        1000
                    diskMode:
                        persistent
                    fileName:
                        [storage01] dc1newhost01/dc1newhost01.vmdk
                    key:
                        2000
                    label:
                        Hard disk 1
                    summary:
                        31,457,280 KB
                    type:
                        VirtualDisk
                IDE 0:
                    ----------
                    busNumber:
                        0
                    deviceKeys:
                    key:
                        200
                    label:
                        IDE 0
                    summary:
                        IDE 0
                    type:
                        VirtualIDEController
                IDE 1:
                    ----------
                    busNumber:
                        1
                    deviceKeys:
                    key:
                        201
                    label:
                        IDE 1
                    summary:
                        IDE 1
                    type:
                        VirtualIDEController
                Keyboard :
                    ----------
                    controllerKey:
                        300
                    key:
                        600
                    label:
                        Keyboard 
                    summary:
                        Keyboard
                    type:
                        VirtualKeyboard
                Network adapter 1:
                    ----------
                    addressType:
                        assigned
                    allowGuestControl:
                        True
                    connected:
                        True
                    controllerKey:
                        100
                    key:
                        4000
                    label:
                        Network adapter 1
                    macAddress:
                        00:50:56:a1:71:d2
                    startConnected:
                        True
                    status:
                        ok
                    summary:
                        DVSwitch: 50 21 42 5a c6 41 6a eb-8d c5 cc c5 07 0d 2d 6a
                    type:
                        VirtualVmxnet3
                    unitNumber:
                        7
                Network adapter 2:
                    ----------
                    addressType:
                        assigned
                    allowGuestControl:
                        True
                    connected:
                        True
                    controllerKey:
                        100
                    key:
                        4001
                    label:
                        Network adapter 2
                    macAddress:
                        00:50:56:a1:0f:35
                    startConnected:
                        True
                    status:
                        ok
                    summary:
                        DVSwitch: 50 21 42 5a c6 41 6a eb-8d c5 cc c5 07 0d 2d 6a
                    type:
                        VirtualVmxnet3
                    unitNumber:
                        8
                PCI controller 0:
                    ----------
                    busNumber:
                        0
                    deviceKeys:
                        - 500
                        - 12000
                        - 1000
                        - 15000
                        - 4000
                        - 4001
                    key:
                        100
                    label:
                        PCI controller 0
                    summary:
                        PCI controller 0
                    type:
                        VirtualPCIController
                PS2 controller 0:
                    ----------
                    busNumber:
                        0
                    deviceKeys:
                        - 600
                        - 700
                    key:
                        300
                    label:
                        PS2 controller 0
                    summary:
                        PS2 controller 0
                    type:
                        VirtualPS2Controller
                Pointing device:
                    ----------
                    controllerKey:
                        300
                    key:
                        700
                    label:
                        Pointing device
                    summary:
                        Pointing device; Device
                    type:
                        VirtualPointingDevice
                    unitNumber:
                        1
                SATA controller 0:
                    ----------
                    busNumber:
                        0
                    controllerKey:
                        100
                    deviceKeys:
                        - 16000
                    key:
                        15000
                    label:
                        SATA controller 0
                    summary:
                        AHCI
                    type:
                        VirtualAHCIController
                    unitNumber:
                        24
                SCSI controller 0:
                    ----------
                    busNumber:
                        0
                    controllerKey:
                        100
                    deviceKeys:
                        - 2000
                    key:
                        1000
                    label:
                        SCSI controller 0
                    summary:
                        VMware paravirtual SCSI
                    type:
                        ParaVirtualSCSIController
                    unitNumber:
                        3
                SIO controller 0:
                    ----------
                    busNumber:
                        0
                    deviceKeys:
                    key:
                        400
                    label:
                        SIO controller 0
                    summary:
                        SIO controller 0
                    type:
                        VirtualSIOController
                VMCI device:
                    ----------
                    controllerKey:
                        100
                    key:
                        12000
                    label:
                        VMCI device
                    summary:
                        Device on the virtual machine PCI bus that provides support for the virtual machine communication interface
                    type:
                        VirtualVMCIDevice
                    unitNumber:
                        17
                Video card :
                    ----------
                    controllerKey:
                        100
                    key:
                        500
                    label:
                        Video card 
                    summary:
                        Video card
                    type:
                        VirtualVideoCard
                    videoRamSizeInKB:
                        8192
            files:
                ----------
                0:
                    ----------
                    key:
                        0
                    name:
                        [storage01] dc1newhost01/dc1newhost01.vmx
                    size:
                        3107
                    type:
                        config
                1:
                    ----------
                    key:
                        1
                    name:
                        [storage01] dc1newhost01/dc1newhost01.nvram
                    size:
                        8684
                    type:
                        nvram
                2:
                    ----------
                    key:
                        2
                    name:
                        [storage01] dc1newhost01/dc1newhost01.vmsd
                    size:
                        0
                    type:
                        snapshotList
                3:
                    ----------
                    key:
                        3
                    name:
                        [storage01] dc1newhost01/dc1newhost01.vmdk
                    size:
                        582
                    type:
                        diskDescriptor
                4:
                    ----------
                    key:
                        4
                    name:
                        [storage01] dc1newhost01/dc1newhost01-flat.vmdk
                    size:
                        1951482880
                    type:
                        diskExtent
                6:
                    ----------
                    key:
                        6
                    name:
                        [storage01] dc1newhost01/dc1newhost01-07a0a545.vswp
                    size:
                        0
                    type:
                        swap
                7:
                    ----------
                    key:
                        7
                    name:
                        [storage01] dc1newhost01/vmx-dc1newhost01-127968581-1.vswp
                    size:
                        0
                    type:
                        uwswap
            guest_id:
                centos7_64Guest
            hostname:
                dc1newhost01
            id:
                dc1newhost01
            image:
                CentOS 7 (64-bit) (Detected)
            mac_addresses:
                - 00:50:56:a1:71:d2
                - 00:50:56:a1:0f:35
            networks:
                ----------
                pgNET1:
                    ----------
                    connected:
                        True
                    ip_addresses:
                        - 192.0.2.2
                        - fe80::250:56ff:fea1:f34
                    mac_address:
                        00:50:56:a1:0f:34
                pgNET2:
                    ----------
                    connected:
                        True
                    ip_addresses:
                        - 192.0.2.6
                        - fe80::250:56ff:fea1:71d1
                    mac_address:
                        00:50:56:a1:71:d2
            path:
                [storage01] dc1newhost01/dc1newhost01.vmx
            private_ips:
                - 192.0.2.2
                - fe80::250:56ff:fea1:71d1
                - 192.0.2.6
                - fe80::250:56ff:fea1:f34
            public_ips:
            size:
                cpu: 1
                ram: 2048 MB
            size_dict:
                ----------
                cpu:
                    1
                memory:
                    2048 MB
            state:
                poweredOn
            storage:
                ----------
                committed:
                    1951495253
                uncommitted:
                    30260771840
                unshared:
                    1951483462
            tools_status:
                toolsOk

Here's a code snippet from my reactor:

    import pprint
    import time

    from salt.cloud import CloudClient

    # "data" is the reactor event payload; PermanentError is defined
    # elsewhere in the reactor script.
    vmware_data = None

    # Retry the whole query up to 10 times if the connection drops.
    for _ in range(10):
        try:
            print("Fetching VM information (show_instance) from VSphere")
            cclient = CloudClient(path="/etc/salt/cloud")
            # Poll until the provider stops reporting the VM as "Not Found".
            while vmware_data is None or "Not Found" in vmware_data.keys():
                vmware_data = cclient.action(fun="show_instance", instance=data["name"])
                # vmware_data = __salt__["cloud.action"](fun="show_instance", instance=data["name"])
                # Debug code
                print("VMware Data:")
                pprint.pprint(vmware_data)
                time.sleep(5.0)
        except (ConnectionResetError, BrokenPipeError):
            print("VSphere server is being janky on show_instance...waiting 5 secs")
            time.sleep(5.0)
            continue
        else:
            break
    else:
        # All 10 attempts hit transport errors; give up for good.
        raise PermanentError()

With the loop I've implemented, the reactor does eventually (20-30 minutes later) complete and return the requested data.
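
If polling is unavoidable, a bounded variant of that loop with exponential backoff would at least fail fast instead of spinning for half an hour; a sketch reusing the names from the snippet above (cclient, data, PermanentError):

    # Hypothetical variant: hard deadline plus exponential backoff.
    deadline = time.monotonic() + 15 * 60  # give up after 15 minutes
    delay = 5.0
    vmware_data = None
    while vmware_data is None or "Not Found" in vmware_data:
        if time.monotonic() > deadline:
            raise PermanentError()
        try:
            vmware_data = cclient.action(fun="show_instance", instance=data["name"])
        except (ConnectionResetError, BrokenPipeError):
            vmware_data = None  # transport hiccup: treat as "not found yet"
        time.sleep(delay)
        delay = min(delay * 2, 60.0)  # back off, capped at one minute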

Setup

/etc/salt/cloud.providers.d/vmware.conf:

    DC1:
      driver: vmware
      user: sdb://secret/vsphere:username
      password: sdb://secret/vsphere:password
      url: 'dc1.example.com'
      protocol: 'https'
      port: 443
    DC2:
      driver: vmware
      user: sdb://secret/vsphere:username
      password: sdb://secret/vsphere:password
      url: 'dc2.example.com'
      protocol: 'https'
      port: 443
    DC3:
      driver: vmware
      user: sdb://secret/vsphere:username
      password: sdb://secret/vsphere:password
      url: 'dc3.example.com'
      protocol: 'https'
      port: 443
    DC4:
      driver: vmware
      user: sdb://secret/vsphere:username
      password: sdb://secret/vsphere:password
      url: 'dc4.example.com'
      protocol: 'https'
      port: 443

Steps to Reproduce the behavior

Create a new VM using the vmware driver, then set up a reactor with the code above.

Expected behavior

salt-cloud locates the VM and the cloud provider returns the requested data.

Versions Report

salt --versions-report (Provided by running salt --versions-report. Please also mention any differences in master/minion versions.)

    Salt Version:
              Salt: 3001.1

    Dependency Versions:
              cffi: 1.14.0
          cherrypy: unknown
          dateutil: 2.6.1
         docker-py: Not Installed
             gitdb: Not Installed
         gitpython: Not Installed
            Jinja2: 2.11.2
           libgit2: 1.0.1
          M2Crypto: 0.35.2
              Mako: Not Installed
      msgpack-pure: Not Installed
    msgpack-python: 0.6.2
      mysql-python: Not Installed
         pycparser: 2.20
          pycrypto: Not Installed
      pycryptodome: Not Installed
            pygit2: 1.3.0
            Python: 3.6.8 (default, Apr 16 2020, 01:36:27)
      python-gnupg: 0.4.6
            PyYAML: 5.3.1
             PyZMQ: 19.0.0
             smmap: Not Installed
           timelib: Not Installed
           Tornado: 4.5.3
               ZMQ: 4.3.3

    System Versions:
              dist: centos 8 Core
            locale: UTF-8
           machine: x86_64
           release: 4.18.0-193.19.1.el8_2.x86_64
            system: Linux
           version: CentOS Linux 8 Core

sagetherage commented 3 years ago

#58691 and PR https://github.com/saltstack/salt/pull/58803 look like they fix this; closing for now. Mention me, @ggiesen, if not.

ggiesen commented 3 years ago

@sagetherage Looks like I lied, this is still an issue. Tried firing up a VM today and had the same problem.

sagetherage commented 3 years ago

@dhiltonp we would love your take on this as well as we look at the VMware modules in Salt

sagetherage commented 3 years ago

@ggiesen a few questions to help troubleshoot, here:

ggiesen commented 3 years ago

@ggiesen a few questions to help troubleshoot, here:

* what does the /etc/salt/master.d/reactor.conf file look like? can you sanitize and share?

* was the salt master restarted after this file was modified?

* can you create a VM outside of a reactor with only a salt-cloud command such as `salt-cloud -a show_instance dc1newhost01`?

/etc/salt/master.d/reactor.conf:

reactor:
  - 'salt/netapi/hook/netbox/virtualization/virtual-machines':
    - 'salt://reactor/vmware_create.sls'
  - 'salt/cloud/*/creating':
    - 'salt://reactor/cloud_creating.sls'
  - 'salt/cloud/*/requesting':
    - 'salt://reactor/cloud_requesting.sls'
  - 'salt/cloud/*/querying':
    - 'salt://reactor/cloud_querying.sls'
  - 'salt/cloud/*/waiting_for_ssh':
    - 'salt://reactor/cloud_waiting_for_ssh.sls'
  - 'salt/cloud/*/deploying':
    - 'salt://reactor/cloud_deploying.sls'
  - 'salt/cloud/*/created':
    - 'salt://reactor/cloud_created.sls'
  - 'salt/cloud/*/destroying':
    - 'salt://reactor/cloud_destroying.sls'
  - 'salt/cloud/*/destroyed':
    - 'salt://reactor/cloud_destroyed.sls'
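
As an aside, a quick way to confirm those salt/cloud/* tags are firing is to watch the master event bus directly; a minimal sketch using salt.utils.event, run on the master:

    # Sketch: print every salt/cloud/* event published on the master
    # event bus, to verify the reactor mappings above are triggered.
    import salt.config
    import salt.utils.event

    opts = salt.config.client_config("/etc/salt/master")
    bus = salt.utils.event.get_event("master", opts=opts, listen=True)
    for evt in bus.iter_events(tag="salt/cloud/", match_type="startswith", full=True):
        print(evt["tag"], evt["data"])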

Yes, absolutely; the master has been restarted many times since this file was modified. I'll try creating a VM outside of a reactor and see if I get the same result.

ggiesen commented 3 years ago

Just to be clear, the VM is created correctly by the salt/netapi/hook/netbox/virtualization/virtual-machines reactor (triggered by a webhook from Netbox). It's only when it executes the code in salt://reactor/cloud_created.sls to gather things like the assigned MAC addresses (so that I can populate them back into Netbox) that it circles around my loop for 20-30 minutes, complaining it can't locate the VM.
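
Since the end goal is to pull fields like the MAC addresses out of the show_instance return, a small helper that walks the nested return generically (it is keyed by master/provider, driver, and VM name in the output above) avoids depending on the exact nesting; a sketch:

    def find_mac_addresses(tree):
        """Recursively search a show_instance return for 'mac_addresses'."""
        if isinstance(tree, dict):
            if "mac_addresses" in tree:
                return tree["mac_addresses"]
            for value in tree.values():
                found = find_mac_addresses(value)
                if found is not None:
                    return found
        return None

    # With the output above, find_mac_addresses(vmware_data) returns
    # ['00:50:56:a1:71:d2', '00:50:56:a1:0f:35'].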

ggiesen commented 3 years ago

If I run 'salt-cloud -a show_instance' on a VM that has been running for a while, it does indeed work. I'll try it on one that is newly created and report back.

ggiesen commented 3 years ago

I can also confirm that on a newly-created VM, 'salt-cloud -a show_instance' works. I created the VM and ran it both before and after the salt/cloud/*/created reactor fired (while the reactor was circling through its loop a couple of times and failing with the same error as above). This time it took less than 5 minutes for the reactor to find the VM, but it definitely had to iterate through the loop at least twice.

sagetherage commented 3 years ago

It does sound like there is a problem, and we see that it can take longer than expected at times. We are not sure if it is something we can fix, but we will try. We will review this again in triage tomorrow and attempt to get a development instance of vCenter to test it. We are still learning how to gain access to an instance, so I apologize for the delay, and I have not forgotten! :)

wegenerbenjamin commented 3 years ago

We have the same problem. The following change (line 476 in salt/utils/vmware.py) works as a workaround for us:

- except vim.fault.NotAuthenticated:
+ except (vim.fault.NotAuthenticated, ssl.SSLError, BrokenPipeError):

vkotarov commented 1 year ago

Added ConnectionResetError to the list as well, to address very rare cases where the cloud query still failed. In my case I am executing the cloud client from a runner, and I get a BrokenPipeError on almost every attempt.
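
Both workarounds boil down to treating transport-level errors as retryable rather than fatal. As a generic illustration of the idea (a hypothetical helper, not part of Salt's API):

    import functools
    import ssl
    import time

    def retry_on_transport_errors(attempts=3, delay=5.0):
        """Retry a callable when the vCenter connection drops mid-request."""
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                for attempt in range(1, attempts + 1):
                    try:
                        return func(*args, **kwargs)
                    except (BrokenPipeError, ConnectionResetError, ssl.SSLError):
                        if attempt == attempts:
                            raise  # out of retries: surface the error
                        time.sleep(delay)
            return wrapper
        return decorator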

a-wildman commented 10 months ago

Confirmed this is still an issue in 3006.3.