saltstack / salt

Software to automate the management and configuration of any infrastructure or application at scale. Install Salt from the Salt package repositories here:
https://docs.saltproject.io/salt/install-guide/en/latest/
Apache License 2.0
14.2k stars 5.48k forks source link

[BUG] 3003.1 Duplicate Key in vsphere6 module when creating VM with many disks #60564

Open severson-globus opened 3 years ago

severson-globus commented 3 years ago

Description When creating a virtual machine using the vsphere6 module, the following error occurs during cloud.create:

A specified parameter was not correct: deviceChange[14].device.key (Cannot add multiple devices using the same device key..)

On subsequent deploys, the index changes to various different devices, or it may intermittently work and deploy a working VM.

Setup Example cloud.create runner:

salt.runner:
    - name: cloud.create
    - provider: vsphere6
    - instances: maindb3
    - host: esxi-ha-07
    - clonefrom: 2105-rhel7-fips
    - datastore: nvme0
    - num_cpus: 8
    - memory: 128GB
    - folder: /Datacenter/vm/SomeFolder
    - cluster: SomeCluster
    - ssh_username: root
    - devices:  
        scsi: 
          SCSI controller 1:
            type: paravirtual 
          SCSI controller 2:
            type: paravirtual 
          SCSI controller 3:
            type: paravirtual        
        disk:        
          Hard disk 05:
            size: 500
            thin_provision: True
            datastore: nvme0
            controller: SCSI controller 1        
          Hard disk 06:
            size: 500
            thin_provision: True
            datastore: nvme1
            controller: SCSI controller 1        
          Hard disk 07:
            size: 500
            thin_provision: True
            datastore: nvme0
            controller: SCSI controller 1        
          Hard disk 08:
            size: 500
            thin_provision: True
            datastore: nvme1
            controller: SCSI controller 1        
          Hard disk 09:
            size: 500
            thin_provision: True
            datastore: nvme0
            controller: SCSI controller 1        
          Hard disk 10:
            size: 500
            thin_provision: True
            datastore: nvme1
            controller: SCSI controller 1        
          Hard disk 11:
            size: 500
            thin_provision: True
            datastore: nvme0
            controller: SCSI controller 1        
          Hard disk 12:
            size: 500
            thin_provision: True
            datastore: nvme1
            controller: SCSI controller 1        
          Hard disk 13:
            size: 250
            thin_provision: True
            datastore: nvme0
            controller: SCSI controller 2        
          Hard disk 14:
            size: 250
            thin_provision: True
            datastore: nvme1
            controller: SCSI controller 2        
          Hard disk 15:
            size: 250
            thin_provision: True
            datastore: nvme0
            controller: SCSI controller 2        
          Hard disk 16:
            size: 250
            thin_provision: True
            datastore: nvme1
            controller: SCSI controller 2        
          Hard disk 17:
            size: 1000
            thin_provision: True
            datastore: nvme0
            controller: SCSI controller 3        
          Hard disk 18:
            size: 1000
            thin_provision: True
            datastore: nvme1
            controller: SCSI controller 3        
          Hard disk 19:
            size: 1000
            thin_provision: True
            datastore: nvme0
            controller: SCSI controller 3        
          Hard disk 20:
            size: 1000
            thin_provision: True
            datastore: nvme1
            controller: SCSI controller 3
        network:  
          Network adapter 1:
            name: Network
            adapter_type: vmxnet3
            switch_type: standard
            ip: 10.0.0.100
            gateway: 10.0.0.254
            subnet_mask: 255.255.255.0
            connected: True
            start_connected: True   
    - domain: contoso.local
    - extra_config:
        disk.EnableUUID: 'TRUE'
    - dns_servers:
      - 8.8.8.8
      - 8.8.4.4
    - minion:
        id: maindb3
        pillarenv_from_saltenv: True
        use_superseded:
          - module.run

    - script: bootstrap-salt
    - script_args: -x python3 stable 3003.1
    - deploy: True
    - customization: True
    - hardware_version: 15
    - image: rhel7_64Guest
    - template: False
    - power_on: True
    - annotation: Created by Salt-Cloud
    - display_ssh_output: false

Steps to Reproduce the behavior Run the above salt.runner, or create your own with many disks.

Expected behavior Virtual machine is created with all disks specified.

Versions Report

salt --versions-report (Provided by running salt --versions-report. Please also mention any differences in master/minion versions.) ``` Salt Version: Salt: 3003.1 Dependency Versions: cffi: 1.12.3 cherrypy: Not Installed dateutil: Not Installed docker-py: Not Installed gitdb: 4.0.7 gitpython: 3.1.18 Jinja2: 2.11.2 libgit2: 1.1.0 M2Crypto: 0.35.2 Mako: Not Installed msgpack: 0.6.2 msgpack-pure: Not Installed mysql-python: Not Installed pycparser: 2.19 pycrypto: Not Installed pycryptodome: Not Installed pygit2: 1.6.1 Python: 3.6.8 (default, Aug 7 2019, 17:28:10) python-gnupg: Not Installed PyYAML: 5.3.1 PyZMQ: 17.0.0 smmap: 4.0.0 timelib: Not Installed Tornado: 4.5.3 ZMQ: 4.1.4 System Versions: dist: centos 7 Core locale: UTF-8 machine: x86_64 release: 3.10.0-1062.9.1.el7.x86_64 system: Linux version: CentOS Linux 7 Core ```

Additional context This appears to be a bug in the salt/cloud/clouds/vmware.py module. In it, instead of using code to create unique device keys for each disk, it uses the following:

def _add_new_hard_disk_helper(
    disk_label,
    size_gb,
    unit_number,
    controller_key=1000,
    thin_provision=False,
    eagerly_scrub=False,
    datastore=None,
    vm_name=None,
):
    random_key = randint(-2099, -2000)

By setting random_key in small int range, the liklihood of colliding IDs is increased. The code should be rewritten to keep track of ID's generated, and ensure that no two devices are assigned the same ID. This issue could impact ANY other device being added to a virtual machine, but is most likely with disks.

As a work around, I've inceased the random_key int range to :

    random_key = randint(-2999, -2000)

This significantly decreases the liklihood of a collision, but it's still possible. The real fix would be to rewrite the ID generation code to use a 'base ID' for each device type, then increment by 1 for each device added in that type.

welcome[bot] commented 3 years ago

Hi there! Welcome to the Salt Community! Thank you for making your first contribution. We have a lengthy process for issues and PRs. Someone from the Core Team will follow up as soon as possible. In the meantime, here’s some information that may help as you continue your Salt journey. Please be sure to review our Code of Conduct. Also, check out some of our community resources including:

There are lots of ways to get involved in our community. Every month, there are around a dozen opportunities to meet with other contributors and the Salt Core team and collaborate in real time. The best way to keep track is by subscribing to the Salt Community Events Calendar. If you have additional questions, email us at saltproject@vmware.com. We’re glad you’ve joined our community and look forward to doing awesome things with you!

sagetherage commented 3 years ago

@waynew is this part if the salt-extension, or no?

ssbn commented 3 years ago

Just bumping the status on this.

waynew commented 3 years ago

Eventually this functionality will be superseded by the extension module, will open an issue there to track it as well.