saltstack / salt


virt.init fails to seed under certain conditions #55348

Open · tash opened this issue 4 years ago

tash commented 4 years ago

Description of Issue

When creating a VM with virt.init, seeding fails on this system:

salt-call -l debug virt.init gitlab 2 4096 start=False image=/central/vm/test/openSUSE-Leap-15.1-JeOS.x86_64-15.1.0-kvm-and-xen-Snapshot9.114.qcow2

Logging:

[DEBUG   ] Reading configuration from /etc/salt/minion
[DEBUG   ] Including configuration from '/etc/salt/minion.d/_schedule.conf'
[DEBUG   ] Reading configuration from /etc/salt/minion.d/_schedule.conf
[DEBUG   ] Including configuration from '/etc/salt/minion.d/testdefault.conf'
[DEBUG   ] Reading configuration from /etc/salt/minion.d/testdefault.conf
[DEBUG   ] Using cached minion ID from /etc/salt/minion_id: test02
[DEBUG   ] Configuration file path: /etc/salt/minion
[WARNING ] Insecure logging configuration detected! Sensitive data may be logged.
[DEBUG   ] Grains refresh requested. Refreshing grains.
[DEBUG   ] Reading configuration from /etc/salt/minion
[DEBUG   ] Including configuration from '/etc/salt/minion.d/_schedule.conf'
[DEBUG   ] Reading configuration from /etc/salt/minion.d/_schedule.conf
[DEBUG   ] Including configuration from '/etc/salt/minion.d/testdefault.conf'
[DEBUG   ] Reading configuration from /etc/salt/minion.d/testdefault.conf
[DEBUG   ] Connecting to master. Attempt 1 of 1
[DEBUG   ] "salt-master" Not an IP address? Assuming it is a hostname.
[DEBUG   ] Master URI: tcp://10.10.81.124:4506
[DEBUG   ] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'test02', u'tcp://10.10.81.124:4506')
[DEBUG   ] Generated random reconnect delay between '1000ms' and '11000ms' (7065)
[DEBUG   ] Setting zmq_reconnect_ivl to '7065ms'
[DEBUG   ] Setting zmq_reconnect_ivl_max to '11000ms'
[DEBUG   ] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'test02', u'tcp://10.10.81.124:4506', 'clear')
[DEBUG   ] Connecting the Minion to the Master URI (for the return server): tcp://10.10.81.124:4506
[DEBUG   ] Trying to connect to: tcp://10.10.81.124:4506
[DEBUG   ] salt.crypt.get_rsa_pub_key: Loading public key
[DEBUG   ] Decrypting the current master AES key
[DEBUG   ] salt.crypt.get_rsa_key: Loading private key
[DEBUG   ] salt.crypt._get_key_with_evict: Loading private key
[DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
[DEBUG   ] salt.crypt.get_rsa_pub_key: Loading public key
[DEBUG   ] Connecting the Minion to the Master publish port, using the URI: tcp://10.10.81.124:4505
[DEBUG   ] salt.crypt.get_rsa_key: Loading private key
[DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
[DEBUG   ] Determining pillar cache
[DEBUG   ] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'test02', u'tcp://10.10.81.124:4506', u'aes')
[DEBUG   ] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'test02', u'tcp://10.10.81.124:4506')
[DEBUG   ] Connecting the Minion to the Master URI (for the return server): tcp://10.10.81.124:4506
[DEBUG   ] Trying to connect to: tcp://10.10.81.124:4506
[DEBUG   ] salt.crypt.get_rsa_key: Loading private key
[DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
[DEBUG   ] Determining pillar cache
[DEBUG   ] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'test02', u'tcp://10.10.81.124:4506', u'aes')
[DEBUG   ] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'test02', u'tcp://10.10.81.124:4506')
[DEBUG   ] Connecting the Minion to the Master URI (for the return server): tcp://10.10.81.124:4506
[DEBUG   ] Trying to connect to: tcp://10.10.81.124:4506
[DEBUG   ] salt.crypt.get_rsa_key: Loading private key
[DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
[DEBUG   ] LazyLoaded jinja.render
[DEBUG   ] LazyLoaded yaml.render
[DEBUG   ] LazyLoaded virt.init
[DEBUG   ] LazyLoaded config.get
[DEBUG   ] Using hyperisor kvm
[DEBUG   ] NIC profile is [{u'mac': u'52:54:00:9F:BB:6D', u'source': u'virbr0', u'model': u'virtio', u'type': u'bridge', u'name': u'eth0'}]
[DEBUG   ] /central/vm/test/openSUSE-Leap-15.1-JeOS.x86_64-15.1.0-kvm-and-xen-Snapshot9.114.qcow2 image from module arguments will be used for disk "system" instead of None
[DEBUG   ] Creating disk for VM [ gitlab ]: {u'system': {u'model': u'virtio', u'format': u'qcow2', u'image': u'/central/vm/test/openSUSE-Leap-15.1-JeOS.x86_64-15.1.0-kvm-and-xen-Snapshot9.114.qcow2', u'pool': u'/nas-01/images', u'size': u'8192'}}
[DEBUG   ] Image directory from config option `virt.images` is /nas-01/images
[DEBUG   ] Image destination will be /nas-01/images/gitlab/system.qcow2
[DEBUG   ] Image destination directory is /nas-01/images/gitlab
[DEBUG   ] Create disk from specified image /central/vm/test/openSUSE-Leap-15.1-JeOS.x86_64-15.1.0-kvm-and-xen-Snapshot9.114.qcow2
[DEBUG   ] LazyLoaded cp.cache_file
[DEBUG   ] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'test02', u'tcp://10.10.81.124:4506', u'aes')
[DEBUG   ] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'test02', u'tcp://10.10.81.124:4506')
[DEBUG   ] Connecting the Minion to the Master URI (for the return server): tcp://10.10.81.124:4506
[DEBUG   ] Trying to connect to: tcp://10.10.81.124:4506
[DEBUG   ] LazyLoaded cmd.run
[INFO    ] Executing command 'qemu-img info /central/vm/test/openSUSE-Leap-15.1-JeOS.x86_64-15.1.0-kvm-and-xen-Snapshot9.114.qcow2' in directory '/root'
[DEBUG   ] stdout: image: /central/vm/test/openSUSE-Leap-15.1-JeOS.x86_64-15.1.0-kvm-and-xen-Snapshot9.114.qcow2
file format: qcow2
virtual size: 24G (25769803776 bytes)
disk size: 231M
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
[DEBUG   ] output: image: /central/vm/test/openSUSE-Leap-15.1-JeOS.x86_64-15.1.0-kvm-and-xen-Snapshot9.114.qcow2
file format: qcow2
virtual size: 24G (25769803776 bytes)
disk size: 231M
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
[DEBUG   ] Copying /central/vm/test/openSUSE-Leap-15.1-JeOS.x86_64-15.1.0-kvm-and-xen-Snapshot9.114.qcow2 to /nas-01/images/gitlab/system.qcow2
[DEBUG   ] Resize qcow2 image to 8192M
[INFO    ] Executing command 'qemu-img resize /nas-01/images/gitlab/system.qcow2 8192M' in directory '/root'
[ERROR   ] Command '[u'qemu-img', u'resize', u'/nas-01/images/gitlab/system.qcow2', u'8192M']' failed with return code: 1
[ERROR   ] stdout: qemu-img: qcow2 doesn't support shrinking images yet
[ERROR   ] retcode: 1
[ERROR   ] Command 'qemu-img resize /nas-01/images/gitlab/system.qcow2 8192M' failed with return code: 1
[ERROR   ] output: qemu-img: qcow2 doesn't support shrinking images yet
[DEBUG   ] Apply umask and remove exec bit
[DEBUG   ] Seed command is seed.apply
[DEBUG   ] LazyLoaded seed.apply
[DEBUG   ] LazyLoaded file.stats
[DEBUG   ] Mounting file at /nas-01/images/gitlab/system.qcow2
[DEBUG   ] LazyLoaded guestfs.mount
[DEBUG   ] LazyLoaded mount.mount
[DEBUG   ]  ---> location: /nas-01/images/gitlab/system.qcow2
[DEBUG   ]  ---> root: None
[DEBUG   ] Using root /tmp/guest/nas-01.images.gitlab.system.qcow2
[DEBUG   ] Establishing new root as /tmp/guest/nas-01.images.gitlab.system.qcow22c7c8055ce08653ebb4e59f77263bc2e309f8273a02bcb7d383c809a4f609b40
[DEBUG   ]  ---> cmd: guestmount -i -a /nas-01/images/gitlab/system.qcow2 --rw /tmp/guest/nas-01.images.gitlab.system.qcow22c7c8055ce08653ebb4e59f77263bc2e309f8273a02bcb7d383c809a4f609b40
[INFO    ] Executing command 'guestmount -i -a /nas-01/images/gitlab/system.qcow2 --rw /tmp/guest/nas-01.images.gitlab.system.qcow22c7c8055ce08653ebb4e59f77263bc2e309f8273a02bcb7d383c809a4f609b40' in directory '/root'
[DEBUG   ] output:
[DEBUG   ] Attempting to create directory /tmp/guest/nas-01.images.gitlab.system.qcow22c7c8055ce08653ebb4e59f77263bc2e309f8273a02bcb7d383c809a4f609b40/tmp
[DEBUG   ] LazyLoaded pillar.ext
[DEBUG   ] Determining pillar cache
[DEBUG   ] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'test02', u'tcp://10.10.81.124:4506', u'aes')
[DEBUG   ] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'test02', u'tcp://10.10.81.124:4506')
[DEBUG   ] Connecting the Minion to the Master URI (for the return server): tcp://10.10.81.124:4506
[DEBUG   ] Trying to connect to: tcp://10.10.81.124:4506
[DEBUG   ] salt.crypt.get_rsa_key: Loading private key
[DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
[INFO    ] salt-minion pre-installed on image, configuring as gitlab
[DEBUG   ] Reading configuration from /tmp/guest/nas-01.images.gitlab.system.qcow22c7c8055ce08653ebb4e59f77263bc2e309f8273a02bcb7d383c809a4f609b40/tmp/minion
[ERROR   ] An un-handled exception was caught by salt's global exception handler:
OSError: [Errno 18] Invalid cross-device link
Traceback (most recent call last):
  File "/usr/bin/salt-call", line 11, in <module>
    salt_call()
  File "/usr/lib/python2.7/site-packages/salt/scripts.py", line 410, in salt_call
    client.run()
  File "/usr/lib/python2.7/site-packages/salt/cli/call.py", line 57, in run
    caller.run()
  File "/usr/lib/python2.7/site-packages/salt/cli/caller.py", line 134, in run
    ret = self.call()
  File "/usr/lib/python2.7/site-packages/salt/cli/caller.py", line 212, in call
    ret['return'] = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/salt/modules/virt.py", line 773, in init
    priv_key=priv_key,
  File "/usr/lib/python2.7/site-packages/salt/modules/seed.py", line 169, in apply_
    mpt, pki_dir.lstrip('/'), 'minion.pem'))
Traceback (most recent call last):
  File "/usr/bin/salt-call", line 11, in <module>
    salt_call()
  File "/usr/lib/python2.7/site-packages/salt/scripts.py", line 410, in salt_call
    client.run()
  File "/usr/lib/python2.7/site-packages/salt/cli/call.py", line 57, in run
    caller.run()
  File "/usr/lib/python2.7/site-packages/salt/cli/caller.py", line 134, in run
    ret = self.call()
  File "/usr/lib/python2.7/site-packages/salt/cli/caller.py", line 212, in call
    ret['return'] = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/salt/modules/virt.py", line 773, in init
    priv_key=priv_key,
  File "/usr/lib/python2.7/site-packages/salt/modules/seed.py", line 169, in apply_
    mpt, pki_dir.lstrip('/'), 'minion.pem'))
OSError: [Errno 18] Invalid cross-device link

Setup

libvirt-bash-completion-4.5.0-10.el7_6.2.x86_64 libvirt-client-4.5.0-10.el7_6.2.x86_64 libvirt-daemon-driver-nwfilter-4.5.0-10.el7_6.2.x86_64 libvirt-daemon-driver-lxc-4.5.0-10.el7_6.2.x86_64 libvirt-daemon-driver-storage-disk-4.5.0-10.el7_6.2.x86_64 libvirt-daemon-driver-storage-4.5.0-10.el7_6.2.x86_64 libvirt-4.5.0-10.el7_6.2.x86_64 libvirt-daemon-kvm-4.5.0-10.el7_6.2.x86_64 libvirt-libs-4.5.0-10.el7_6.2.x86_64 libvirt-daemon-4.5.0-10.el7_6.2.x86_64 libvirt-daemon-driver-network-4.5.0-10.el7_6.2.x86_64 libvirt-daemon-config-nwfilter-4.5.0-10.el7_6.2.x86_64 libvirt-daemon-driver-qemu-4.5.0-10.el7_6.2.x86_64 libvirt-daemon-driver-storage-gluster-4.5.0-10.el7_6.2.x86_64 libvirt-daemon-driver-storage-scsi-4.5.0-10.el7_6.2.x86_64 libvirt-daemon-driver-storage-logical-4.5.0-10.el7_6.2.x86_64 libvirt-daemon-driver-storage-iscsi-4.5.0-10.el7_6.2.x86_64 libvirt-daemon-driver-secret-4.5.0-10.el7_6.2.x86_64 libvirt-daemon-driver-nodedev-4.5.0-10.el7_6.2.x86_64 libvirt-glib-1.0.0-1.el7.x86_64 libvirt-python-4.5.0-1.el7.x86_64 libvirt-daemon-driver-storage-core-4.5.0-10.el7_6.2.x86_64 libvirt-daemon-config-network-4.5.0-10.el7_6.2.x86_64 libvirt-daemon-driver-storage-mpath-4.5.0-10.el7_6.2.x86_64 libvirt-daemon-driver-storage-rbd-4.5.0-10.el7_6.2.x86_64 libvirt-daemon-driver-interface-4.5.0-10.el7_6.2.x86_64

qemu-kvm-common-ev-2.10.0-21.el7_5.7.1.x86_64 qemu-kvm-tools-ev-2.10.0-21.el7_5.7.1.x86_64 libvirt-daemon-kvm-4.5.0-10.el7_6.2.x86_64 libvirt-daemon-driver-qemu-4.5.0-10.el7_6.2.x86_64 qemu-kvm-ev-2.10.0-21.el7_5.7.1.x86_64 qemu-img-ev-2.10.0-21.el7_5.7.1.x86_64 ipxe-roms-qemu-20170123-1.git4e85b27.el7_4.1.noarch

libguestfs-1.38.2-12.el7.x86_64 libguestfs-tools-c-1.38.2-12.el7.x86_64 libguestfs-tools-1.38.2-12.el7.noarch perl-Sys-Guestfs-1.38.2-12.el7.x86_64 libguestfs-bash-completion-1.38.2-12.el7.noarch

Steps to Reproduce Issue

Call is made locally on the Hypervisor (test02): salt-call -l debug virt.init gitlab 2 4096 start=False image=/central/vm/test/openSUSE-Leap-15.1-JeOS.x86_64-15.1.0-kvm-and-xen-Snapshot9.114.qcow2

Versions Report

Salt Version:
           Salt: 2018.3.4

Dependency Versions:
           cffi: 1.6.0
       cherrypy: Not Installed
       dateutil: 1.5
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.7.2
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.6
   mysql-python: Not Installed
      pycparser: 2.14
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 2.7.5 (default, Oct 30 2018, 23:45:53)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 15.3.0
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.1.4

System Versions:
           dist: centos 7.5.1804 Core
         locale: UTF-8
        machine: x86_64
        release: 3.10.0-957.el7.x86_64
         system: Linux
        version: CentOS Linux 7.5.1804 Core
tash commented 4 years ago

modules/seed.py apply_ calls os.rename, which fails when the source and destination are on different filesystems: https://github.com/saltstack/salt/blob/d0cad3e5a35ba80f18413820227dea4f3d6234d3/salt/modules/seed.py#L168-L173 A fix should be fairly easy:

        # shutil.move copies the file and removes the source when a plain
        # rename is not possible, so it also works across filesystems
        # (add `import shutil` at the top of seed.py if it is not imported yet).
        shutil.move(cfg_files['privkey'], os.path.join(
            mpt, pki_dir.lstrip('/'), 'minion.pem'))
        shutil.move(cfg_files['pubkey'], os.path.join(
            mpt, pki_dir.lstrip('/'), 'minion.pub'))
        shutil.move(cfg_files['config'], os.path.join(mpt, 'etc/salt/minion'))
        res = True
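
For reference, a minimal standalone sketch of the underlying behaviour (the paths below are hypothetical and assumed to sit on different filesystems, e.g. tmpfs vs. an NFS mount, with the destination directory already present): os.rename raises OSError with errno 18 (EXDEV) across filesystems, while shutil.move falls back to copying the file and removing the source.

    import os
    import shutil

    # Hypothetical paths; assume /tmp and /mnt/nas are on different filesystems
    # and that the destination directory already exists.
    src = '/tmp/minion.pem'
    dst = '/mnt/nas/images/gitlab/tmp/minion.pem'

    with open(src, 'w') as fh:
        fh.write('dummy key material')

    try:
        # Fails with OSError: [Errno 18] Invalid cross-device link
        os.rename(src, dst)
    except OSError as err:
        print('os.rename failed: {0}'.format(err))
        # Works across filesystems: copies the data, then removes the source
        shutil.move(src, dst)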
waynew commented 4 years ago

@tash Thanks for the report!

Looks like a pretty reasonable fix, too. Are you interested in writing the tests/fix for this? If not, that's totally fine.

For whoever does address this:

There's some code already in tests/unit/modules/test_seed.py that could either be modified to cover this case or used as a starting point. In an ideal world we could write functional tests that actually exercise the move, but I'm pretty sure our existing CI pipeline doesn't have multiple mounts available to trigger the cross-device link error.

I think it would be reasonable to just mock os.rename and have it raise OSError (just to make sure it's not called), for example as sketched below.
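
A rough sketch of that mocking idea (this is not the real fixture setup from tests/unit/modules/test_seed.py, which already mocks out the mounting and file caching; the helper name and the call_apply callable are made up for illustration):

    from unittest import mock  # on Python 2 this is the external 'mock' package

    def run_apply_with_cross_device_mocks(call_apply):
        # call_apply is assumed to invoke seed.apply_ with the same mocked
        # environment that the existing tests in test_seed.py already set up.
        with mock.patch('os.rename',
                        side_effect=OSError(18, 'Invalid cross-device link')), \
                mock.patch('shutil.move') as fake_move:
            call_apply()
        # If the fix is in place, shutil.move did the work and the patched
        # os.rename was never reached (otherwise the OSError would propagate).
        assert fake_move.called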