saltstack / salt

Salt-cloud API completely broken and nonfunctional sometime between April and now/after migrating to python 3 #53567

Open fake-name opened 5 years ago

fake-name commented 5 years ago

Description of Issue

I'm using the salt.cloud.CloudClient() interface to manage salt minions.

Currently, every invocation of CloudClient().<function> from my code fails.

Traceback (most recent call last):
  File "salt_scheduler.py", line 326, in <module>
    refresh()
  File "salt_scheduler.py", line 259, in refresh
    sched.destroy_vm(vm_name)
  File "salt_scheduler.py", line 140, in destroy_vm
    self.interface.destroy_client(vm_name)
  File "/home/client/AutoTriever/marshaller/salt_runner.py", line 659, in destroy_client
    ret = self.cc.destroy([clientname])
  File "/usr/local/lib/python3.6/dist-packages/salt/cloud/__init__.py", line 390, in destroy
    mapper.destroy(names)
  File "/usr/local/lib/python3.6/dist-packages/salt/cloud/__init__.py", line 1023, in destroy
    ret = self.clouds[fun](name)
  File "/usr/local/lib/python3.6/dist-packages/salt/cloud/clouds/linode.py", line 742, in destroy
    transport=__opts__['transport']
  File "/usr/local/lib/python3.6/dist-packages/salt/utils/cloud.py", line 2016, in fire_event
    time.sleep(0.025)
  File "/usr/local/lib/python3.6/dist-packages/salt/utils/event.py", line 905, in __exit__
    self.destroy()
  File "/usr/local/lib/python3.6/dist-packages/salt/utils/event.py", line 790, in destroy
    self.io_loop.close()
  File "/usr/lib/python3/dist-packages/tornado/ioloop.py", line 716, in close
    self.remove_handler(self._waker.fileno())
  File "/usr/lib/python3/dist-packages/tornado/platform/posix.py", line 48, in fileno
    return self.reader.fileno()
ValueError: I/O operation on closed file

Traceback (most recent call last):
  File "salt_runner.py", line 930, in <module>
    go()
  File "salt_runner.py", line 922, in go
    fmap[command]()
  File "salt_runner.py", line 753, in dtest
    ret = herder.cc.create(names=[clientname], provider=provider, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/salt/cloud/__init__.py", line 429, in create
    mapper.create(vm_))
  File "/usr/local/lib/python3.6/dist-packages/salt/cloud/__init__.py", line 1261, in create
    output = self.clouds[func](vm_)
  File "/usr/local/lib/python3.6/dist-packages/salt/cloud/clouds/digitalocean.py", line 293, in create
    transport=__opts__['transport']
  File "/usr/local/lib/python3.6/dist-packages/salt/utils/cloud.py", line 2016, in fire_event
    time.sleep(0.025)
  File "/usr/local/lib/python3.6/dist-packages/salt/utils/event.py", line 905, in __exit__
    self.destroy()
  File "/usr/local/lib/python3.6/dist-packages/salt/utils/event.py", line 790, in destroy
    self.io_loop.close()
  File "/usr/lib/python3/dist-packages/tornado/ioloop.py", line 716, in close
    self.remove_handler(self._waker.fileno())
  File "/usr/lib/python3/dist-packages/tornado/platform/posix.py", line 48, in fileno
    return self.reader.fileno()
ValueError: I/O operation on closed file

At a quick glance, it looks like something added a dependency on a tornado ioloop without considering how this would affect consumers of the various APIs.

I last merged upstream on April 23, though I'm currently running salt from the official apt repo.

Note that this may also be a python2/python3 issue, as I both updated salt and switched from python 2 to python 3.

Setup

https://github.com/fake-name/AutoTriever/tree/master/marshaller

Steps to Reproduce Issue

Execute salt_runner.py ltest from https://github.com/fake-name/AutoTriever/tree/master/marshaller

Note: this will create a Linode VM.

You can also run salt_runner.py dtest (DigitalOcean), salt_runner.py vtest (Vultr), and a set of others. See the output of salt_runner.py without args.

Versions Report

Salt Version:
           Salt: 2016.11.0-255-g4de7a0d

Dependency Versions:
           cffi: Not Installed
       cherrypy: unknown
       dateutil: 2.6.1
      docker-py: Not Installed
          gitdb: 2.0.3
      gitpython: 2.1.8
         Jinja2: 2.10
        libgit2: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.5.6
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 3.6.8 (default, Jan 14 2019, 11:02:34)
   python-gnupg: 0.4.1
         PyYAML: 3.12
          PyZMQ: 16.0.2
          smmap: 2.0.3
        timelib: Not Installed
        Tornado: 4.5.3
            ZMQ: 4.2.5

System Versions:
           dist: Ubuntu 18.04 bionic
         locale: UTF-8
        machine: x86_64
        release: 3.14.32-xxxx-grs-ipv6-64
         system: Linux
        version: Ubuntu 18.04 bionic

fake-name commented 5 years ago

Going into /usr/local/lib/python3.6/dist-packages/salt/utils/event.py and commenting out the self.io_loop.close() call in the SaltEvent.destroy() function appears to at least allow me to create VMs again.

I have no idea what this will wind up breaking, but from what I can tell, tornado.ioloop.IOLoop() returns a singleton, so we don't want to close the ioloop anyway: I'm running a persistent application that periodically runs setup/teardown tasks via apscheduler.

Further reading makes me think passing keep_loop=True to salt.utils.event.get_event() in fire_event() in salt/utils/cloud.py is probably the correct solution.

In any event, it makes things work, at least.
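
To make the failure mode described above concrete, here is a minimal, self-contained sketch. It does not use salt or tornado: FakeLoop, FakeEvent, and fire_twice are hypothetical stand-ins invented for illustration, and the only name borrowed from the comment above is the keep_loop flag. It assumes a simplified model in which one loop object is shared by every event object, and closing that loop a second time raises the same ValueError seen in the tracebacks.

```python
class FakeLoop:
    """Hypothetical stand-in for a shared event loop (not tornado's IOLoop)."""

    def __init__(self):
        self.closed = False

    def close(self):
        # Assumption: a second close fails like the traceback in this issue.
        if self.closed:
            raise ValueError("I/O operation on closed file")
        self.closed = True


class FakeEvent:
    """Hypothetical stand-in for an event object used as a context manager."""

    def __init__(self, io_loop, keep_loop=False):
        self.io_loop = io_loop
        self.keep_loop = keep_loop

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.destroy()
        return False

    def destroy(self):
        # Only tear the loop down when the caller did not ask to keep it.
        if not self.keep_loop:
            self.io_loop.close()


def fire_twice(keep_loop):
    """Fire two events against one shared loop and report what happened."""
    shared_loop = FakeLoop()
    try:
        for _ in range(2):
            with FakeEvent(shared_loop, keep_loop=keep_loop):
                pass  # a real caller would fire its event here
    except ValueError as err:
        return "failed: {}".format(err)
    return "ok"


if __name__ == "__main__":
    print(fire_twice(keep_loop=False))
    print(fire_twice(keep_loop=True))
```

In this model the first event closes the shared loop on exit, so the second event fails when it tries to close it again; with keep_loop=True neither event touches the loop. Whether the real SaltEvent behaves exactly this way is not confirmed here, so treat it as an illustration of the hypothesis rather than a reproduction.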

Akm0d commented 5 years ago

Thanks for reporting the issue. Team, we need to find out if this happens in later versions of salt

fake-name commented 5 years ago

I'm basically running develop, so it currently happens there.

fake-name commented 5 years ago

Note that I've been running with https://github.com/fake-name/salt/commit/60736621e9fcf7ee3a150238d5de19b1fab5be16 for the last week (since the issue was opened, basically), and it has apparently been OK (though I deliberately don't use a lot of salt's features).

I both updated to the current develop, AND switched from python 2 to python 3 (which was kind of dumb to do together, in retrospect), so I'm not sure if this is a problem triggered by the py3k change or the recent develop.

Akm0d commented 5 years ago

Would you mind submitting a Pull Request for https://github.com/fake-name/salt/commit/60736621e9fcf7ee3a150238d5de19b1fab5be16?

fake-name commented 5 years ago

Sure.

Mostly, I figured I should check if there was a reason the reactor was getting torn down the way it was before just blindly PRing changes.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

stale[bot] commented 4 years ago

Thank you for updating this issue. It is no longer marked as stale.