saltstack / salt

Software to automate the management and configuration of any infrastructure or application at scale. Get access to the Salt software package repository here:
https://repo.saltproject.io/
Apache License 2.0
14.08k stars 5.47k forks source link

random_startup_delay affects all jobs, not just initial minion service startup #44609

Open cruscio opened 6 years ago

cruscio commented 6 years ago

Description of Issue/Question

Windows minion applies random_startup_delay configuration value to jobs executed by the master.

The maximum bound for an interval in which a minion will randomly sleep upon starting up prior to attempting to connect to a master. This can be used to splay connection attempts for cases where many minions starting up at once may place undue load on a master.

Setup

Minion config: log_level: debug log_level_logfile: debug random_startup_delay: 120

Steps to Reproduce Issue

Run salt '*' test.ping multiple times on the master

Actual Outcome:

Expected Outcome:

2017-11-20 13:16:40,861 [salt.minion      ][INFO    ][12236] User sudo_opsadmin Executing command test.ping with jid 20171120131640805325
2017-11-20 13:16:40,861 [salt.minion      ][DEBUG   ][12236] Command details {'tgt_type': 'glob', 'jid': '20171120131640805325', 'tgt': 'xMinion_IDx', 'ret': '', 'user': 'sudo_opsadmin', 'arg': [], 'fun': 'test.ping'}
2017-11-20 13:16:42,392 [salt.log.setup   ][DEBUG   ][9220] Multiprocessing queue logging configured for the process running under PID: 9220
2017-11-20 13:16:42,392 [salt.config      ][DEBUG   ][9220] Reading configuration from c:\salt\conf\minion
2017-11-20 13:16:42,456 [salt.config      ][DEBUG   ][9220] Including configuration from 'c:\salt\conf\minion.d\_schedule.conf'
2017-11-20 13:16:42,456 [salt.config      ][DEBUG   ][9220] Reading configuration from c:\salt\conf\minion.d\_schedule.conf
2017-11-20 13:16:42,456 [salt.config      ][DEBUG   ][9220] Including configuration from 'c:\salt\conf\minion.d\keepalive.conf'
2017-11-20 13:16:42,456 [salt.config      ][DEBUG   ][9220] Reading configuration from c:\salt\conf\minion.d\keepalive.conf
2017-11-20 13:16:42,456 [salt.config      ][DEBUG   ][9220] Including configuration from 'c:\salt\conf\minion.d\loglevel.conf'
2017-11-20 13:16:42,456 [salt.config      ][DEBUG   ][9220] Reading configuration from c:\salt\conf\minion.d\loglevel.conf
2017-11-20 13:16:42,456 [salt.config      ][DEBUG   ][9220] Including configuration from 'c:\salt\conf\minion.d\startupdelay.conf'
2017-11-20 13:16:42,456 [salt.config      ][DEBUG   ][9220] Reading configuration from c:\salt\conf\minion.d\startupdelay.conf
2017-11-20 13:16:42,456 [salt.config      ][DEBUG   ][9220] Including configuration from 'c:\salt\conf\minion.d\transport.conf'
2017-11-20 13:16:42,456 [salt.config      ][DEBUG   ][9220] Reading configuration from c:\salt\conf\minion.d\transport.conf
2017-11-20 13:16:44,517 [salt.minion      ][INFO    ][9220] Creating minion process manager
2017-11-20 13:16:44,517 [salt.minion      ][INFO    ][9220] Minion sleeping for 87 seconds due to configured startup_delay between 0 and 120 seconds
2017-11-20 13:16:45,878 [salt.minion      ][INFO    ][12236] User sudo_opsadmin Executing command saltutil.find_job with jid 20171120131645829791
2017-11-20 13:16:45,878 [salt.minion      ][DEBUG   ][12236] Command details {'tgt_type': 'list', 'jid': '20171120131645829791', 'tgt': ['xMinion_IDx'], 'ret': '', 'user': 'sudo_opsadmin', 'arg': ['20171120131640805325'], 'fun': 'saltutil.find_job'}
2017-11-20 13:16:47,253 [salt.log.setup   ][DEBUG   ][13936] Multiprocessing queue logging configured for the process running under PID: 13936
2017-11-20 13:16:47,253 [salt.config      ][DEBUG   ][13936] Reading configuration from c:\salt\conf\minion
2017-11-20 13:16:47,299 [salt.config      ][DEBUG   ][13936] Including configuration from 'c:\salt\conf\minion.d\_schedule.conf'
2017-11-20 13:16:47,299 [salt.config      ][DEBUG   ][13936] Reading configuration from c:\salt\conf\minion.d\_schedule.conf
2017-11-20 13:16:47,299 [salt.config      ][DEBUG   ][13936] Including configuration from 'c:\salt\conf\minion.d\keepalive.conf'
2017-11-20 13:16:47,299 [salt.config      ][DEBUG   ][13936] Reading configuration from c:\salt\conf\minion.d\keepalive.conf
2017-11-20 13:16:47,299 [salt.config      ][DEBUG   ][13936] Including configuration from 'c:\salt\conf\minion.d\loglevel.conf'
2017-11-20 13:16:47,299 [salt.config      ][DEBUG   ][13936] Reading configuration from c:\salt\conf\minion.d\loglevel.conf
2017-11-20 13:16:47,299 [salt.config      ][DEBUG   ][13936] Including configuration from 'c:\salt\conf\minion.d\startupdelay.conf'
2017-11-20 13:16:47,299 [salt.config      ][DEBUG   ][13936] Reading configuration from c:\salt\conf\minion.d\startupdelay.conf
2017-11-20 13:16:47,315 [salt.config      ][DEBUG   ][13936] Including configuration from 'c:\salt\conf\minion.d\transport.conf'
2017-11-20 13:16:47,315 [salt.config      ][DEBUG   ][13936] Reading configuration from c:\salt\conf\minion.d\transport.conf
2017-11-20 13:16:48,065 [salt.minion      ][INFO    ][13936] Creating minion process manager
2017-11-20 13:16:48,065 [salt.minion      ][INFO    ][13936] Minion sleeping for 71 seconds due to configured startup_delay between 0 and 120 seconds
2017-11-20 13:17:59,082 [salt.minion      ][INFO    ][13936] Starting a new job with PID 13936
2017-11-20 13:17:59,115 [salt.utils.lazy  ][DEBUG   ][13936] LazyLoaded saltutil.find_job
2017-11-20 13:17:59,115 [salt.utils.lazy  ][DEBUG   ][13936] LazyLoaded direct_call.get
2017-11-20 13:17:59,115 [salt.minion      ][DEBUG   ][13936] Minion return retry timer set to 7 seconds (randomized)
2017-11-20 13:17:59,115 [salt.minion      ][INFO    ][13936] Returning information for job: 20171120131645829791

Versions Report

Master

Salt Version:
           Salt: 2017.7.2

Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: 2.4.2
      docker-py: Not Installed
          gitdb: 0.6.4
      gitpython: 1.0.1
          ioflo: Not Installed
         Jinja2: 2.8
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: 1.0.3
   msgpack-pure: Not Installed
 msgpack-python: 0.4.6
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 2.7.12 (default, Nov 19 2016, 06:48:10)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 15.2.0
           RAET: Not Installed
          smmap: 0.9.0
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.1.4

System Versions:
           dist: Ubuntu 16.04 xenial
         locale: UTF-8
        machine: x86_64
        release: 4.4.0-98-generic
         system: Linux
        version: Ubuntu 16.04 xenial

Minion

    Salt Version:
               Salt: 2017.7.2

    Dependency Versions:
               cffi: 1.10.0
           cherrypy: 10.2.1
           dateutil: 2.6.0
          docker-py: Not Installed
              gitdb: 2.0.3
          gitpython: 2.1.3
              ioflo: Not Installed
             Jinja2: 2.9.6
            libgit2: Not Installed
            libnacl: Not Installed
           M2Crypto: Not Installed
               Mako: 1.0.6
       msgpack-pure: Not Installed
     msgpack-python: 0.4.8
       mysql-python: Not Installed
          pycparser: 2.17
           pycrypto: 2.6.1
       pycryptodome: Not Installed
             pygit2: Not Installed
             Python: 2.7.13 (v2.7.13:a06454b1afa1, Dec 17 2016, 20:53:40) [MSC v.1500 64 bit (AMD64)]
       python-gnupg: 0.4.0
             PyYAML: 3.11
              PyZMQ: 16.0.2
               RAET: Not Installed
              smmap: 2.0.3
            timelib: 0.2.4
            Tornado: 4.5.1
                ZMQ: 4.1.6

    System Versions:
               dist:
             locale: cp1252
            machine: AMD64
            release: 2012ServerR2
             system: Windows
            version: 2012ServerR2 6.3.9600  Multiprocessor Free
gtmanfred commented 6 years ago

Are you seeing this with linux minions as well or only windows minions?

cruscio commented 6 years ago

Hey @gtmanfred - Just tested it on a linux minion and did not see the same behavior.

Minion

    Salt Version:
               Salt: 2017.7.1

    Dependency Versions:
               cffi: Not Installed
           cherrypy: Not Installed
           dateutil: 2.4.2
          docker-py: Not Installed
              gitdb: Not Installed
          gitpython: Not Installed
              ioflo: Not Installed
             Jinja2: 2.8
            libgit2: Not Installed
            libnacl: Not Installed
           M2Crypto: Not Installed
               Mako: 1.0.3
       msgpack-pure: Not Installed
     msgpack-python: 0.4.6
       mysql-python: Not Installed
          pycparser: Not Installed
           pycrypto: 2.6.1
       pycryptodome: Not Installed
             pygit2: Not Installed
             Python: 2.7.12 (default, Nov 19 2016, 06:48:10)
       python-gnupg: Not Installed
             PyYAML: 3.11
              PyZMQ: 15.2.0
               RAET: Not Installed
              smmap: Not Installed
            timelib: Not Installed
            Tornado: 4.2.1
                ZMQ: 4.1.4

    System Versions:
               dist: Ubuntu 16.04 xenial
             locale: UTF-8
            machine: x86_64
            release: 4.4.0-62-generic
             system: Linux
            version: Ubuntu 16.04 xenial
gtmanfred commented 6 years ago

Yes, the problem with windows is there is no equivalent to os.fork which linux uses to create the new process, which gives the exact same namespace from the point where it was forked.

For windows, since there is no memory forking equivalent, so we have to reinitialize the process each time a new windows minion process is create, and I think that wherever this is loaded, it is causing the random delay to happen.

@twangboy @UtahDave can one of yall take a look at this, and see if we could pass a variable to init on the Minion class to let it know that this is a "forked" process, and shouldn't do the sleep?

Thanks, Daniel

twangboy commented 6 years ago

@gtmanfred This may be a big part of the slow down we're seeing on Windows minions.

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

stjeanp commented 5 years ago

This is still occurring on Windows hosts with the 2019.2 minion. It's impacting our workflows. Can it get a little love?

doesitblend commented 5 years ago

@twangboy Does this need some further work to get your PR merged in?

stale[bot] commented 5 years ago

Thank you for updating this issue. It is no longer marked as stale.

stale[bot] commented 5 years ago

Thank you for updating this issue. It is no longer marked as stale.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

stale[bot] commented 4 years ago

Thank you for updating this issue. It is no longer marked as stale.

amalaguti commented 1 year ago

Hi, this is still an issue (at least on Windows), 3004.2.

2022-10-13 19:07:39,285 [salt.minion :1296][INFO ][4676] Minion sleeping for 19 seconds due to configured startup_delay between 0 and 20 seconds 2022-10-13 19:07:44,371 [salt.minion :1296][INFO ][4704] Minion sleeping for 14 seconds due to configured startup_delay between 0 and 20 seconds

This one should get fixed, causing problems with minion responses