saltstack / salt

Software to automate the management and configuration of any infrastructure or application at scale. Install Salt from the Salt package repositories here:
https://docs.saltproject.io/salt/install-guide/en/latest/
Apache License 2.0

[BUG] Batch mode creates batches in a weird way #58246

Open Oloremo opened 4 years ago

Oloremo commented 4 years ago

Description

Salt can run a command in batches. I noticed two things:

  1. The way minions are split into batches is very odd.
  2. Using batch mode sometimes leads to an exception.

Setup

I have a cluster with 33 minions. The odd number of minions could be important here.

This seems to be reproducible only with state.apply; running something like test.ping works just fine.

The exception is the same as I reported here: https://github.com/saltstack/salt/issues/56273

Steps to Reproduce the behavior

I run something like:

 salt -b 10 \* state.apply formula.java test=True

And I get this: https://gist.github.com/Oloremo/d1af8fef0410b2979584afa427fca555

Note that I removed all state output since it's irrelevant here.

Expected behavior

With 33 minions and batch 10 I'd expect to have 4 batches: 10, 10, 10, 3
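To make the expectation concrete, here is a minimal sketch of the fixed-chunk split described above. It is not Salt's code: `fixed_batches` and the `minion0`..`minion32` ids are hypothetical names used only for illustration.

```python
def fixed_batches(minions, batch_size):
    """Split a list of minion ids into consecutive, fixed-size batches.

    Hypothetical helper for illustration; this is not how Salt is implemented.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Slice the list in steps of batch_size; the last slice may be shorter.
    return [minions[i:i + batch_size] for i in range(0, len(minions), batch_size)]


# 33 made-up minion ids, batch size 10
minions = [f"minion{n}" for n in range(33)]
sizes = [len(batch) for batch in fixed_batches(minions, 10)]
print(sizes)  # [10, 10, 10, 3]
```

Under this model each batch would have to finish completely before the next one starts.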

Versions Report

salt --versions-report

(Provided by running salt --versions-report. Please also mention any differences in master/minion versions.)

```
Salt Version:
  Salt: 3000.3

Dependency Versions:
  cffi: Not Installed
  cherrypy: Not Installed
  dateutil: Not Installed
  docker-py: Not Installed
  gitdb: Not Installed
  gitpython: Not Installed
  Jinja2: 2.11.1
  libgit2: Not Installed
  M2Crypto: Not Installed
  Mako: Not Installed
  msgpack-pure: Not Installed
  msgpack-python: 0.6.2
  mysql-python: Not Installed
  pycparser: Not Installed
  pycrypto: 2.6.1
  pycryptodome: Not Installed
  pygit2: Not Installed
  Python: 3.6.8 (default, Aug 7 2019, 17:28:10)
  python-gnupg: 0.4.5
  PyYAML: 3.11
  PyZMQ: 18.0.2
  smmap: Not Installed
  timelib: Not Installed
  Tornado: 4.5.3
  ZMQ: 4.3.1

System Versions:
  dist: centos 7.6.1810 Core
  locale: UTF-8
  machine: x86_64
  release: 3.10.0-957.1.3.el7.x86_64
  system: Linux
  version: CentOS Linux 7.6.1810 Core
```


Oloremo commented 4 years ago

Hm, maybe it works correctly and is just badly named. It should be called something like "max-concurrent".

My guess is that it works as "no more than N at once": with batch 10, if 1 minion finishes and 9 are still running, Salt goes ahead and launches one more minion, since 9 < 10.
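A rough simulation of that guess, for comparison with the fixed 10/10/10/3 split. This is only a sketch of the "no more than N at once" idea, not Salt's actual batch code: `simulate_max_concurrent`, the minion ids, and the run times are all made up for illustration.

```python
import heapq


def simulate_max_concurrent(durations, limit):
    """Simulate launching jobs with at most `limit` running at once.

    durations: dict mapping a hypothetical minion id to its run time.
    Returns a list of (start_time, minion_id) tuples in launch order.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    pending = list(durations.items())
    running = []   # min-heap of (finish_time, minion_id)
    launches = []
    now = 0
    i = 0
    while i < len(pending) or running:
        # Fill every free slot with the next pending minion.
        while i < len(pending) and len(running) < limit:
            minion, run_time = pending[i]
            heapq.heappush(running, (now + run_time, minion))
            launches.append((now, minion))
            i += 1
        # Jump to the moment the earliest running minion finishes.
        now, _ = heapq.heappop(running)
    return launches


# 33 made-up minions whose states take 1, 2, or 3 time units to apply
durations = {f"minion{n}": (n % 3) + 1 for n in range(33)}
launches = simulate_max_concurrent(durations, 10)
for start, minion in launches[:12]:
    print(start, minion)
```

In this model the first 10 minions start together, and after that each new minion starts as soon as any single slot frees up, so the output looks like many small "batches" rather than four clean ones.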

dwoz commented 4 years ago

@Oloremo I agree the name is not great but I doubt it could reasonably be changed now because of how long the feature has been around.

We should at least update the documentation to make this clearer.