Closed ssoto2 closed 4 years ago
3 minions
real 2m19.597s
user 0m1.920s
sys 0m0.142s
What about on the minion side? What do the debug logs show there?
Can you verify outside of salt that there wasn't any network slowdown possibly in your environment?
Also what does the CPU/Memory Usage look like on the master when this occurs?
Regarding the the CPU/MEM running salt now pings the processors and memory will go from ~600 megs to around 1.74GB when running this statement.. See the picture included (This is while running the state listed above)
Additional Information: This server is only running the salt master it is not being used for anything else.
@Ch3LL the network performance has never been a problem. The vast majority of the instances are in AWS in the same region and VPC. All the instances in the VPC have high to 10GB network capability on them. We have been using Salt in amazon for some time and no issues till about 1.5 month ago.. here is a iperf for documentation
MINION
iperf -c 10.0.8.18 -p 55555 -t 30 -P 8
------------------------------------------------------------
Client connecting to 10.0.8.18, TCP port 55555
TCP window size: 325 KByte (default)
------------------------------------------------------------
[ 10] local 10.0.2.83 port 57256 connected with 10.0.8.18 port 55555
[ 4] local 10.0.2.83 port 57242 connected with 10.0.8.18 port 55555
[ 5] local 10.0.2.83 port 57246 connected with 10.0.8.18 port 55555
[ 3] local 10.0.2.83 port 57244 connected with 10.0.8.18 port 55555
[ 7] local 10.0.2.83 port 57250 connected with 10.0.8.18 port 55555
[ 6] local 10.0.2.83 port 57248 connected with 10.0.8.18 port 55555
[ 8] local 10.0.2.83 port 57252 connected with 10.0.8.18 port 55555
[ 9] local 10.0.2.83 port 57254 connected with 10.0.8.18 port 55555
[ ID] Interval Transfer Bandwidth
[ 10] 0.0-30.0 sec 167 MBytes 46.6 Mbits/sec
[ 8] 0.0-30.0 sec 202 MBytes 56.3 Mbits/sec
[ 4] 0.0-30.0 sec 187 MBytes 52.2 Mbits/sec
[ 3] 0.0-30.0 sec 274 MBytes 76.6 Mbits/sec
[ 7] 0.0-30.0 sec 212 MBytes 59.2 Mbits/sec
[ 9] 0.0-30.0 sec 308 MBytes 86.0 Mbits/sec
[ 6] 0.0-30.0 sec 196 MBytes 54.6 Mbits/sec
[ 5] 0.0-30.0 sec 203 MBytes 56.6 Mbits/sec
[SUM] 0.0-30.0 sec 1.71 GBytes 488 Mbits/sec
MASTER
iperf -s -p 55555
------------------------------------------------------------
Server listening on TCP port 55555
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local 10.0.8.18 port 55555 connected with 10.0.2.83 port 57244
[ 5] local 10.0.8.18 port 55555 connected with 10.0.2.83 port 57246
[ 7] local 10.0.8.18 port 55555 connected with 10.0.2.83 port 57248
[ 6] local 10.0.8.18 port 55555 connected with 10.0.2.83 port 57242
[ 8] local 10.0.8.18 port 55555 connected with 10.0.2.83 port 57250
[ 9] local 10.0.8.18 port 55555 connected with 10.0.2.83 port 57252
[ 10] local 10.0.8.18 port 55555 connected with 10.0.2.83 port 57254
[ 11] local 10.0.8.18 port 55555 connected with 10.0.2.83 port 57256
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-30.1 sec 274 MBytes 76.4 Mbits/sec
[ 9] 0.0-30.1 sec 202 MBytes 56.2 Mbits/sec
[ 5] 0.0-30.1 sec 203 MBytes 56.5 Mbits/sec
[ 6] 0.0-30.1 sec 187 MBytes 52.0 Mbits/sec
[ 11] 0.0-30.1 sec 167 MBytes 46.4 Mbits/sec
[ 8] 0.0-30.1 sec 212 MBytes 59.0 Mbits/sec
[ 10] 0.0-30.1 sec 308 MBytes 85.7 Mbits/sec
[ 7] 0.0-30.2 sec 196 MBytes 54.4 Mbits/sec
[SUM] 0.0-30.2 sec 1.71 GBytes 486 Mbits/sec
In Regards to your other question about the minion debugging. Here is the Debug logs from the minion. Here is the log from the minion listed in the above in debug mode:
2017-10-05 08:51:40,503 [salt.minion ][INFO ][2945] User sudo_ssoto Executing command state.sls with jid 20171005085143518712
2017-10-05 08:51:40,503 [salt.minion ][DEBUG ][2945] Command details {'tgt_type': 'glob', 'jid': '20171005085143518712', 'tgt': 'AMZ.ENG_Docker_02.i-598f95c9', 'ret': '', 'user': 'sudo_ssoto', 'arg': ['ts.deploy.telesto.CVE-2017-1000253', {'test': True, '__kwarg__': True}], 'fun': 'state.sls'}
2017-10-05 08:51:40,511 [salt.minion ][INFO ][3092] Starting a new job with PID 3092
2017-10-05 08:51:40,517 [salt.utils.lazy ][DEBUG ][3092] LazyLoaded state.sls
2017-10-05 08:51:40,519 [salt.utils.lazy ][DEBUG ][3092] LazyLoaded direct_call.get
2017-10-05 08:51:40,528 [salt.utils.lazy ][DEBUG ][3092] LazyLoaded saltutil.is_running
2017-10-05 08:51:40,529 [salt.utils.lazy ][DEBUG ][3092] LazyLoaded grains.get
2017-10-05 08:51:40,535 [salt.transport.zeromq][DEBUG ][3092] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506', 'aes')
2017-10-05 08:51:40,535 [salt.crypt ][DEBUG ][3092] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506')
2017-10-05 08:51:45,520 [salt.minion ][INFO ][2945] User sudo_ssoto Executing command saltutil.find_job with jid 20171005085148532424
2017-10-05 08:51:45,521 [salt.minion ][DEBUG ][2945] Command details {'tgt_type': 'list', 'jid': '20171005085148532424', 'tgt': ['AMZ.ENG_Docker_02.i-598f95c9'], 'ret': '', 'user': 'sudo_ssoto', 'arg': ['20171005085143518712'], 'fun': 'saltutil.find_job'}
2017-10-05 08:51:45,528 [salt.minion ][INFO ][3098] Starting a new job with PID 3098
2017-10-05 08:51:45,541 [salt.utils.lazy ][DEBUG ][3098] LazyLoaded saltutil.find_job
2017-10-05 08:51:45,543 [salt.utils.lazy ][DEBUG ][3098] LazyLoaded direct_call.get
2017-10-05 08:51:45,544 [salt.minion ][DEBUG ][3098] Minion return retry timer set to 5 seconds (randomized)
2017-10-05 08:51:45,544 [salt.minion ][INFO ][3098] Returning information for job: 20171005085148532424
2017-10-05 08:51:45,545 [salt.transport.zeromq][DEBUG ][3098] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506', 'aes')
2017-10-05 08:51:45,545 [salt.crypt ][DEBUG ][3098] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506')
2017-10-05 08:51:45,554 [salt.minion ][DEBUG ][3098] minion return: {'fun_args': ['20171005085143518712'], 'jid': '20171005085148532424', 'return': {'tgt_type': 'glob', 'jid': '20171005085143518712', 'tgt': 'AMZ.ENG_Docker_02.i-598f95c9', 'pid': 3092, 'ret': '', 'user': 'sudo_ssoto', 'arg': ['ts.deploy.telesto.CVE-2017-1000253', {'test': True, '__kwarg__': True}], 'fun': 'state.sls'}, 'retcode': 0, 'success': True, 'fun': 'saltutil.find_job'}
2017-10-05 08:51:55,615 [salt.minion ][INFO ][2945] User sudo_ssoto Executing command saltutil.find_job with jid 20171005085158620608
2017-10-05 08:51:55,615 [salt.minion ][DEBUG ][2945] Command details {'tgt_type': 'list', 'jid': '20171005085158620608', 'tgt': ['AMZ.ENG_Docker_02.i-598f95c9'], 'ret': '', 'user': 'sudo_ssoto', 'arg': ['20171005085143518712'], 'fun': 'saltutil.find_job'}
2017-10-05 08:51:55,623 [salt.minion ][INFO ][3103] Starting a new job with PID 3103
2017-10-05 08:51:55,637 [salt.utils.lazy ][DEBUG ][3103] LazyLoaded saltutil.find_job
2017-10-05 08:51:55,638 [salt.utils.lazy ][DEBUG ][3103] LazyLoaded direct_call.get
2017-10-05 08:51:55,639 [salt.minion ][DEBUG ][3103] Minion return retry timer set to 9 seconds (randomized)
2017-10-05 08:51:55,639 [salt.minion ][INFO ][3103] Returning information for job: 20171005085158620608
2017-10-05 08:51:55,640 [salt.transport.zeromq][DEBUG ][3103] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506', 'aes')
2017-10-05 08:51:55,640 [salt.crypt ][DEBUG ][3103] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506')
2017-10-05 08:51:55,652 [salt.minion ][DEBUG ][3103] minion return: {'fun_args': ['20171005085143518712'], 'jid': '20171005085158620608', 'return': {'tgt_type': 'glob', 'jid': '20171005085143518712', 'tgt': 'AMZ.ENG_Docker_02.i-598f95c9', 'pid': 3092, 'ret': '', 'user': 'sudo_ssoto', 'arg': ['ts.deploy.telesto.CVE-2017-1000253', {'test': True, '__kwarg__': True}], 'fun': 'state.sls'}, 'retcode': 0, 'success': True, 'fun': 'saltutil.find_job'}
2017-10-05 08:52:05,678 [salt.minion ][INFO ][2945] User sudo_ssoto Executing command saltutil.find_job with jid 20171005085208682450
2017-10-05 08:52:05,679 [salt.minion ][DEBUG ][2945] Command details {'tgt_type': 'list', 'jid': '20171005085208682450', 'tgt': ['AMZ.ENG_Docker_02.i-598f95c9'], 'ret': '', 'user': 'sudo_ssoto', 'arg': ['20171005085143518712'], 'fun': 'saltutil.find_job'}
2017-10-05 08:52:05,687 [salt.minion ][INFO ][3111] Starting a new job with PID 3111
2017-10-05 08:52:05,700 [salt.utils.lazy ][DEBUG ][3111] LazyLoaded saltutil.find_job
2017-10-05 08:52:05,701 [salt.utils.lazy ][DEBUG ][3111] LazyLoaded direct_call.get
2017-10-05 08:52:05,702 [salt.minion ][DEBUG ][3111] Minion return retry timer set to 9 seconds (randomized)
2017-10-05 08:52:05,702 [salt.minion ][INFO ][3111] Returning information for job: 20171005085208682450
2017-10-05 08:52:05,703 [salt.transport.zeromq][DEBUG ][3111] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506', 'aes')
2017-10-05 08:52:05,703 [salt.crypt ][DEBUG ][3111] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506')
2017-10-05 08:52:05,718 [salt.minion ][DEBUG ][3111] minion return: {'fun_args': ['20171005085143518712'], 'jid': '20171005085208682450', 'return': {'tgt_type': 'glob', 'jid': '20171005085143518712', 'tgt': 'AMZ.ENG_Docker_02.i-598f95c9', 'pid': 3092, 'ret': '', 'user': 'sudo_ssoto', 'arg': ['ts.deploy.telesto.CVE-2017-1000253', {'test': True, '__kwarg__': True}], 'fun': 'state.sls'}, 'retcode': 0, 'success': True, 'fun': 'saltutil.find_job'}
2017-10-05 08:52:15,794 [salt.minion ][INFO ][2945] User sudo_ssoto Executing command saltutil.find_job with jid 20171005085218806107
2017-10-05 08:52:15,795 [salt.minion ][DEBUG ][2945] Command details {'tgt_type': 'list', 'jid': '20171005085218806107', 'tgt': ['AMZ.ENG_Docker_02.i-598f95c9'], 'ret': '', 'user': 'sudo_ssoto', 'arg': ['20171005085143518712'], 'fun': 'saltutil.find_job'}
2017-10-05 08:52:15,802 [salt.minion ][INFO ][3116] Starting a new job with PID 3116
2017-10-05 08:52:15,816 [salt.utils.lazy ][DEBUG ][3116] LazyLoaded saltutil.find_job
2017-10-05 08:52:15,817 [salt.utils.lazy ][DEBUG ][3116] LazyLoaded direct_call.get
2017-10-05 08:52:15,818 [salt.minion ][DEBUG ][3116] Minion return retry timer set to 6 seconds (randomized)
2017-10-05 08:52:15,818 [salt.minion ][INFO ][3116] Returning information for job: 20171005085218806107
2017-10-05 08:52:15,819 [salt.transport.zeromq][DEBUG ][3116] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506', 'aes')
2017-10-05 08:52:15,819 [salt.crypt ][DEBUG ][3116] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506')
2017-10-05 08:52:15,828 [salt.minion ][DEBUG ][3116] minion return: {'fun_args': ['20171005085143518712'], 'jid': '20171005085218806107', 'return': {'tgt_type': 'glob', 'jid': '20171005085143518712', 'tgt': 'AMZ.ENG_Docker_02.i-598f95c9', 'pid': 3092, 'ret': '', 'user': 'sudo_ssoto', 'arg': ['ts.deploy.telesto.CVE-2017-1000253', {'test': True, '__kwarg__': True}], 'fun': 'state.sls'}, 'retcode': 0, 'success': True, 'fun': 'saltutil.find_job'}
2017-10-05 08:52:25,904 [salt.minion ][INFO ][2945] User sudo_ssoto Executing command saltutil.find_job with jid 20171005085228916540
2017-10-05 08:52:25,904 [salt.minion ][DEBUG ][2945] Command details {'tgt_type': 'list', 'jid': '20171005085228916540', 'tgt': ['AMZ.ENG_Docker_02.i-598f95c9'], 'ret': '', 'user': 'sudo_ssoto', 'arg': ['20171005085143518712'], 'fun': 'saltutil.find_job'}
2017-10-05 08:52:25,912 [salt.minion ][INFO ][3121] Starting a new job with PID 3121
2017-10-05 08:52:25,925 [salt.utils.lazy ][DEBUG ][3121] LazyLoaded saltutil.find_job
2017-10-05 08:52:25,927 [salt.utils.lazy ][DEBUG ][3121] LazyLoaded direct_call.get
2017-10-05 08:52:25,928 [salt.minion ][DEBUG ][3121] Minion return retry timer set to 5 seconds (randomized)
2017-10-05 08:52:25,928 [salt.minion ][INFO ][3121] Returning information for job: 20171005085228916540
2017-10-05 08:52:25,928 [salt.transport.zeromq][DEBUG ][3121] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506', 'aes')
2017-10-05 08:52:25,929 [salt.crypt ][DEBUG ][3121] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506')
2017-10-05 08:52:25,938 [salt.minion ][DEBUG ][3121] minion return: {'fun_args': ['20171005085143518712'], 'jid': '20171005085228916540', 'return': {'tgt_type': 'glob', 'jid': '20171005085143518712', 'tgt': 'AMZ.ENG_Docker_02.i-598f95c9', 'pid': 3092, 'ret': '', 'user': 'sudo_ssoto', 'arg': ['ts.deploy.telesto.CVE-2017-1000253', {'test': True, '__kwarg__': True}], 'fun': 'state.sls'}, 'retcode': 0, 'success': True, 'fun': 'saltutil.find_job'}
2017-10-05 08:52:35,968 [salt.minion ][INFO ][2945] User sudo_ssoto Executing command saltutil.find_job with jid 20171005085238980257
2017-10-05 08:52:35,968 [salt.minion ][DEBUG ][2945] Command details {'tgt_type': 'list', 'jid': '20171005085238980257', 'tgt': ['AMZ.ENG_Docker_02.i-598f95c9'], 'ret': '', 'user': 'sudo_ssoto', 'arg': ['20171005085143518712'], 'fun': 'saltutil.find_job'}
2017-10-05 08:52:35,976 [salt.minion ][INFO ][3129] Starting a new job with PID 3129
2017-10-05 08:52:35,990 [salt.utils.lazy ][DEBUG ][3129] LazyLoaded saltutil.find_job
2017-10-05 08:52:35,991 [salt.utils.lazy ][DEBUG ][3129] LazyLoaded direct_call.get
2017-10-05 08:52:35,992 [salt.minion ][DEBUG ][3129] Minion return retry timer set to 9 seconds (randomized)
2017-10-05 08:52:35,993 [salt.minion ][INFO ][3129] Returning information for job: 20171005085238980257
2017-10-05 08:52:35,993 [salt.transport.zeromq][DEBUG ][3129] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506', 'aes')
2017-10-05 08:52:35,994 [salt.crypt ][DEBUG ][3129] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506')
2017-10-05 08:52:36,002 [salt.minion ][DEBUG ][3129] minion return: {'fun_args': ['20171005085143518712'], 'jid': '20171005085238980257', 'return': {'tgt_type': 'glob', 'jid': '20171005085143518712', 'tgt': 'AMZ.ENG_Docker_02.i-598f95c9', 'pid': 3092, 'ret': '', 'user': 'sudo_ssoto', 'arg': ['ts.deploy.telesto.CVE-2017-1000253', {'test': True, '__kwarg__': True}], 'fun': 'state.sls'}, 'retcode': 0, 'success': True, 'fun': 'saltutil.find_job'}
2017-10-05 08:52:46,090 [salt.minion ][INFO ][2945] User sudo_ssoto Executing command saltutil.find_job with jid 20171005085249103205
2017-10-05 08:52:46,091 [salt.minion ][DEBUG ][2945] Command details {'tgt_type': 'list', 'jid': '20171005085249103205', 'tgt': ['AMZ.ENG_Docker_02.i-598f95c9'], 'ret': '', 'user': 'sudo_ssoto', 'arg': ['20171005085143518712'], 'fun': 'saltutil.find_job'}
2017-10-05 08:52:46,099 [salt.minion ][INFO ][3136] Starting a new job with PID 3136
2017-10-05 08:52:46,112 [salt.utils.lazy ][DEBUG ][3136] LazyLoaded saltutil.find_job
2017-10-05 08:52:46,113 [salt.utils.lazy ][DEBUG ][3136] LazyLoaded direct_call.get
2017-10-05 08:52:46,114 [salt.minion ][DEBUG ][3136] Minion return retry timer set to 7 seconds (randomized)
2017-10-05 08:52:46,114 [salt.minion ][INFO ][3136] Returning information for job: 20171005085249103205
2017-10-05 08:52:46,115 [salt.transport.zeromq][DEBUG ][3136] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506', 'aes')
2017-10-05 08:52:46,115 [salt.crypt ][DEBUG ][3136] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506')
2017-10-05 08:52:46,124 [salt.minion ][DEBUG ][3136] minion return: {'fun_args': ['20171005085143518712'], 'jid': '20171005085249103205', 'return': {'tgt_type': 'glob', 'jid': '20171005085143518712', 'tgt': 'AMZ.ENG_Docker_02.i-598f95c9', 'pid': 3092, 'ret': '', 'user': 'sudo_ssoto', 'arg': ['ts.deploy.telesto.CVE-2017-1000253', {'test': True, '__kwarg__': True}], 'fun': 'state.sls'}, 'retcode': 0, 'success': True, 'fun': 'saltutil.find_job'}
2017-10-05 08:52:56,312 [salt.minion ][INFO ][2945] User sudo_ssoto Executing command saltutil.find_job with jid 20171005085259324891
2017-10-05 08:52:56,312 [salt.minion ][DEBUG ][2945] Command details {'tgt_type': 'list', 'jid': '20171005085259324891', 'tgt': ['AMZ.ENG_Docker_02.i-598f95c9'], 'ret': '', 'user': 'sudo_ssoto', 'arg': ['20171005085143518712'], 'fun': 'saltutil.find_job'}
2017-10-05 08:52:56,320 [salt.minion ][INFO ][3141] Starting a new job with PID 3141
2017-10-05 08:52:56,333 [salt.utils.lazy ][DEBUG ][3141] LazyLoaded saltutil.find_job
2017-10-05 08:52:56,335 [salt.utils.lazy ][DEBUG ][3141] LazyLoaded direct_call.get
2017-10-05 08:52:56,335 [salt.minion ][DEBUG ][3141] Minion return retry timer set to 9 seconds (randomized)
2017-10-05 08:52:56,336 [salt.minion ][INFO ][3141] Returning information for job: 20171005085259324891
2017-10-05 08:52:56,336 [salt.transport.zeromq][DEBUG ][3141] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506', 'aes')
2017-10-05 08:52:56,337 [salt.crypt ][DEBUG ][3141] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506')
2017-10-05 08:52:56,346 [salt.minion ][DEBUG ][3141] minion return: {'fun_args': ['20171005085143518712'], 'jid': '20171005085259324891', 'return': {'tgt_type': 'glob', 'jid': '20171005085143518712', 'tgt': 'AMZ.ENG_Docker_02.i-598f95c9', 'pid': 3092, 'ret': '', 'user': 'sudo_ssoto', 'arg': ['ts.deploy.telesto.CVE-2017-1000253', {'test': True, '__kwarg__': True}], 'fun': 'state.sls'}, 'retcode': 0, 'success': True, 'fun': 'saltutil.find_job'}
2017-10-05 08:52:59,976 [salt.pillar ][DEBUG ][3092] Determining pillar cache
2017-10-05 08:52:59,976 [salt.transport.zeromq][DEBUG ][3092] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506', 'aes')
2017-10-05 08:52:59,976 [salt.crypt ][DEBUG ][3092] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506')
2017-10-05 08:53:00,165 [salt.crypt ][DEBUG ][3092] Loaded minion key: /etc/salt/pki/minion/minion.pem
2017-10-05 08:53:00,169 [salt.state ][INFO ][3092] Loading fresh modules for state activity
2017-10-05 08:53:00,194 [salt.utils.lazy ][DEBUG ][3092] LazyLoaded jinja.render
2017-10-05 08:53:00,195 [salt.utils.lazy ][DEBUG ][3092] LazyLoaded yaml.render
2017-10-05 08:53:00,760 [salt.fileclient ][DEBUG ][3092] In saltenv 'base', looking at rel_path 'ts/deploy/telesto/CVE-2017-1000253.sls' to resolve 'salt://ts/deploy/telesto/CVE-2017-1000253.sls'
2017-10-05 08:53:00,761 [salt.fileclient ][DEBUG ][3092] In saltenv 'base', ** considering ** path '/var/cache/salt/minion/files/base/ts/deploy/telesto/CVE-2017-1000253.sls' to resolve 'salt://ts/deploy/telesto/CVE-2017-1000253.sls'
2017-10-05 08:53:00,761 [salt.template ][DEBUG ][3092] compile template: /var/cache/salt/minion/files/base/ts/deploy/telesto/CVE-2017-1000253.sls
2017-10-05 08:53:00,762 [salt.utils.jinja ][DEBUG ][3092] Jinja search path: ['/var/cache/salt/minion/files/base']
2017-10-05 08:53:00,765 [salt.template ][PROFILE ][3092] Time (in seconds) to render '/var/cache/salt/minion/files/base/ts/deploy/telesto/CVE-2017-1000253.sls' using 'jinja' renderer: 0.00362491607666
2017-10-05 08:53:00,765 [salt.template ][DEBUG ][3092] Rendered data from file: /var/cache/salt/minion/files/base/ts/deploy/telesto/CVE-2017-1000253.sls:
mitigate.cve.2017.1000253:
sysctl.present:
- name: vm.legacy_va_layout
- value: 1
2017-10-05 08:53:00,768 [salt.utils.lazy ][DEBUG ][3092] LazyLoaded config.get
2017-10-05 08:53:00,768 [salt.loaded.int.render.yaml][DEBUG ][3092] Results of YAML rendering:
OrderedDict([('mitigate.cve.2017.1000253', OrderedDict([('sysctl.present', [OrderedDict([('name', 'vm.legacy_va_layout')]), OrderedDict([('value', 1)])])]))])
2017-10-05 08:53:00,768 [salt.template ][PROFILE ][3092] Time (in seconds) to render '/var/cache/salt/minion/files/base/ts/deploy/telesto/CVE-2017-1000253.sls' using 'yaml' renderer: 0.00246500968933
2017-10-05 08:53:00,770 [salt.utils.lazy ][DEBUG ][3092] LazyLoaded sysctl.show
2017-10-05 08:53:00,770 [salt.utils.lazy ][DEBUG ][3092] LazyLoaded sysctl.present
2017-10-05 08:53:00,770 [salt.state ][INFO ][3092] Running state [vm.legacy_va_layout] at time 08:53:00.770680
2017-10-05 08:53:00,770 [salt.state ][INFO ][3092] Executing state sysctl.present for [vm.legacy_va_layout]
2017-10-05 08:53:00,778 [salt.utils.lazy ][DEBUG ][3092] LazyLoaded cmd.run_stdout
2017-10-05 08:53:00,779 [salt.loaded.int.module.cmdmod][INFO ][3092] Executing command 'sysctl -a' in directory '/root'
2017-10-05 08:53:00,856 [salt.state ][INFO ][3092] Sysctl option vm.legacy_va_layout might be changed, we failed to check config file at /etc/sysctl.d/99-salt.conf. The file is either unreadable, or missing.
2017-10-05 08:53:00,856 [salt.state ][INFO ][3092] Completed state [vm.legacy_va_layout] at time 08:53:00.856562 duration_in_ms=85.882
2017-10-05 08:53:00,856 [salt.state ][DEBUG ][3092] File /var/cache/salt/minion/accumulator/41143440 does not exist, no need to cleanup.
2017-10-05 08:53:00,858 [salt.minion ][DEBUG ][3092] Minion return retry timer set to 8 seconds (randomized)
2017-10-05 08:53:00,858 [salt.minion ][INFO ][3092] Returning information for job: 20171005085143518712
2017-10-05 08:53:00,859 [salt.transport.zeromq][DEBUG ][3092] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506', 'aes')
2017-10-05 08:53:00,859 [salt.crypt ][DEBUG ][3092] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'AMZ.ENG_Docker_02.i-598f95c9', 'tcp://10.0.8.18:4506')
2017-10-05 08:53:00,867 [salt.minion ][DEBUG ][3092] minion return: {'fun_args': ['ts.deploy.telesto.CVE-2017-1000253', {'test': True}], 'jid': '20171005085143518712', 'return': {'sysctl_|-mitigate.cve.2017.1000253_|-vm.legacy_va_layout_|-present': {'comment': 'Sysctl option vm.legacy_va_layout might be changed, we failed to check config file at /etc/sysctl.d/99-salt.conf. The file is either unreadable, or missing.', 'name': 'vm.legacy_va_layout', 'start_time': '08:53:00.770680', 'result': None, 'duration': 85.882, '__run_num__': 0, '__sls__': u'ts.deploy.telesto.CVE-2017-1000253', 'changes': {}, '__id__': 'mitigate.cve.2017.1000253'}}, 'retcode': 0, 'success': True, 'fun': 'state.sls'}
I am experiencing the same problem. This has something to do with state.sls|state.apply (salt.modules.state) or saltutil.find_job.
Specifically,
sudo salt 'minion' cmd.run 'echo 1'
returns immediately, while
sudo salt 'minion' state.apply some-state
takes a while, where some-state.sls is:
echo 1:
cmd.run:
- shell: /bin/bash
- runas: root
state.single is slow as well.
Master and minion stats are identical:
Salt Version:
Salt: 2017.7.2
Dependency Versions:
cffi: Not Installed
cherrypy: unknown
dateutil: Not Installed
docker-py: Not Installed
gitdb: Not Installed
gitpython: Not Installed
ioflo: Not Installed
Jinja2: 2.7.2
libgit2: Not Installed
libnacl: Not Installed
M2Crypto: Not Installed
Mako: Not Installed
msgpack-pure: Not Installed
msgpack-python: 0.4.8
mysql-python: 1.2.5
pycparser: Not Installed
pycrypto: 2.6.1
pycryptodome: Not Installed
pygit2: Not Installed
Python: 2.7.5 (default, Aug 4 2017, 00:39:18)
python-gnupg: Not Installed
PyYAML: 3.11
PyZMQ: 15.3.0
RAET: Not Installed
smmap: Not Installed
timelib: Not Installed
Tornado: 4.2.1
ZMQ: 4.1.4
System Versions:
dist: centos 7.4.1708 Core
locale: UTF-8
machine: x86_64
release: 3.10.0-693.5.2.el7.x86_64
system: Linux
version: CentOS Linux 7.4.1708 Core
Here's the complete log for the above state.apply
[INFO ] User sudo_rcrrad Executing command state.apply with jid 20171115193457614955
[DEBUG ] Command details {'tgt_type': 'glob', 'jid': '20171115193457614955', 'tgt': 'minion', 'ret': '', 'user': 'sudo_rcrrad', 'arg': ['some-state', {'cache': True, '__kwarg__': True}], 'fun': 'state.apply'}
[INFO ] Starting a new job with PID 1909
[DEBUG ] LazyLoaded state.apply
[DEBUG ] LazyLoaded direct_call.get
[DEBUG ] LazyLoaded saltutil.is_running
[DEBUG ] LazyLoaded grains.get
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'minion', 'tcp://10.156.0.5:4506', 'aes')
[DEBUG ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'minion', 'tcp://10.156.0.5:4506')
[INFO ] User sudo_rcrrad Executing command saltutil.find_job with jid 20171115193502678204
[DEBUG ] Command details {'tgt_type': 'list', 'jid': '20171115193502678204', 'tgt': ['minion'], 'ret': '', 'user': 'sudo_rcrrad', 'arg': ['20171115193457614955'], 'fun': 'saltutil.find_job'}
[INFO ] Starting a new job with PID 1915
[DEBUG ] LazyLoaded saltutil.find_job
[DEBUG ] LazyLoaded direct_call.get
[DEBUG ] Minion return retry timer set to 5 seconds (randomized)
[INFO ] Returning information for job: 20171115193502678204
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'minion', 'tcp://10.156.0.5:4506', 'aes')
[DEBUG ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'minion', 'tcp://10.156.0.5:4506')
[DEBUG ] minion return: {'fun_args': ['20171115193457614955'], 'jid': '20171115193502678204', 'return': {'tgt_type': 'glob', 'jid': '20171115193457614955', 'tgt': 'minion', 'pid': 1909, 'ret': '', 'user': 'sudo_rcrrad', 'arg': ['some-state', {'cache': True, '__kwarg__': True}], 'fun': 'state.apply'}, 'retcode': 0, 'success': True, 'fun': 'saltutil.find_job'}
[INFO ] User sudo_rcrrad Executing command saltutil.find_job with jid 20171115193512883122
[DEBUG ] Command details {'tgt_type': 'list', 'jid': '20171115193512883122', 'tgt': ['minion'], 'ret': '', 'user': 'sudo_rcrrad', 'arg': ['20171115193457614955'], 'fun': 'saltutil.find_job'}
[INFO ] Starting a new job with PID 1921
[DEBUG ] LazyLoaded saltutil.find_job
[DEBUG ] LazyLoaded direct_call.get
[DEBUG ] Minion return retry timer set to 7 seconds (randomized)
[INFO ] Returning information for job: 20171115193512883122
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'minion', 'tcp://10.156.0.5:4506', 'aes')
[DEBUG ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'minion', 'tcp://10.156.0.5:4506')
[DEBUG ] minion return: {'fun_args': ['20171115193457614955'], 'jid': '20171115193512883122', 'return': {'tgt_type': 'glob', 'jid': '20171115193457614955', 'tgt': 'minion', 'pid': 1909, 'ret': '', 'user': 'sudo_rcrrad', 'arg': ['some-state', {'cache': True, '__kwarg__': True}], 'fun': 'state.apply'}, 'retcode': 0, 'success': True, 'fun': 'saltutil.find_job'}
[DEBUG ] Determining pillar cache
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'minion', 'tcp://10.156.0.5:4506', 'aes')
[DEBUG ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'minion', 'tcp://10.156.0.5:4506')
[INFO ] User sudo_rcrrad Executing command saltutil.find_job with jid 20171115193523095218
[DEBUG ] Command details {'tgt_type': 'list', 'jid': '20171115193523095218', 'tgt': ['minion'], 'ret': '', 'user': 'sudo_rcrrad', 'arg': ['20171115193457614955'], 'fun': 'saltutil.find_job'}
[INFO ] Starting a new job with PID 1929
[DEBUG ] LazyLoaded saltutil.find_job
[DEBUG ] LazyLoaded direct_call.get
[DEBUG ] Minion return retry timer set to 8 seconds (randomized)
[INFO ] Returning information for job: 20171115193523095218
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'minion', 'tcp://10.156.0.5:4506', 'aes')
[DEBUG ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'minion', 'tcp://10.156.0.5:4506')
[DEBUG ] minion return: {'fun_args': ['20171115193457614955'], 'jid': '20171115193523095218', 'return': {'tgt_type': 'glob', 'jid': '20171115193457614955', 'tgt': 'minion', 'pid': 1909, 'ret': '', 'user': 'sudo_rcrrad', 'arg': ['some-state', {'cache': True, '__kwarg__': True}], 'fun': 'state.apply'}, 'retcode': 0, 'success': True, 'fun': 'saltutil.find_job'}
[DEBUG ] Loaded minion key: /etc/salt/pki/minion/minion.pem
[INFO ] Loading fresh modules for state activity
[DEBUG ] LazyLoaded jinja.render
[DEBUG ] LazyLoaded yaml.render
[DEBUG ] LazyLoaded config.option
[DEBUG ] LazyLoaded cmd.run
[INFO ] Running state [echo 1] at time 19:35:24.427859
[INFO ] Executing state cmd.run for [echo 1]
[DEBUG ] LazyLoaded cmd.run_all
[INFO ] Executing command 'echo 1' as user 'root' in directory '/root'
[DEBUG ] stdout: 1
[INFO ] {'pid': 1944, 'retcode': 0, 'stderr': '', 'stdout': '1'}
[INFO ] Completed state [echo 1] at time 19:35:24.466936 duration_in_ms=39.078
[DEBUG ] File /var/cache/salt/minion/accumulator/39849552 does not exist, no need to cleanup.
[DEBUG ] Minion return retry timer set to 10 seconds (randomized)
[INFO ] Returning information for job: 20171115193457614955
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'minion', 'tcp://10.156.0.5:4506', 'aes')
[DEBUG ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'minion', 'tcp://10.156.0.5:4506')
[DEBUG ] minion return: {'fun_args': ['some-state', {'cache': True}], 'jid': '20171115193457614955', 'return': {'cmd_|-echo 1_|-echo 1_|-run': {'comment': 'Command "echo 1" run', 'name': 'echo 1', 'start_time': '19:35:24.427858', 'result': True, 'duration': 39.078, '__run_num__': 0, '__sls__': 'some-state', 'changes': {'pid': 1944, 'retcode': 0, 'stderr': '', 'stdout': '1'}, '__id__': 'echo 1'}}, 'retcode': 0, 'success': True, 'fun': 'state.apply'}
Master:
sudo salt 'minion' state.apply some-state -l debug
[DEBUG ] Reading configuration from /etc/salt/master
[DEBUG ] Using cached minion ID from /etc/salt/minion_id: gcp-eu-west3-a-console-1.c.innovative-solution.internal
[DEBUG ] Missing configuration file: /root/.saltrc
[DEBUG ] Configuration file path: /etc/salt/master
[WARNING ] Insecure logging configuration detected! Sensitive data may be logged.
[DEBUG ] Reading configuration from /etc/salt/master
[DEBUG ] Using cached minion ID from /etc/salt/minion_id: gcp-eu-west3-a-console-1.c.innovative-solution.internal
[DEBUG ] Missing configuration file: /root/.saltrc
[DEBUG ] MasterEvent PUB socket URI: /var/run/salt/master/master_event_pub.ipc
[DEBUG ] MasterEvent PULL socket URI: /var/run/salt/master/master_event_pull.ipc
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/master', 'master', 'tcp://127.0.0.1:4506', 'clear')
[DEBUG ] Initializing new IPCClient for path: /var/run/salt/master/master_event_pub.ipc
[DEBUG ] LazyLoaded local_cache.get_load
[DEBUG ] Reading minion list from /var/cache/salt/master/jobs/0b/26f993f05f13f5ea1bd44e7bd1d06041c109dc05a38e13b786f501f868d43b/.minions.p
[DEBUG ] get_iter_returns for jid 20171115193457614955 sent to set(['minion']) will timeout at 19:35:02.629067
[DEBUG ] Checking whether jid 20171115193457614955 is still running
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/master', 'master', 'tcp://127.0.0.1:4506', 'clear')
[DEBUG ] retcode missing from client return
[DEBUG ] Checking whether jid 20171115193457614955 is still running
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/master', 'master', 'tcp://127.0.0.1:4506', 'clear')
[DEBUG ] retcode missing from client return
[DEBUG ] Checking whether jid 20171115193457614955 is still running
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/master', 'master', 'tcp://127.0.0.1:4506', 'clear')
[DEBUG ] retcode missing from client return
[DEBUG ] jid 20171115193457614955 return from minion
[DEBUG ] return event: {'minion': {'jid': '20171115193457614955', 'retcode': 0, 'ret': {'cmd_|-echo 1_|-echo 1_|-run': {'comment': 'Command "echo 1" run', 'name': 'echo 1', 'start_time': '19:35:24.427858', 'result': True, 'duration': 39.078, '__run_num__': 0, '__sls__': 'some-state', 'changes': {'pid': 1944, 'retcode': 0, 'stderr': '', 'stdout': '1'}, '__id__': 'echo 1'}}, 'out': 'highstate'}}
[DEBUG ] LazyLoaded highstate.output
[DEBUG ] LazyLoaded nested.output
minion:
----------
ID: echo 1
Function: cmd.run
Result: True
Comment: Command "echo 1" run
Started: 19:35:24.427858
Duration: 39.078 ms
Changes:
----------
pid:
1944
retcode:
0
stderr:
stdout:
1
Summary for minion
------------
Succeeded: 1 (changed=1)
Failed: 0
------------
Total states run: 1
Total run time: 39.078 ms
[DEBUG ] jid 20171115193457614955 found all minions set(['minion'])
The same issue even with a small group of minions (~65 minions).
Salt Version:
Salt: 2017.7.2
Dependency Versions:
cffi: Not Installed
cherrypy: Not Installed
dateutil: 2.4.2
docker-py: Not Installed
gitdb: 0.6.4
gitpython: 1.0.1
ioflo: Not Installed
Jinja2: 2.8
libgit2: Not Installed
libnacl: Not Installed
M2Crypto: Not Installed
Mako: 1.0.3
msgpack-pure: Not Installed
msgpack-python: 0.4.6
mysql-python: Not Installed
pycparser: Not Installed
pycrypto: 2.6.1
pycryptodome: Not Installed
pygit2: Not Installed
Python: 2.7.12 (default, Nov 19 2016, 06:48:10)
python-gnupg: Not Installed
PyYAML: 3.11
PyZMQ: 15.2.0
RAET: Not Installed
smmap: 0.9.0
timelib: Not Installed
Tornado: 4.2.1
ZMQ: 4.1.4
System Versions:
dist: Ubuntu 16.04 xenial
locale: UTF-8
machine: x86_64
release: 4.8.0-32-generic
system: Linux
version: Ubuntu 16.04 xenial
How many files does your setup have for all your saltenvs combined? (Files in each saltenv are counted independently even if they point to the same directories) We found out the hard way it has huge impact on state performance.
The point is, there were no major changes neither in config nor in files (not even in number of minions) ... and the performance dragging happened sometime between 2017.7.2
and 2017.7.0
or maybe starting of 2017.7.0
.
I'm having a hard time replicating this. In fact I noticed 2017.7.2 is faster then 2016.11.8 when testing with state.sls and the cmd.run state @vryabinin provided.
Can you either of you share a sanitized version of your master/minion config file?
Also is there anything else unique about your setup?
Confirming this as of 2018.3.0 (Oxygen)
Can you either of you share a sanitized version of your master/minion config file?
I use the default configuration.
Also is there anything else unique about your setup?
Nothing unique, but the issue seems to affect CentOS 6/7 machines. There is another discussion here https://github.com/saltstack/salt/issues/38538 mentitoning CentOS boxes
Turning off pillarstack external pillar helped to solve the issue
anyone else here using external pillar and can confirm the same thing?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.
Description of Issue/Question
Recently we have noticed that the salt-master has becoming very slow to the point that it can take 10 to 15 minutes for small jobs to (complete couple lines in a state file and about 20-30 minions) . After reviewing the box and the troubleshooting documentation we could find any errors. We have spun up a fresh build of the master and time for one state to be executed is taking 1 minute to come back. If i did more the one box at a time will time will escalate up to 10 to 15 minutes.
The only error i can find is when running the debug the master continuously claims that a return code is missing from client
This happens when attempting against any box in the fleet IE CentOS6 or CentOS7
Please note the node id does say docker but that is name of the host the state is not trying to be committed to a docker container.
Any help would be grateful as salt and effectively slowed down to a crawl and not reliable any more
Setup
(Please provide relevant configs and/or SLS files (Be sure to remove sensitive info).)
Please note this is just a script i am testing with as i need to apply this, but it is not just this script
Steps to Reproduce Issue
Versions Report
SALT MASTER
SALT MINION