saltstack / salt

Software to automate the management and configuration of any infrastructure or application at scale. Get access to the Salt software package repository here:
https://repo.saltproject.io/
Apache License 2.0

Timeouts (Minion did not return) since 2014.1.7 #14343

Closed Whissi closed 8 years ago

Whissi commented 10 years ago

Hi,

When there has been no Salt activity for ~30-60 minutes, the next Salt command (for example salt '*' --show-timeout cmd.run 'uname -a') will time out for most minions.

I need to run this command 2-3 times. After 1-2 minutes all minions are back, and I can then run my desired command without any timeouts.

This didn't happen with 2014.1.5.

# salt --versions-report
           Salt: 2014.1.7
         Python: 2.7.7 (default, Jul  6 2014, 23:14:10)
         Jinja2: 2.7.3
       M2Crypto: 0.22
 msgpack-python: 0.4.2
   msgpack-pure: Not Installed
       pycrypto: 2.6.1
         PyYAML: 3.11
          PyZMQ: 14.3.1
            ZMQ: 4.0.4

Whissi commented 8 years ago

Not sure, we are still pinging all minions every 5 minutes to prevent this problem. I'll stop the cronjob next week to see if this is still happening with salt-2015.8.3 and update the issue report.
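
For readers who want to reproduce this workaround, a keepalive of this kind might look roughly like the sketch below. It is not the actual cronjob used here: the function names, the use of test.ping, and the 15-second timeout are all assumptions chosen for illustration. The runner is passed in as a parameter so the logic can be tried without Salt installed.

```python
import subprocess


def build_ping_command(target="*", timeout=15):
    # Assumption: test.ping is used as the lightweight keepalive call.
    # -t sets the CLI timeout in seconds; 15 is an arbitrary example value.
    return ["salt", target, "--show-timeout", "-t", str(timeout), "test.ping"]


def keepalive(run=subprocess.call, target="*", timeout=15):
    # Run the ping once and hand back the runner's result
    # (the exit status when the default subprocess.call is used).
    # `run` is injectable so the function can be exercised without Salt.
    return run(build_ping_command(target, timeout))
```

To match the 5-minute interval mentioned above, a small script that calls keepalive() could be scheduled on the master with a crontab entry whose schedule field is */5 * * * *; the script's location is up to the administrator.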

xyoun commented 8 years ago

@Whissi It seems that 2015.5.3 and later have resolved this problem. I ran a command against 500+ servers at once without a single timeout.

basepi commented 8 years ago

@xyoun That's great news! Hopefully that's the case with @Whissi as well!

frogunder commented 8 years ago

@Whissi - Did you have time to update? Did it fix it?

Whissi commented 8 years ago

After running salt-2015.8.3 for over a week, I haven't seen this issue again. This could be a false positive, though, because we currently have at least one Salt event every 30 minutes. But let's consider this fixed for now. Thanks!

basepi commented 8 years ago

Thanks for the update @Whissi