Open qcpeter opened 2 years ago
Hey @qcpeter, thanks for the suggestion!
FWIW, we've discussed deprecating syndics, though I'm not sure if we have a specific replacement in place. In any case, it does seem like a good thing to add (in particular the worker_threads bit). If you have any of your own suggestions, PRs are always welcome and encouraged!
Hi @waynew,
Thanks for the reply. That's very intriguing; why would you plan to do that? It seems that syndics, or something very similar, are the only way to share the responsibility for pillar rendering between hosts whilst also allowing commands to be issued to a large number of hosts. Would the suggestion be to scale vertically rather than horizontally?
If there's no current guidance on the number of worker threads then I'll see what my experimentation brings up and report back.
Can someone update the title to not be blank?
The problem with syndic is that it was a band-aid solution. It doesn't use most of the modern concepts in Salt, it can be rather buggy, and the bugs that do crop up are a pain to work on. There is a better solution in SaltStack Config, but a better open source solution does not currently exist. The only reason we have not actually deprecated syndic yet is that it doesn't have a replacement for scaling in open source.
So, worker_threads in general should not be more than 1.5 times the number of CPUs. Setting it higher can cause more issues: these threads want to be highly available, and with more threads than CPUs you get CPU multiplexing, where context switching between worker threads leads to timeouts. This can cause drops in the queue or just no response to the queue.
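To make the rule of thumb above concrete, here is a minimal Python sketch. It is not part of Salt and not official guidance: the function names `max_worker_threads` and `is_oversubscribed` are made up for illustration, and rounding down to a whole number is my own assumption about how to apply the 1.5x factor.

```python
import math
import os


def max_worker_threads(cpu_count, factor=1.5):
    """Ceiling for worker_threads under the 1.5x-CPUs rule of thumb.

    Hypothetical helper: the 1.5 factor comes from this thread,
    the rounding choice is an assumption.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    # Round down so the result never exceeds factor * cpu_count,
    # but always allow at least one worker thread.
    return max(1, math.floor(cpu_count * factor))


def is_oversubscribed(worker_threads, cpu_count, factor=1.5):
    """True if the configured value exceeds the rule-of-thumb ceiling."""
    return worker_threads > max_worker_threads(cpu_count, factor)


if __name__ == "__main__":
    # os.cpu_count() can return None, so fall back to 1.
    cpus = os.cpu_count() or 1
    print(f"cpus={cpus} worker_threads ceiling={max_worker_threads(cpus)}")
```

For example, on an 8-core master the ceiling works out to 12, so a setting of 4 times the number of cores (32), as in the issue description, would count as oversubscribed by this rule.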
Description
I'm struggling to get a feel for how to tune the worker_threads value for a multi-syndic/multi-master setup. I'm seeing issues with the syndic processes not being able to call fire_master and also not being able to publish results to the master; however, our monitoring does not indicate any excess load or CPU utilization on any of the masters. Currently this is set to 4 times the number of cores of the master host. Trying to figure this out by trial and error is quite difficult on a production cluster.
Suggested Fix
Some indication of how this parameter should scale with the number of cores, and how to determine whether increasing the number of worker threads is appropriate, would be really helpful. The worker_threads parameter isn't even mentioned on your Salt at Scale page.
Type of documentation
Salt documentation
Location or format of documentation
https://docs.saltproject.io/en/latest/ref/configuration/master.html#worker-threads