saltstack / salt

Software to automate the management and configuration of any infrastructure or application at scale. Get access to the Salt software package repository here:
https://repo.saltproject.io/
Apache License 2.0
14.17k stars 5.48k forks source link

Salt pillar rendering is sometimes slower than I'd like #57926

Open mbirtwell opened 4 years ago

mbirtwell commented 4 years ago

I'm having trouble when running highstates that target lots of machines that they all take a while to get going. It seems that they effectively all end up queuing for the master to generate their pillar. I did some profiling of pillar generation for the salt master and in some cases the pillar generation was taking 15 seconds. 8 seconds of which was the deepcopy at the start of merge_recurse. Two approaches to removing this deep copy occur to me:

  1. Just removing it. Most places that merge is called seem to pass the only reference to a dictonary as obj_a so that dictionary could be safely mutated. The downside of this is that all callers, which aren't safe would need to deepcopy obj_a themselves. So far I've identified merge and merge_all in salt.modules.slsutil and there's a path in building the pillar if the ext_pillar_first option is set that I think should still have one deepcopy.
  2. Provide an alternative cheap merging strategy that could be used instead. I believe that we've been careful and each pillar file defines a unique top level key (or keys). So my suggestion would be a strategy that fails if there are any top level keys duplicated between the two dictionaries "unique". The problem with this is it wouldn't make much sense for merging included pillar files. At the moment the same merge strategy is used for pillar files that are merged together because they are co-mentioned in the top and for files that are explicitly included in the other. So I think this would have to be combined with allowing those two strategies to be set separately. The include strategy would default to the main strategy if not set.

I'd be happy to take a punt at implementing either of these or if you have a better idea I'm all ears.

mbirtwell commented 4 years ago

I implemented (1) removing the deepcopy from merge_recurse and updating the usages where I think it's necessary to deepcopy before passing the argument in. It's definitely provided an improvement. At the moment I've made the minimal change. I think if I was to propose this for inclusion I'd change merge to not return the result so that it's clearer that it's mutating the arguments. But would be interested in some feedback on this as an idea.