Providers should be memoized

I started messing around with adding tracing to stacker, and quickly found out that we have a pretty big performance bottleneck caused by calling get_cloudformation_client, which is ultimately called from ProviderBuilder to instantiate a new Provider instance.

In my tests, this takes up a disproportionately large amount of stacker build time; between 10-40% of total execution time. This is huge. In our stack config of ~150ish stacks, ~48 seconds are spent calling get_cloudformation_client:

Why get_cloudformation_client is so slow is somewhat surprising. I'm assuming there's a lot going on behind the scenes when instantiating a new boto3 client, like attempting to fetch credentials from EC2 instance metadata and what not.

I think a good fix for this is to add memoization to aws.default.ProviderBuilder, where the cache key is a hash of the region/profile. Provider should be thread safe, and safe to share across multiple steps. This would look something like:

     def build(self, region=None, profile=None):
         if not region:
             region = self.region
         session = get_session(region=region, profile=profile)
-        return Provider(session, region=region, **self.kwargs)
+        with self.lock:
+            key = hash("%s%s" % (region, profile))
+            if key not in self.providers:
+                self.providers[key] = Provider(session, region=region, **self.kwargs)
+            return self.providers[key]

cloudtools / stacker

Providers should be memoized #630