Netflix-Skunkworks / aardvark

Aardvark is a multi-account AWS IAM Access Advisor API
Apache License 2.0

Boto3 Client Threading Error #14

Closed laurajauch closed 7 years ago

laurajauch commented 7 years ago

Hello -

I'm attempting to run aardvark update inside a Docker container on AWS ECS. However, when I run it against two accounts, I get this error:

Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/apps/aardvark/aardvark/aardvark/manage.py", line 48, in run
    ret_code, aa_data = account.update_account()
  File "/apps/aardvark/aardvark/aardvark/updater/__init__.py", line 39, in update_account
    arns = self._get_arns()
  File "/apps/aardvark/aardvark/aardvark/updater/__init__.py", line 62, in _get_arns
    'iam', service_type='client', **self.conn_details)
  File "/usr/local/lib/python2.7/dist-packages/cloudaux-1.3.6-py2.7.egg/cloudaux/aws/decorators.py", line 35, in decorated_function
    retval = f(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/cloudaux-1.3.6-py2.7.egg/cloudaux/aws/sts.py", line 91, in boto3_cached_conn
    sts = boto3.client('sts')
  File "/usr/local/lib/python2.7/dist-packages/boto3/__init__.py", line 83, in client
    return _get_default_session().client(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/boto3/session.py", line 263, in client
    aws_session_token=aws_session_token, config=config)
  File "/usr/local/lib/python2.7/dist-packages/botocore/session.py", line 826, in create_client
    endpoint_resolver = self.get_component('endpoint_resolver')
  File "/usr/local/lib/python2.7/dist-packages/botocore/session.py", line 701, in get_component
    return self._components.get_component(name)
  File "/usr/local/lib/python2.7/dist-packages/botocore/session.py", line 901, in get_component
    del self._deferred[name]
KeyError: 'endpoint_resolver'

It appears to be because boto3's default session (the one boto3.client uses) isn't thread-safe (recommended solution: give each thread its own session), but I'm confused about how nobody else has run into it. Thread-1 runs and completes successfully. Perhaps it's something related to ECS. Any ideas?
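For reference, the "one session per thread" approach mentioned above could look roughly like the sketch below. It is only an illustration, not code from aardvark or cloudaux: the helper names thread_session and sts_client are hypothetical, and it assumes boto3 is installed wherever the default factory is used.

```python
import threading

# Thread-local storage: each thread sees its own "session" attribute.
_local = threading.local()


def thread_session(session_factory=None):
    """Return a session private to the calling thread, creating it on first use.

    session_factory is a hypothetical hook for testing; by default a new
    boto3.session.Session is created for each thread.
    """
    if session_factory is None:
        import boto3  # imported lazily so the helper has no hard dependency
        session_factory = boto3.session.Session
    session = getattr(_local, "session", None)
    if session is None:
        session = session_factory()
        _local.session = session
    return session


def sts_client():
    """Build an STS client from this thread's own session instead of
    going through boto3.client(), which uses the shared default session."""
    return thread_session().client("sts")
```

With something like this, each worker thread would call sts_client() rather than boto3.client('sts'), so no two threads touch the same session while it lazily builds its components.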

mcpeak commented 7 years ago

This is interesting. Is it happening deterministically?

laurajauch commented 7 years ago

Yes. Which thread throws the error varies (it's whichever one arrives second), but the error is thrown every time.

mcpeak commented 7 years ago

Strange! Yeah, I've never seen this. We run 5 threads across multiple accounts daily.

mcpeak commented 7 years ago

Definitely sounds like a bug, but I'd fix it with lower priority unless others chime in that they're seeing this issue too. Would happily accept a PR :). Maybe to mitigate temporarily you can run with one thread? I used to do this across our accounts and it just takes a bit longer.
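If dropping to one thread is too slow, another stopgap might be to serialize only the client creation step with a lock and let the rest of each thread's work run in parallel. This is a rough sketch under that assumption, not something aardvark or cloudaux does; create_client_serialized is a hypothetical helper name.

```python
import threading

# One process-wide lock guarding client construction.
_client_lock = threading.Lock()


def create_client_serialized(factory, *args, **kwargs):
    """Call factory(*args, **kwargs) while holding the lock, so only one
    thread at a time runs the client construction code."""
    with _client_lock:
        return factory(*args, **kwargs)
```

A thread would then call, for example, create_client_serialized(boto3.client, 'sts'). This only helps if every code path that creates clients from the shared default session goes through the same lock.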

laurajauch commented 7 years ago

Alright. I think the change probably belongs in cloudaux. I'll try to get it working and see about a PR. Thanks!