neoave / mrack

Multicloud use-case based multihost async provisioner for CIs and testing during development
Apache License 2.0

AWS provisioning failing when EC2 has some peaks in performance? #173

Closed Tiboris closed 2 years ago

Tiboris commented 2 years ago

Over the last few days I have seen some odd behavior in the AWS provider. While exploring the EC2 web UI, I found that an instance was in the pending state (I was too slow to open the instance details in the web UI, so this state is all I saw). This might be what causes mrack to fail later, finishing with this traceback:

2022-04-04T12:10:18     AWS Provisioning issued
2022-04-04T12:10:18     AWS Waiting for all hosts to be active
2022-04-04T12:10:18     An unexpected exception occurred while provisioning
2022-04-04T12:10:18     An unexpected exception occurred while provisioning
2022-04-04T12:10:18     An error occurred (InvalidInstanceID.NotFound) when calling the CreateTags operation: The instance ID 'i-0db1b3f72f0363db7' does not exist
2022-04-04T12:10:18     Traceback (most recent call last):
2022-04-04T12:10:18       File "/usr/local/lib/python3.9/site-packages/mrack/run.py", line 201, in handle
2022-04-04T12:10:18         ret_code = func(*args, **kwargs)
2022-04-04T12:10:18       File "/usr/local/lib/python3.9/site-packages/mrack/run.py", line 225, in run
2022-04-04T12:10:18         mrackcli(obj={})  # pylint: disable=no-value-for-parameter,unexpected-keyword-arg
2022-04-04T12:10:18       File "/usr/lib/python3.9/site-packages/click/core.py", line 829, in __call__
2022-04-04T12:10:18         return self.main(*args, **kwargs)
2022-04-04T12:10:18       File "/usr/lib/python3.9/site-packages/click/core.py", line 782, in main
2022-04-04T12:10:18         rv = self.invoke(ctx)
2022-04-04T12:10:18       File "/usr/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
2022-04-04T12:10:18         return _process_result(sub_ctx.command.invoke(sub_ctx))
2022-04-04T12:10:18       File "/usr/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
2022-04-04T12:10:18         return ctx.invoke(self.callback, **ctx.params)
2022-04-04T12:10:18       File "/usr/lib/python3.9/site-packages/click/core.py", line 610, in invoke
2022-04-04T12:10:18         return callback(*args, **kwargs)
2022-04-04T12:10:18       File "/usr/lib/python3.9/site-packages/click/decorators.py", line 21, in new_func
2022-04-04T12:10:18         return f(get_current_context(), *args, **kwargs)
2022-04-04T12:10:18       File "/usr/local/lib/python3.9/site-packages/mrack/run.py", line 62, in wrapper
2022-04-04T12:10:18         return loop.run_until_complete(func(*args, **kwargs))
2022-04-04T12:10:18       File "/usr/lib64/python3.9/asyncio/base_events.py", line 642, in run_until_complete
2022-04-04T12:10:18         return future.result()
2022-04-04T12:10:18       File "/usr/local/lib/python3.9/site-packages/mrack/run.py", line 122, in up
2022-04-04T12:10:18         await up_action.provision()
2022-04-04T12:10:18       File "/usr/local/lib/python3.9/site-packages/mrack/actions/up.py", line 106, in provision
2022-04-04T12:10:18         raise results
2022-04-04T12:10:18       File "/usr/local/lib/python3.9/site-packages/mrack/providers/provider.py", line 377, in provision_hosts
2022-04-04T12:10:18         success_hosts, error_hosts, _missing_reqs = await self.strategy_abort(reqs)
2022-04-04T12:10:18       File "/usr/local/lib/python3.9/site-packages/mrack/providers/provider.py", line 433, in strategy_abort
2022-04-04T12:10:18         return await self._provision_base(reqs)
2022-04-04T12:10:18       File "/usr/local/lib/python3.9/site-packages/mrack/providers/provider.py", line 318, in _provision_base
2022-04-04T12:10:18         raise response
2022-04-04T12:10:18       File "/usr/local/lib/python3.9/site-packages/mrack/providers/aws.py", line 274, in create_server
2022-04-04T12:10:18         self.ec2.create_tags(Resources=ids, Tags=taglist)
2022-04-04T12:10:18       File "/usr/lib/python3.9/site-packages/boto3/ec2/createtags.py", line 27, in create_tags
2022-04-04T12:10:18         self.meta.client.create_tags(**kwargs)
2022-04-04T12:10:18       File "/usr/lib/python3.9/site-packages/botocore/client.py", line 391, in _api_call
2022-04-04T12:10:18         return self._make_api_call(operation_name, kwargs)
2022-04-04T12:10:18       File "/usr/lib/python3.9/site-packages/botocore/client.py", line 719, in _make_api_call
2022-04-04T12:10:18         raise error_class(parsed_response, operation_name)
2022-04-04T12:10:18     botocore.exceptions.ClientError: An error occurred (InvalidInstanceID.NotFound) when calling the CreateTags operation: The instance ID 'i-0db1b3f72f0363db7' does not exist
2022-04-04T12:10:18     Traceback (most recent call last):
2022-04-04T12:10:18       File "/usr/local/bin/mrack", line 23, in <module>
2022-04-04T12:10:18         run.run()
2022-04-04T12:10:18       File "/usr/local/lib/python3.9/site-packages/mrack/run.py", line 214, in handle
2022-04-04T12:10:18         raise exc
2022-04-04T12:10:18       File "/usr/local/lib/python3.9/site-packages/mrack/run.py", line 201, in handle
2022-04-04T12:10:18         ret_code = func(*args, **kwargs)
2022-04-04T12:10:18       File "/usr/local/lib/python3.9/site-packages/mrack/run.py", line 225, in run
2022-04-04T12:10:18         mrackcli(obj={})  # pylint: disable=no-value-for-parameter,unexpected-keyword-arg
2022-04-04T12:10:18       File "/usr/lib/python3.9/site-packages/click/core.py", line 829, in __call__
2022-04-04T12:10:18         return self.main(*args, **kwargs)
2022-04-04T12:10:18       File "/usr/lib/python3.9/site-packages/click/core.py", line 782, in main
2022-04-04T12:10:18         rv = self.invoke(ctx)
2022-04-04T12:10:18       File "/usr/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
2022-04-04T12:10:18         return _process_result(sub_ctx.command.invoke(sub_ctx))
2022-04-04T12:10:18       File "/usr/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
2022-04-04T12:10:18         return ctx.invoke(self.callback, **ctx.params)
2022-04-04T12:10:18       File "/usr/lib/python3.9/site-packages/click/core.py", line 610, in invoke
2022-04-04T12:10:18         return callback(*args, **kwargs)
2022-04-04T12:10:18       File "/usr/lib/python3.9/site-packages/click/decorators.py", line 21, in new_func
2022-04-04T12:10:18         return f(get_current_context(), *args, **kwargs)
2022-04-04T12:10:18       File "/usr/local/lib/python3.9/site-packages/mrack/run.py", line 62, in wrapper
2022-04-04T12:10:18         return loop.run_until_complete(func(*args, **kwargs))
2022-04-04T12:10:18       File "/usr/lib64/python3.9/asyncio/base_events.py", line 642, in run_until_complete
2022-04-04T12:10:18         return future.result()
2022-04-04T12:10:18       File "/usr/local/lib/python3.9/site-packages/mrack/run.py", line 122, in up
2022-04-04T12:10:18         await up_action.provision()
2022-04-04T12:10:18       File "/usr/local/lib/python3.9/site-packages/mrack/actions/up.py", line 106, in provision
2022-04-04T12:10:18         raise results
2022-04-04T12:10:18       File "/usr/local/lib/python3.9/site-packages/mrack/providers/provider.py", line 377, in provision_hosts
2022-04-04T12:10:18         success_hosts, error_hosts, _missing_reqs = await self.strategy_abort(reqs)
2022-04-04T12:10:18       File "/usr/local/lib/python3.9/site-packages/mrack/providers/provider.py", line 433, in strategy_abort
2022-04-04T12:10:18         return await self._provision_base(reqs)
2022-04-04T12:10:18       File "/usr/local/lib/python3.9/site-packages/mrack/providers/provider.py", line 318, in _provision_base
2022-04-04T12:10:18         raise response
2022-04-04T12:10:18       File "/usr/local/lib/python3.9/site-packages/mrack/providers/aws.py", line 274, in create_server
2022-04-04T12:10:18         self.ec2.create_tags(Resources=ids, Tags=taglist)
2022-04-04T12:10:18       File "/usr/lib/python3.9/site-packages/boto3/ec2/createtags.py", line 27, in create_tags
2022-04-04T12:10:18         self.meta.client.create_tags(**kwargs)
2022-04-04T12:10:18       File "/usr/lib/python3.9/site-packages/botocore/client.py", line 391, in _api_call
2022-04-04T12:10:18         return self._make_api_call(operation_name, kwargs)
2022-04-04T12:10:18       File "/usr/lib/python3.9/site-packages/botocore/client.py", line 719, in _make_api_call
2022-04-04T12:10:18         raise error_class(parsed_response, operation_name)
2022-04-04T12:10:18     botocore.exceptions.ClientError: An error occurred (InvalidInstanceID.NotFound) when calling the CreateTags operation: The instance ID 'i-0db1b3f72f0363db7' does not exist
pvoborni commented 2 years ago

Do you know if the subnet had enough IP addresses? I think I encountered a behavior similar to this when a subnet ran out of IPs.
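
One way to check this is to ask EC2 how many free addresses each subnet reports. The sketch below assumes a boto3 EC2 client is passed in. `describe_subnets` and the per-subnet `AvailableIpAddressCount` field are part of the EC2 API, while the helper name `subnets_short_on_ips` and the `needed` threshold are made up for illustration and are not part of mrack.

```python
def subnets_short_on_ips(ec2_client, subnet_ids, needed):
    """Return {subnet_id: free_ip_count} for subnets with fewer than `needed` free IPs.

    `ec2_client` is assumed to behave like boto3.client("ec2"); only its
    describe_subnets() call is used. The helper itself is hypothetical.
    """
    response = ec2_client.describe_subnets(SubnetIds=list(subnet_ids))
    return {
        subnet["SubnetId"]: subnet["AvailableIpAddressCount"]
        for subnet in response["Subnets"]
        if subnet["AvailableIpAddressCount"] < needed
    }
```

With real credentials this could be called as `subnets_short_on_ips(boto3.client("ec2"), ["subnet-..."], needed=3)`, where the subnet ID is a placeholder. An empty result would suggest that the subnet still had room for the requested hosts at the time of the check.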

Tiboris commented 2 years ago

Hmm, I have no idea TBH. However, there were only 3 instances in this state and no others were running. Is the pool of IP addresses per project?
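
Separately from the subnet question, the `InvalidInstanceID.NotFound` error on `CreateTags` in the traceback above may simply be EC2's eventual consistency: an instance ID can be briefly unknown to other API calls right after creation. This thread does not confirm that as the cause. As a minimal sketch under that assumption, the tagging call could be retried with a backoff. The helper names `error_code` and `tag_with_retry` and the retry parameters are hypothetical and are not mrack's actual code. The only library detail relied on is that botocore's `ClientError` exposes the error code in `exc.response["Error"]["Code"]`.

```python
import time


def error_code(exc):
    """Return the AWS error code of a botocore ClientError-like exception, or None."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code")


def tag_with_retry(create_tags, ids, tags, attempts=5, delay=1.0, sleep=time.sleep):
    """Call create_tags, retrying while EC2 reports the instance ID as unknown.

    `create_tags` is any callable accepting Resources= and Tags=, for example
    the create_tags method seen in the traceback. Errors other than
    InvalidInstanceID.NotFound are re-raised immediately, and the last
    failed attempt is re-raised as well.
    """
    for attempt in range(1, attempts + 1):
        try:
            return create_tags(Resources=ids, Tags=tags)
        except Exception as exc:
            if error_code(exc) != "InvalidInstanceID.NotFound" or attempt == attempts:
                raise
            # Exponential backoff: delay, 2*delay, 4*delay, ...
            sleep(delay * 2 ** (attempt - 1))
```

In the provider this would wrap the failing call, roughly `tag_with_retry(self.ec2.create_tags, ids, taglist)`. Two alternatives avoid the retry loop: waiting on the boto3 EC2 client's `instance_exists` waiter before tagging, or passing `TagSpecifications` when the instances are created so no separate `CreateTags` call is needed.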