jcjorel / ec2-spot-converter

A tool to convert AWS EC2 instances back and forth between On-Demand and Spot billing models.
MIT License

Failure while creating instance #6

Closed IvanNoronha-TomTom closed 3 years ago

IvanNoronha-TomTom commented 3 years ago

At step 16/25 the script fails, apparently because of a permission issue. I used the policy.json provided in the repo.

[INFO] 2021-02-09 18:25:45,871 ec2-spot-converter - [STEP 14/25] Terminate instance...
[INFO] 2021-02-09 18:25:45,978 ec2-spot-converter -   => SUCCESS. Successfully terminated instance i-0g6j3gaeda57a3d.
[INFO] 2021-02-09 18:25:45,998 ec2-spot-converter - [STEP 15/25] Wait resource release...
[INFO] 2021-02-09 18:25:46,089 ec2-spot-converter - Waiting for detached ENIs to become 'available'...
[INFO] 2021-02-09 18:25:53,259 ec2-spot-converter -   => SUCCESS. All resources released : ['eni-018kf47ke05c84'].
[INFO] 2021-02-09 18:25:53,277 ec2-spot-converter - [STEP 16/25] Create new instance...
Traceback (most recent call last):
  File "./ec2-spot-converter", line 1471, in <module>
    sys.exit(main(sys.argv))
  File "./ec2-spot-converter", line 1437, in main
    return_code, reason, keys = step["Function"]()
  File "./ec2-spot-converter", line 939, in create_new_instance
    response = ec2_client.run_instances(**launch_specifications)
  File "/usr/local/lib/python3.6/dist-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.6/dist-packages/botocore/client.py", line 676, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (UnauthorizedOperation) when calling the RunInstances operation: You are not authorized to perform this operation.

The script gave an encoded message. I'll try to decode it.

jcjorel commented 3 years ago

Hi, this is an unusual error. By any chance, are there any permission boundaries applied to the role you are using, or does an SCP exist that forbids calling the RunInstances API?

IvanNoronha-TomTom commented 3 years ago

Thanks for the quick response @jcjorel! I checked the permission boundaries under IAM policies.

I didn't see anything related to permission boundaries in the roles section either. The EC2 instance has just one IAM role attached, which is ec2SpotConverterRole.

I decoded the error message using the command mentioned at https://aws.amazon.com/premiumsupport/knowledge-center/ec2-not-auth-launch/ and, paraphrasing the decoded output, I get:

the request failed to call RunInstances because arn:aws:sts::xxxxxxxx:assumed-role/ec2SpotConverterRole/i-07be464kfo5fae6a5c didn't have permission to perform the iam:PassRole action on the arn:aws:iam::xxxxxxxx:role/s3FullAccess.

I'm curious though: does ec2SpotConverter use S3 at all? Or STS (arn:aws:sts -> AWS Security Token Service)?
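
For readers who want to script the decoding step described above, here is a minimal Python sketch. It assumes boto3 is installed and that the caller is allowed sts:DecodeAuthorizationMessage. The helper names (summarize_denial, decode_and_summarize) are hypothetical, and the JSON keys read from the decoded message ("allowed", "context", "principal", "action", "resource") are an assumption about its layout, so missing keys simply come back as None.

```python
import json


def summarize_denial(decoded_message: str) -> dict:
    """Pull the principal, action and resource out of a decoded authorization message.

    The key names used here are an assumption about the decoded JSON layout;
    any key that is absent yields None instead of raising.
    """
    doc = json.loads(decoded_message)
    context = doc.get("context", {})
    principal = context.get("principal", {})
    return {
        "allowed": doc.get("allowed"),
        "principal": principal.get("arn"),
        "action": context.get("action"),
        "resource": context.get("resource"),
    }


def decode_and_summarize(encoded_message: str) -> dict:
    """Ask STS to decode the message, then summarize it."""
    import boto3  # imported lazily so summarize_denial works without boto3

    sts = boto3.client("sts")
    response = sts.decode_authorization_message(EncodedMessage=encoded_message)
    return summarize_denial(response["DecodedMessage"])
```

In this thread, such a summary would have pointed at the iam:PassRole action on the s3FullAccess role, which is the role attached to the instance being converted, not a sign that the tool talks to S3.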

jcjorel commented 3 years ago

The tool does not use STS.

Does the converter instance have a role attached to it? If so, could you try adding iam:PassRole to the ec2SpotConverterRole role?

IvanNoronha-TomTom commented 3 years ago

I updated policy.json to include,

"iam:PassRole"

and the script created the Spot instance!

I think I figured out why it needed the extra permission. The source machine had an IAM role associated with it, and I believe iam:PassRole is needed so that the new Spot instance gets the same IAM role attached.

I'm not sure, but I think one way to reproduce this issue would be to convert an On-Demand instance that already has an IAM role assigned into a Spot instance.
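
The reasoning above can be sketched in Python. This is an illustration only, not the tool's code and not the contents of the repo's policy.json: the helper names are hypothetical, the instance dictionary is assumed to be one entry from an EC2 describe_instances response, and scoping Resource to a single role ARN is a design choice made here for least privilege.

```python
def needs_pass_role(instance: dict) -> bool:
    """Return True when the source instance has an IAM instance profile attached.

    Relaunching an instance with an instance profile requires the caller
    to hold iam:PassRole on the role behind that profile.
    """
    return bool(instance.get("IamInstanceProfile"))


def pass_role_statement(role_arn: str) -> dict:
    """Build an IAM policy statement allowing iam:PassRole on one role."""
    return {
        "Effect": "Allow",
        "Action": "iam:PassRole",
        "Resource": role_arn,
    }


# Example with placeholder values (the account ID and profile name are made up).
source_instance = {
    "InstanceId": "i-0123456789abcdef0",
    "IamInstanceProfile": {
        "Arn": "arn:aws:iam::111122223333:instance-profile/s3FullAccess",
        "Id": "AIPAEXAMPLE",
    },
}
if needs_pass_role(source_instance):
    statement = pass_role_statement("arn:aws:iam::111122223333:role/s3FullAccess")
```

Note that the instance description exposes the instance profile ARN, while iam:PassRole is evaluated against the role ARN, so the role has to be looked up or supplied separately.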

IvanNoronha-TomTom commented 3 years ago

Thanks a lot @jcjorel for your time and help! I noticed that if a CloudWatch alarm is associated with the On-Demand instance, it breaks because the Spot instance has a different instance ID, while the alarm still points to the old (and now non-existent) instance ID.

Would you recommend any action other than changing the CloudWatch alarm to point to the Spot instance? Given the nature of Spot instances, I believe a CloudWatch alarm doesn't make sense if the IDs keep changing due to termination.
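
As a rough illustration of the manual workaround mentioned above (repointing alarms at the new instance), here is a Python sketch. It is not how the tool handles alarms. The function name retarget_alarms is hypothetical, and the input is assumed to be the MetricAlarms list from a CloudWatch describe_alarms response. Alarms built on metric math keep their dimensions in a nested structure and are not covered here.

```python
import copy


def retarget_alarms(alarms: list, old_id: str, new_id: str) -> list:
    """Return copies of the alarms that reference old_id, pointed at new_id.

    Only top-level 'Dimensions' entries named 'InstanceId' are rewritten.
    Alarms that do not reference old_id are left out of the result.
    """
    updated = []
    for alarm in alarms:
        dims = alarm.get("Dimensions", [])
        if not any(d["Name"] == "InstanceId" and d["Value"] == old_id for d in dims):
            continue
        new_alarm = copy.deepcopy(alarm)  # never mutate the caller's data
        for d in new_alarm["Dimensions"]:
            if d["Name"] == "InstanceId" and d["Value"] == old_id:
                d["Value"] = new_id
        updated.append(new_alarm)
    return updated
```

The rewritten definitions would then have to be written back, for example with the CloudWatch put_metric_alarm call, passing only the parameters that call accepts, since the describe_alarms output also carries read-only fields such as the alarm ARN and state.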

jcjorel commented 3 years ago

Thanks for spotting the policy issue. I have already updated the policy.json file with iam:PassRole.

Good point about CW alarms! I had not thought about them. I do not know whether something like the ELB registration preservation feature is possible for alarms, but I will look into it. Thanks.

PS: BTW, if the Spot instance uses the 'stop' interruption behavior, it still makes sense to have CW alarms on it, as the instance ID is permanent.

jcjorel commented 3 years ago

@IvanNoronha-TomTom FYI, I released v0.10.0, which takes care of updating CloudWatch alarms with the converted instance's ID. Thanks again for "spotting" one bug and one missing feature!