aws / aws-parallelcluster

AWS ParallelCluster is an AWS supported Open Source cluster management tool to deploy and manage HPC clusters in the AWS cloud.
https://github.com/aws/aws-parallelcluster
Apache License 2.0

Create Cluster fails with ec2_role_name #1827

Closed mjavadi closed 4 years ago

mjavadi commented 4 years ago

Environment:

AWS ParallelCluster / CfnCluster version [e.g. aws-parallelcluster-2.5.1]

[global]
cluster_template = default
update_check = true
sanity_check = true

[aliases]
ssh = ssh {CFN_USER}@{MASTER_IP} {ARGS}

[cluster default]
key_name = SecurityMonkeyOregon
base_os = ubuntu1804
scheduler = slurm
max_queue_size = 1
maintain_initial_size = true
vpc_settings = default
tags = {"Name": "test-plcuster", "Creator": "xyz@abc.com"}
ec2_iam_role = am-ParallelClusterInstance

[vpc default]
vpc_id = vpc-6a79870f
master_subnet_id = subnet-9839f2fd

Error on pcluster create:

IAM role error on user provided role am-ParallelClusterInstance: action ec2:DescribeVolumes is explicitDeny. See https://docs.aws.amazon.com/parallelcluster/latest/ug/iam.html
IAM role error on user provided role am-ParallelClusterInstance: action ec2:AttachVolume is explicitDeny. See https://docs.aws.amazon.com/parallelcluster/latest/ug/iam.html
IAM role error on user provided role am-ParallelClusterInstance: action ec2:DescribeInstanceAttribute is explicitDeny. See https://docs.aws.amazon.com/parallelcluster/latest/ug/iam.html
IAM role error on user provided role am-ParallelClusterInstance: action ec2:DescribeInstanceStatus is explicitDeny. See https://docs.aws.amazon.com/parallelcluster/latest/ug/iam.html
IAM role error on user provided role am-ParallelClusterInstance: action ec2:DescribeInstances is explicitDeny. See https://docs.aws.amazon.com/parallelcluster/latest/ug/iam.html
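The `explicitDeny` wording in these errors matches the decision strings returned by IAM policy simulation. The following is a minimal sketch for reproducing such a check outside of `pcluster`, assuming the sanity check behaves like a call to the IAM `SimulatePrincipalPolicy` API (that is an assumption, not something this thread confirms). The helper name `denied_actions` and its `region` parameter are hypothetical; the client is expected to be a boto3 IAM client, and pagination of results is ignored for brevity.

```python
def denied_actions(iam_client, role_arn, actions, region=None):
    """Return {action: decision} for every simulated action that is not allowed.

    iam_client is expected to expose simulate_principal_policy, as a boto3
    IAM client does. If region is given, it is passed to the simulation as
    the ec2:Region context key; otherwise no context keys are supplied.
    """
    kwargs = {"PolicySourceArn": role_arn, "ActionNames": list(actions)}
    if region is not None:
        kwargs["ContextEntries"] = [
            {
                "ContextKeyName": "ec2:Region",
                "ContextKeyValues": [region],
                "ContextKeyType": "string",
            }
        ]
    response = iam_client.simulate_principal_policy(**kwargs)
    return {
        result["EvalActionName"]: result["EvalDecision"]
        for result in response["EvaluationResults"]
        if result["EvalDecision"] != "allowed"
    }


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   iam = boto3.client("iam")
#   role = "arn:aws:iam::<account-id>:role/am-ParallelClusterInstance"
#   print(denied_actions(iam, role, ["ec2:DescribeVolumes", "ec2:AttachVolume"]))
#   print(denied_actions(iam, role, ["ec2:DescribeVolumes"], region="us-west-2"))
```

Comparing the two calls in the usage comment may help show whether a denial depends on the `ec2:Region` context key being present in the simulated request.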

Policy specified:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "ec2:DescribeVolumes",
        "ec2:AttachVolume",
        "ec2:DescribeInstanceAttribute",
        "ec2:DescribeInstanceStatus",
        "ec2:DescribeInstances",
        "ec2:DescribeRegions"
      ],
      "Resource": [ "*" ],
      "Effect": "Allow",
      "Sid": "EC2"
    },
    {
      "Action": [ "dynamodb:ListTables" ],
      "Resource": [ "*" ],
      "Effect": "Allow",
      "Sid": "DynamoDBList"
    },
    {
      "Action": [
        "sqs:SendMessage",
        "sqs:ReceiveMessage",
        "sqs:ChangeMessageVisibility",
        "sqs:DeleteMessage",
        "sqs:GetQueueUrl"
      ],
      "Resource": [ "arn:aws:sqs:us-west-2:515121255512:parallelcluster-*" ],
      "Effect": "Allow",
      "Sid": "SQSQueue"
    },
    {
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:TerminateInstanceInAutoScalingGroup",
        "autoscaling:SetDesiredCapacity",
        "autoScaling:UpdateAutoScalingGroup",
        "autoscaling:DescribeTags",
        "autoScaling:SetInstanceHealth"
      ],
      "Resource": [ "*" ],
      "Effect": "Allow",
      "Sid": "Autoscaling"
    },
    {
      "Action": [
        "cloudformation:DescribeStacks",
        "cloudformation:DescribeStackResource"
      ],
      "Resource": [ "arn:aws:cloudformation:us-west-2:515121255512:stack/parallelcluster-*/*" ],
      "Effect": "Allow",
      "Sid": "CloudFormation"
    },
    {
      "Action": [
        "dynamodb:PutItem",
        "dynamodb:Query",
        "dynamodb:GetItem",
        "dynamodb:DeleteItem",
        "dynamodb:DescribeTable"
      ],
      "Resource": [ "arn:aws:dynamodb:us-west-2:515121255512:table/parallelcluster-*" ],
      "Effect": "Allow",
      "Sid": "DynamoDBTable"
    },
    {
      "Action": [ "s3:GetObject" ],
      "Resource": [ "arn:aws:s3:::us-west-2-aws-parallelcluster/*" ],
      "Effect": "Allow",
      "Sid": "S3GetObj"
    },
    {
      "Resource": [ "*" ],
      "Action": [ "sqs:ListQueues" ],
      "Effect": "Allow",
      "Sid": "SQSList"
    },
    {
      "Action": [ "iam:PassRole" ],
      "Resource": [ "arn:aws:iam::515121255512:role/parallelcluster-*" ],
      "Effect": "Allow",
      "Sid": "BatchJobPassRole"
    },
    {
      "Action": [ "s3:GetObject" ],
      "Resource": [ "arn:aws:s3:::dcv-license.us-west-2/*" ],
      "Effect": "Allow",
      "Sid": "DcvLicense"
    }
  ]
}

mjavadi commented 4 years ago

Please note that the EC2 role am-ParallelClusterInstance has a security boundary attached to it. If the security boundary is removed, cluster creation works. With the security boundary attached, it does not work. The security boundary is as follows:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "NotAction": [ "iam:*", "organizations:*", "account:*" ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [ "iam:Get*", "iam:List*", "iam:PassRole", "account:ListRegions" ],
      "Resource": "*"
    },
    {
      "Effect": "Deny",
      "NotAction": [ "iam:Get*", "iam:List*" ],
      "Resource": [
        "arn:aws:iam::55512121212:policy/Account_Owner_Role",
        "arn:aws:iam::55512121212:policy/CloudCheckr*",
        "arn:aws:iam::55512121212:policy/Config_S3_Delivery",
        "arn:aws:iam::55512121212:policy/IAM_Allow*",
        "arn:aws:iam::55512121212:policy/Power_User_Role",
        "arn:aws:iam::55512121212:policy/Security*",
        "arn:aws:iam::55512121212:policy/SplunkAccess",
        "arn:aws:iam::55512121212:role/Administrator",
        "arn:aws:iam::55512121212:role/advanceduser",
        "arn:aws:iam::55512121212:role/Auditor",
        "arn:aws:iam::55512121212:role/CloudManagement",
        "arn:aws:iam::55512121212:role/Forensics",
        "arn:aws:iam::55512121212:role/account_owner",
        "arn:aws:iam::55512121212:role/power_user",
        "arn:aws:iam::55512121212:role/JPL_Config_Role",
        "arn:aws:iam::55512121212:role/SplunkAccess",
        "arn:aws:iam::55512121212:role/networking",
        "arn:aws:iam::55512121212:role/ops",
        "arn:aws:iam::55512121212:policy/governance/*",
        "arn:aws:iam::55512121212:role/governance/*",
        "arn:aws:iam::55512121212:user/governance/*",
        "arn:aws:iam::55512121212:group/governance/*"
      ]
    },
    {
      "Effect": "Deny",
      "Action": [
        "config:DeleteAggregationAuthorization",
        "config:DeleteConfigurationAggregator",
        "config:DeleteConfigurationRecorder",
        "config:DeleteDeliveryChannel",
        "config:DeletePendingAggregationRequest",
        "config:PutAggregationAuthorization",
        "config:PutConfigurationAggregator",
        "config:PutConfigurationRecorder",
        "config:PutDeliveryChannel",
        "config:SartConfigurationRecorder",
        "config:StopConfigurationRecorder"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "guardduty:Archive*",
        "guardduty:Create*",
        "guardduty:Delete*",
        "guardduty:Update*",
        "guardduty:Disassociate*",
        "guardduty:Accept*",
        "guardduty:*MonitoringMembers",
        "guardduty:Unarchive*",
        "guardduty:Update*",
        "guardduty:InviteMembers",
        "guardduty:Decline*"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Deny",
      "Action": [
        "acm:RequestCertificate",
        "aws-marketplace:Subscribe",
        "ec2:PurchaseReservedInstancesOffering",
        "workspaces:*"
      ],
      "Resource": [ "*" ]
    },
    {
      "Effect": "Deny",
      "Action": "ec2:RunInstances",
      "Resource": "arn:aws:ec2:*::image/ami-*",
      "Condition": {
        "StringNotEquals": {
          "ec2:Owner": [
            "55512121212",
            "55512121213",
            "amazon",
            "aws-marketplace",
            "55512121214",
            "55512121215",
            "55512121216",
            "55512121217"
          ]
        }
      }
    },
    {
      "Effect": "Deny",
      "Action": [ "ec2:*" ],
      "Resource": "*",
      "Condition": {
        "StringNotLike": {
          "ec2:Region": [ "us-*" ]
        }
      }
    }
  ]
}

ddeidda commented 4 years ago

Hi Mike,

As you correctly pointed out, the policy boundary is preventing the ec2:DescribeVolumes action from being executed. The reason is probably this statement:

{
  "Effect": "Deny",
  "Action": [ "ec2:*" ],
  "Resource": "*",
  "Condition": {
    "StringNotLike": {
      "ec2:Region": [ "us-*" ]
    }
  }
}

combined with the fact that the ec2:DescribeVolumes action does not use the ec2:Region condition key, as shown here. To be able to create a cluster, your role's policy must be compliant with the ParallelClusterInstancePolicy.

Additional details can be found on the page "AWS Identity and Access Management Roles in AWS ParallelCluster" in our user guide.
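To illustrate why a missing ec2:Region key leads to a denial: in IAM, a negated condition operator such as StringNotLike evaluates to true when the condition key is absent from the request context, so the Deny statement applies. The sketch below is a simplified local model of only that one statement, not the real IAM evaluation engine. The function names are hypothetical, and `fnmatchcase` is used as an approximation of IAM wildcard matching.

```python
from fnmatch import fnmatchcase


def string_not_like(context, key, patterns):
    """Model of a negated string operator.

    True when the key is missing from the request context, or when its
    value matches none of the given wildcard patterns.
    """
    value = context.get(key)
    if value is None:
        return True
    return not any(fnmatchcase(value, pattern) for pattern in patterns)


def region_deny_applies(action, context):
    """Evaluate only the region-restricting Deny statement of the boundary."""
    is_ec2_action = action.lower().startswith("ec2:")
    return is_ec2_action and string_not_like(context, "ec2:Region", ["us-*"])


# Request context without ec2:Region: the Deny applies.
print(region_deny_applies("ec2:DescribeVolumes", {}))
# Request context with a us-* region: the Deny does not apply.
print(region_deny_applies("ec2:DescribeVolumes", {"ec2:Region": "us-west-2"}))
# Request context with a non-US region: the Deny applies.
print(region_deny_applies("ec2:DescribeVolumes", {"ec2:Region": "eu-west-1"}))
```

Under this model, only a request that actually carries a matching ec2:Region value escapes the Deny, which is consistent with the explicitDeny results reported in the issue.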

mjavadi commented 4 years ago

@ddeidda I am sorry. I did not even get a chance to update the GitHub issue after I Slacked Joe. We would humbly like to push back: we do need to have this in our security boundary, especially since pcluster config does ask which region we would like to use. For now, we can get by by asking an admin in our cyber division to create the policy ahead of time for the user.

Thank you kindly for your time.

ddeidda commented 4 years ago

@mjavadi no problem; I hope the issue has been solved and the user experience with ParallelCluster from now on is good.