aws / aws-parallelcluster

AWS ParallelCluster is an AWS supported Open Source cluster management tool to deploy and manage HPC clusters in the AWS cloud.
https://github.com/aws/aws-parallelcluster
Apache License 2.0

Pcluster create failed after specifying iam role #2616

Closed yinjun111 closed 3 years ago

yinjun111 commented 3 years ago

I tried to create a new torque cluster with ParallelCluster, using a specific IAM role for the new cluster. I was able to create the cluster without "ec2_iam_role = ParallelClusterAdmin" in the config file, but once I added that line, the creation failed. Note that the EC2 instance I run pcluster from uses exactly the same IAM role, i.e. ParallelClusterAdmin.

Here is the error message:

pcluster create torque04 -c Config/config.txt

Beginning cluster creation for cluster: torque04
WARNING: The configuration parameter 'scheduler' generated the following warnings:
The job scheduler you are using (torque) is scheduled to be deprecated in future releases of ParallelCluster. More information is available here: https://github.com/aws/aws-parallelcluster/wiki/Deprecation-of-SGE-and-Torque-in-ParallelCluster
Creating stack named: parallelcluster-torque04
Status: parallelcluster-torque04 - ROLLBACK_IN_PROGRESS
Cluster creation failed. Failed events:
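The "Failed events:" list above is empty, so the actual failure reason is not visible in this output. One way to look for it is to read the events of the underlying CloudFormation stack directly. The following is a minimal sketch, assuming boto3 is installed and credentials with `cloudformation:DescribeStackEvents` permission are available; the helper names `failed_events` and `fetch_stack_events` are made up for illustration and are not part of ParallelCluster.

```python
def failed_events(events):
    """Return (logical id, status, reason) for each event whose status ends in _FAILED."""
    return [
        (e["LogicalResourceId"], e["ResourceStatus"], e.get("ResourceStatusReason", ""))
        for e in events
        if e.get("ResourceStatus", "").endswith("_FAILED")
    ]


def fetch_stack_events(stack_name, region):
    """Fetch all events of a CloudFormation stack (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so failed_events() is usable without boto3

    client = boto3.client("cloudformation", region_name=region)
    paginator = client.get_paginator("describe_stack_events")
    events = []
    for page in paginator.paginate(StackName=stack_name):
        events.extend(page["StackEvents"])
    return events


# Hypothetical usage, with the stack name and region taken from this issue:
#   for row in failed_events(fetch_stack_events("parallelcluster-torque04", "us-east-1")):
#       print(*row)
```

The `ResourceStatusReason` of the first `CREATE_FAILED` event is usually the most informative entry, since later failures tend to be cancellations caused by the first one.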

Here is Config/config.txt:

[aws]
aws_region_name = us-east-1

[aliases]
ssh = ssh {CFN_USER}@{MASTER_IP} {ARGS} -i /home/centos/Key/RNASeqDevKey.pem

[global]
cluster_template = default
update_check = true
sanity_check = true

[cluster default]
key_name = RNASeqDevKey
# choose from torque, sge
scheduler = torque
base_os = centos7
vpc_settings = default
# the only difference for a successful creation and failed creation
ec2_iam_role = ParallelClusterAdmin
master_instance_type = d3.4xlarge
compute_instance_type = r5n.12xlarge
initial_queue_size = 1
max_queue_size = 10
maintain_initial_size = true
ebs_settings = ebs_data,ebs_apps

[vpc default]
vpc_id = vpc-089cbe66e353e251c
master_subnet_id = subnet-066c67d27e30498d3

[ebs ebs_data]
shared_dir = data
volume_size = 2000

[ebs ebs_apps]
shared_dir = apps
volume_size = 100
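To double-check that `ec2_iam_role` really is the only difference between the working and the failing config, the two files can be compared key by key. This is a small sketch using Python's standard `configparser`; the function name `diff_configs` and the shortened sample configs are illustrative assumptions, not the full files from this issue.

```python
import configparser


def _load(text):
    # interpolation=None keeps values such as the ssh alias exactly as written
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


def diff_configs(a_text, b_text):
    """Return (section, key, value_in_a, value_in_b) for every differing setting."""
    a, b = _load(a_text), _load(b_text)
    diffs = []
    for section in sorted(set(a.sections()) | set(b.sections())):
        a_items = dict(a.items(section)) if a.has_section(section) else {}
        b_items = dict(b.items(section)) if b.has_section(section) else {}
        for key in sorted(set(a_items) | set(b_items)):
            if a_items.get(key) != b_items.get(key):
                diffs.append((section, key, a_items.get(key), b_items.get(key)))
    return diffs


# Shortened, illustrative versions of the two configs
working = "[cluster default]\nscheduler = torque\nbase_os = centos7\n"
failing = working + "ec2_iam_role = ParallelClusterAdmin\n"

print(diff_configs(working, failing))
# [('cluster default', 'ec2_iam_role', None, 'ParallelClusterAdmin')]
```

If the diff shows only the `ec2_iam_role` entry, the role itself (its permissions or how it is set up) is the place to look next, rather than anything else in the config.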

github-actions[bot] commented 3 years ago

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.