Open MustaphaU opened 1 month ago
I was able to get to the point where the head node and compute nodes were created by setting ElasticIp and AssignPublicIp to true in the configuration file. However, the cluster creation still fails.
Not resolved yet. @hoai
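For anyone following along, here is a rough sketch of where those two settings usually sit, assuming the standard ParallelCluster 3 config layout (ElasticIp under HeadNode / Networking, AssignPublicIp under each Slurm queue's Networking). The dict below is a hypothetical stand-in for the parsed YAML, not the actual config from this issue, and `public_ip_report` is a helper made up for illustration.

```python
# Hypothetical stand-in for a parsed cluster config (assumed ParallelCluster 3 layout).
# Subnet IDs and the queue name are placeholders, not values from this issue.
config = {
    "HeadNode": {
        "Networking": {"SubnetId": "subnet-PLACEHOLDER", "ElasticIp": True},
    },
    "Scheduling": {
        "Scheduler": "slurm",
        "SlurmQueues": [
            {
                "Name": "queue1",
                "Networking": {
                    "SubnetIds": ["subnet-PLACEHOLDER"],
                    "AssignPublicIp": True,
                },
            },
        ],
    },
}


def public_ip_report(cfg):
    """Return (head_node_has_elastic_ip, names_of_queues_without_public_ip)."""
    head = cfg.get("HeadNode", {}).get("Networking", {}).get("ElasticIp", False)
    queues = cfg.get("Scheduling", {}).get("SlurmQueues", [])
    missing = [
        q.get("Name")
        for q in queues
        if not q.get("Networking", {}).get("AssignPublicIp", False)
    ]
    return bool(head), missing


head_ok, queues_missing = public_ip_report(config)
print("Head node ElasticIp set:", head_ok)
print("Queues without AssignPublicIp:", queues_missing)
```

This only checks that the flags are present; it says nothing about whether the subnets themselves allow public addressing.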
@MustaphaU Dear Mustapha,
I found a detailed error: "WaitCondition received failed message: 'Failed to mount FSX. Please check /var/log/chef-client.log in the head node, or check the chef-client.log in CloudWatch logs. Please refer to https://docs.aws.amazon.com/parallelcluster/latest/ug/troubleshooting-v3.html for more details.' for uniqueId: i-0c3db5db03b82b28c". I will try to fix it today and let you know whether it works.
Thank you. Regards.
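In case it helps with triage, below is a small sketch that pulls the useful pieces (failure reason, log path, instance ID) out of a WaitCondition message like the one quoted above. The function name and the regular expressions are my own assumptions based on this single message, so other failure messages may not match.

```python
import re

# The failure message quoted in this thread, copied verbatim.
MESSAGE = (
    "WaitCondition received failed message: 'Failed to mount FSX. "
    "Please check /var/log/chef-client.log in the head node, or check the "
    "chef-client.log in CloudWatch logs. Please refer to "
    "https://docs.aws.amazon.com/parallelcluster/latest/ug/troubleshooting-v3.html "
    "for more details.' for uniqueId: i-0c3db5db03b82b28c"
)


def parse_wait_condition_failure(message):
    """Extract reason, log path, and instance ID; missing fields become None."""
    # First sentence inside the quoted failure message.
    reason = re.search(r"failed message: '([^.']+)\.", message)
    # First absolute log path under /var/log.
    log_path = re.search(r"(/var/log/\S+\.log)", message)
    # EC2 instance ID reported as the uniqueId.
    instance = re.search(r"uniqueId: (i-[0-9a-f]+)", message)
    return {
        "reason": reason.group(1) if reason else None,
        "log_path": log_path.group(1) if log_path else None,
        "instance_id": instance.group(1) if instance else None,
    }


info = parse_wait_condition_failure(MESSAGE)
print(info)
```

The instance ID tells you which node to inspect, and the log path is the file to read on it (or the matching stream in CloudWatch).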
Thanks @hoai. Looking forward to it.
Hi, thanks for creating and sharing this incredible project!
For some reason, I am unable to create the ParallelCluster.
To troubleshoot, I reviewed the event logs in CloudFormation, and it looks like the failure is due to the HeadNodeWaitCondition.
Here is my custom command to create the config file (some details are redacted):
...and to create the cluster:
I also checked and noticed that I do not currently have any quota for the g5.2xlarge instance type used for compute nodes, as specified in the cluster-config-template.yaml, so I have requested an increase in the meantime.
Any idea how I can resolve this?
Edit:
I have added some relevant parts of the logs from the head node:
Edit 2: The quota increase for the compute node resources (i.e. g5.2xlarge) was approved, but the error persists.