aws-samples / aws-parallelcluster-monitoring

Monitoring Dashboard for AWS ParallelCluster
MIT No Attribution
31 stars 23 forks source link

Docs incorrectly suggest comma separated post_install_args with commas #21

Closed alfred-stokespace closed 1 year ago

alfred-stokespace commented 2 years ago

The docs suggest this...

[cluster yourcluster]
...
post_install = https://raw.githubusercontent.com/aws-samples/aws-parallelcluster-monitoring/main/post-install.sh
post_install_args = https://github.com/aws-samples/aws-parallelcluster-monitoring/tarball/main,aws-parallelcluster-monitoring,install-monitoring.sh
additional_iam_policies = arn:aws:iam::aws:policy/CloudWatchFullAccess,arn:aws:iam::aws:policy/AWSPriceListServiceFullAccess,arn:aws:iam::aws:policy/AmazonSSMFullAccess,arn:aws:iam::aws:policy/AWSCloudFormationReadOnlyAccess
tags = {"Grafana" : "true"}

But that fails due to the scripts incorrect assumption that

post_install_args = https://github.com/aws-samples/aws-parallelcluster-monitoring/tarball/main,aws-parallelcluster-monitoring,install-monitoring.sh

is a bash array initialization. Which it's not. So the script then fails at this point ...


monitoring_url=${cfn_postinstall_args[0]}
monitoring_dir_name=${cfn_postinstall_args[1]}               <===== THIS IS EMPTY STRING
monitoring_tarball="${monitoring_dir_name}.tar.gz"        <===== THIS IS ".tar.gz"
setup_command=${cfn_postinstall_args[2]}
monitoring_home="/home/${cfn_cluster_user}/${monitoring_dir_name}"

case ${cfn_node_type} in
    MasterServer)
        wget ${monitoring_url} -O ${monitoring_tarball}     <===== THIS TRYS TO SAVE ".tar.gz"
        mkdir -p ${monitoring_home}                                   <===== THIS IS IMPACTED BY EMPTY STRING
        tar xvf ${monitoring_tarball} -C ${monitoring_home} --strip-components 1  <==== THIS JUST FAILS
    ;;
    ComputeFleet)

    ;;
esac

What I did to fix the issue was hack the script up to change the post_install_args to be like this...

cfn_postinstall_args=("https://github.com/aws-samples/aws-parallelcluster-monitoring/tarball/main" "aws-parallelcluster-monitoring" "install-monitoring.sh")

And I was able to get an exit code of 0 :)

murugappanp commented 2 years ago

I am having the same problem..can you please share your V2 config file..I badly need it

alfred-stokespace commented 2 years ago

I am having the same problem..can you please share your V2 config file..I badly need it

@murugappanp the work around is very simple, replace commas with spaces and double-quotes are optional depending on the value, so for example... if the docs tell you to do this

post_install_args = https://github.com/aws-samples/aws-parallelcluster-monitoring/tarball/main,aws-parallelcluster-monitoring,install-monitoring.sh

you ignore that for reasons I stated above and instead do this...

post_install_args = ("https://github.com/aws-samples/aws-parallelcluster-monitoring/tarball/main" "aws-parallelcluster-monitoring" "install-monitoring.sh")

The goal there being to format the value of post_install_args as valid bash array initialization. The double-quotes might be optional, check out google for examples of proper bash arrays like ... https://opensource.com/article/18/5/you-dont-know-bash-intro-bash-arrays

murugappanp commented 2 years ago

Thanks Alfred..I just tried the same and still the cluster does not build...The error is below and I have attached my config. as well my version is 2.11.5

Cluster creation failed. Failed events:

[aws] aws_region_name = us-east-1

[global] cluster_template = default update_check = false sanity_check = true

[vpc public] vpc_id = xxx master_subnet_id = xxx additional_sg = xxx

[cluster default] key_name = mykey base_os = alinux2 scheduler = slurm master_instance_type = t2.micro s3_read_write_resource = * vpc_settings = public ebs_settings = myebs queue_settings = compute post_install = https://raw.githubusercontent.com/aws-samples/aws-parallelcluster-monitoring/main/post-install.sh post_install_args = ("https://github.com/aws-samples/aws-parallelcluster-monitoring/tarball/main" "aws-parallelcluster-monitoring" "install-monitoring.sh") additional_iam_policies = arn:aws:iam::aws:policy/CloudWatchFullAccess,arn:aws:iam::aws:policy/AWSPriceListServiceFullAccess,arn:aws:iam::aws:policy/AmazonSSMFullAccess,arn:aws:iam::aws:policy/AWSCloudFormationReadOnlyAccess tags = {“Grafana” : “true”}

[queue compute] compute_resource_settings = default disable_hyperthreading = true placement_group = DYNAMIC

[compute_resource default] instance_type = c5.large min_count = 0 max_count = 4

[ebs myebs] shared_dir = /shared volume_type = gp2 volume_size = 20

[aliases] ssh = ssh {CFN_USER}@{MASTER_IP} {ARGS}

murugappanp commented 2 years ago

Thanks I am able to build the cluster with your input

sean-smith commented 1 year ago

Closing this