aws / aws-parallelcluster

AWS ParallelCluster is an AWS supported Open Source cluster management tool to deploy and manage HPC clusters in the AWS cloud.
https://github.com/aws/aws-parallelcluster
Apache License 2.0

can not ssh to worker nodes from master nodes using cfncluster to build HPC cluster #284

Closed miker2746 closed 6 years ago

miker2746 commented 6 years ago

Hello,

I started to learn how to use cfncluster to set up an HPC cluster in AWS. After I configured the cfncluster config file, I used it to build the cluster. It was set up successfully, but after I connected to the master node, I found a few problems with my cluster.

  1. I can't ssh to the worker nodes from my master node.
  2. The 'shared_dir' of the master node wasn't shared with the worker nodes.

Could someone please tell me how to solve these two problems? Thank you very much.

Michael

rajachan commented 6 years ago

Michael - Given the two symptoms, I wonder if the cfncluster-cookbooks failed to run successfully. Can you share your cfncluster config file, /var/log/cloud-init.log, and /var/log/cfn-init.log files?

miker2746 commented 6 years ago

Hi Rajachan,

  1. The following is the /var/log/cloud-init.log file of my master node.

    2018-01-08 07:47:05,599 - util.py[DEBUG]: Cloud-init v. 17.1 running 'init-local' at Mon, 08 Jan 2018 07:47:05 +0000. Up 27.54 seconds.
    2018-01-08 07:47:05,599 - main.py[DEBUG]: No kernel command line url found.
    2018-01-08 07:47:05,599 - main.py[DEBUG]: Closing stdin.
    2018-01-08 07:47:05,601 - util.py[DEBUG]: Writing to /var/log/cloud-init.log - ab: [644] 0 bytes
    2018-01-08 07:47:05,602 - util.py[DEBUG]: Changing the ownership of /var/log/cloud-init.log to 104:4
    2018-01-08 07:47:05,602 - util.py[DEBUG]: Attempting to remove /var/lib/cloud/instance/boot-finished
    2018-01-08 07:47:05,602 - util.py[DEBUG]: Attempting to remove /var/lib/cloud/data/no-net
    2018-01-08 07:47:05,603 - handlers.py[DEBUG]: start: init-local/check-cache: attempting to read from cache [check]
    2018-01-08 07:47:05,603 - util.py[DEBUG]: Reading from /var/lib/cloud/instance/obj.pkl (quiet=False)
    2018-01-08 07:47:05,603 - stages.py[DEBUG]: no cache found
    2018-01-08 07:47:05,603 - handlers.py[DEBUG]: finish: init-local/check-cache: SUCCESS: no cache found
    2018-01-08 07:47:05,603 - util.py[DEBUG]: Attempting to remove /var/lib/cloud/instance
    2018-01-08 07:47:05,606 - stages.py[DEBUG]: Using distro class <class 'cloudinit.distros.ubuntu.Distro'>

  2. I can't open the /var/log/cfn-init.log file. Every time I open this file, the instance freezes and I have to reconnect to the instance.

  3. The config file is as follows. I only set the key name and the VPC settings; the rest of the settings are the defaults.

    
    [cluster default]
    # Name of an existing EC2 KeyPair to enable SSH access to the instances.
    key_name = cfncluster-keypair1
    # Override path to cloudformation in S3
    # (defaults to https://s3.amazonaws.com/cfncluster-<aws_region_name>/templates/cfncluster-<version>.cfn.json)
    #template_url = https://s3.amazonaws.com/cfncluster-us-east-1/templates/cfncluster.cfn.json
    # Cluster Server EC2 instance type
    # (defaults to t2.micro for default template)
    #compute_instance_type = t2.micro
    # Master Server EC2 instance type
    # (defaults to t2.micro for default template
    #master_instance_type = t2.micro
    # Inital number of EC2 instances to launch as compute nodes in the cluster.
    # (defaults to 2 for default template)
    #initial_queue_size = 0
    # Maximum number of EC2 instances that can be launched in the cluster.
    # (defaults to 10 for the default template)
    #max_queue_size = 1
    # Boolean flag to set autoscaling group to maintain initial size and scale back
    # (defaults to false for the default template)
    #maintain_initial_size = false
    # Cluster scheduler
    # (defaults to sge for the default template)
    #scheduler = sge
    # Type of cluster to launch i.e. ondemand or spot
    # (defaults to ondemand for the default template)
    #cluster_type = ondemand
    # Spot price for the ComputeFleet
    #spot_price = 0.00
    # ID of a Custom AMI, to use instead of published AMI's
    #custom_ami = ami-9802b1e1
    #custom_ami = ami-ff8d1886
    #custom_ami = ami-62fa6e1b
    #custom_ami = ami-898b1ff0

    # Specify S3 resource which cfncluster nodes will be granted read-only access
    # (defaults to NONE for the default template)
    s3_read_resource = NONE
    # Specify S3 resource which cfncluster nodes will be granted read-write access
    # (defaults to NONE for the default template)
    s3_read_write_resource = NONE
    # URL to a preinstall script. This is executed before any of the boot_as_* scripts are run
    # (defaults to NONE for the default template)
    pre_install = NONE
    # Arguments to be passed to preinstall script
    # (defaults to NONE for the default template)
    pre_install_args = NONE
    # URL to a postinstall script. This is executed after any of the boot_as_* scripts are run
    # (defaults to NONE for the default template)
    post_install = NONE
    # Arguments to be passed to postinstall script
    # (defaults to NONE for the default template)
    post_install_args = NONE
    # HTTP(S) proxy server, typically http://x.x.x.x:8080
    # (defaults to NONE for the default template)
    proxy_server = NONE
    # Cluster placement group. This placement group must already exist.
    # (defaults to NONE for the default template)
    placement_group = cfncluster-pg-1
    # Cluster placment logic. This enables the whole cluster or only compute to use the placement group
    # (defaults to cluster in the default template)
    placement = cluster
    # Path/mountpoint for ephemeral drives
    # (defaults to /scratch in the default template)
    ephemeral_dir = /scratch
    # Path/mountpoint for shared EBS volume
    # (defaults to /shared in the default template)
    shared_dir = /shared
    # Encrypted ephemeral drives. In-memory keys, non-recoverable.
    # (defaults to false in default template)
    encrypted_ephemeral = false
    # MasterServer root volume size in GB. (AMI must support growroot)
    # (defaults to 10 in default template)
    master_root_volume_size = 10
    # ComputeFleet root volume size in GB. (AMI must support growroot)
    # (defaults to 10 in default template)
    compute_root_volume_size = 10
    # OS type used in the cluster
    # (defaults to alinux in the default template)
    base_os = ubuntu
    # CloudWatch Logs region
    # (defaults to NONE in the default template)
    cwl_region = NONE
    # CloudWatch Logs Log Group name
    # (defaults to NONE in the default template)
    cwl_log_group = NONE
    # Existing EC2 IAM role to be assosiated with the EC2 instances
    # (defaults to NONE in the default template)
    ec2_iam_role = NONE
    # Extra Json to be merged with the dna.json used by Chef
    # (defaults to {} in the default template)
    extra_json = {}
    # Additional CloudFormation template to launch with the cluster
    additional_cfn_template = NONE
    # Settings section relating to VPC to be used
    vpc_settings = mycluster1-vpc
    # Settings section relating to EBS volume
    ebs_settings = fds-test-volume-2
    # Settings section relation to scaling
    scaling_settings = custom


and my vpc_setting is as follows.

    [vpc mycluster1-vpc]
    master_subnet_id = subnet-4ba6f52c
    vpc_id = vpc-e5446782
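The cluster section above refers to three named sections (`vpc_settings`, `ebs_settings`, `scaling_settings`), but only the `[vpc mycluster1-vpc]` section is quoted in this thread. Here is a minimal Python sketch for checking such references. It assumes that a key `<kind>_settings = <name>` is resolved to a section headed `[<kind> <name>]`, as the `[vpc mycluster1-vpc]` header suggests. The function name `missing_sections` and the trimmed `SAMPLE` text are hypothetical and not part of cfncluster.

```python
import configparser


def missing_sections(config_text, cluster_section="cluster default"):
    """Return section names referenced by *_settings keys but not defined."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    missing = []
    for key, value in parser[cluster_section].items():
        if not key.endswith("_settings"):
            continue
        kind = key[: -len("_settings")]       # "vpc" from "vpc_settings"
        section = f"{kind} {value.strip()}"    # assumed naming: "vpc mycluster1-vpc"
        if not parser.has_section(section):
            missing.append(section)
    return missing


# Trimmed-down copy of the settings quoted in this thread (illustration only).
SAMPLE = """
[cluster default]
key_name = cfncluster-keypair1
vpc_settings = mycluster1-vpc
ebs_settings = fds-test-volume-2
scaling_settings = custom

[vpc mycluster1-vpc]
master_subnet_id = subnet-4ba6f52c
vpc_id = vpc-e5446782
"""

print(missing_sections(SAMPLE))
```

On the real file you would pass the whole config text instead of `SAMPLE`. An empty list only means that every referenced section header exists; it says nothing about whether the values inside are valid.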

miker2746 commented 6 years ago

Hi, I updated cfncluster and found out what I was doing wrong. I should use the private IP addresses to ssh to the worker nodes, or I should add them to the /etc/hosts file with some custom host names.

Thank you for answering my question.

best regards, Michael

rajachan commented 6 years ago

Michael - You need not configure anything manually to SSH from the master into the compute nodes. The Chef cookbook already does the heavy lifting for you. It is hard to say what exactly happened without looking at the cfn-init log. I don't see a correlation between opening the log file and your instance freezing up; it might have been something transient. See if you can at least get the last hundred or so lines using tail (`tail -n 100 /var/log/cfn-init.log`); that will be really useful in understanding the problem.
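If a pager or editor is not an option, the same "last N lines" idea can be done from Python. This is a minimal sketch, assuming the log is plain text with one entry per line. `tail_lines` and `flagged` are hypothetical helper names, and the level names in `flagged` are only a guess at what is worth reading first, not a list taken from cfn-init.

```python
from collections import deque


def tail_lines(path, count=100):
    """Return the last `count` lines of a text file, reading it line by line."""
    with open(path, "r", errors="replace") as handle:
        # A bounded deque keeps only the newest `count` lines in memory.
        return list(deque(handle, maxlen=count))


def flagged(lines, levels=("ERROR", "WARN", "FATAL")):
    """Keep only the lines that mention one of the given level names."""
    return [line.rstrip("\n") for line in lines
            if any(level in line for level in levels)]
```

For example, `flagged(tail_lines("/var/log/cfn-init.log"))` would list the warning and error lines among the last 100 entries. This does not explain the freeze; it only avoids loading the whole file at once.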

miker2746 commented 6 years ago

Hi rajachan,

I launched a new cluster, but NFS still failed to set up. Here is the /var/log/cfn-init.log file of the new cluster.

`2018-01-09 23:08:18,966 [DEBUG] CloudFormation client initialized with endpoint https://cloudformation.eu-west-1.amazonaws.com 2018-01-09 23:08:18,966 [DEBUG] Describing resource MasterServer in stack cfncluster-mycluster2 2018-01-09 23:08:19,081 [INFO] -----------------------Starting build----------------------- 2018-01-09 23:08:19,082 [DEBUG] Not setting a reboot trigger as scheduling support is not available 2018-01-09 23:08:19,083 [INFO] Running configSets: default 2018-01-09 23:08:19,084 [INFO] Running configSet default 2018-01-09 23:08:19,085 [INFO] Running config deployConfigFiles 2018-01-09 23:08:19,086 [DEBUG] No packages specified 2018-01-09 23:08:19,086 [DEBUG] No groups specified 2018-01-09 23:08:19,086 [DEBUG] No users specified 2018-01-09 23:08:19,086 [DEBUG] No sources specified 2018-01-09 23:08:19,086 [DEBUG] Writing content to /etc/chef/client.rb 2018-01-09 23:08:19,086 [DEBUG] Setting mode for /etc/chef/client.rb to 000644 2018-01-09 23:08:19,087 [DEBUG] Setting owner 0 and group 0 for /etc/chef/client.rb 2018-01-09 23:08:19,087 [DEBUG] Writing content to /tmp/dna.json 2018-01-09 23:08:19,087 [DEBUG] Content will be serialized as a JSON structure 2018-01-09 23:08:19,087 [DEBUG] Setting mode for /tmp/dna.json to 000644 2018-01-09 23:08:19,087 [DEBUG] Setting owner 0 and group 0 for /tmp/dna.json 2018-01-09 23:08:19,087 [DEBUG] Writing content to /tmp/extra.json 2018-01-09 23:08:19,087 [DEBUG] Setting mode for /tmp/extra.json to 000644 2018-01-09 23:08:19,088 [DEBUG] Setting owner 0 and group 0 for /tmp/extra.json 2018-01-09 23:08:19,088 [DEBUG] Running command jq 2018-01-09 23:08:19,088 [DEBUG] No test for command jq 2018-01-09 23:08:19,096 [INFO] Command jq succeeded 2018-01-09 23:08:19,096 [DEBUG] Command jq output: 2018-01-09 23:08:19,096 [DEBUG] Running command mkdir 2018-01-09 23:08:19,097 [DEBUG] No test for command mkdir 2018-01-09 23:08:19,099 [INFO] Command mkdir succeeded 2018-01-09 23:08:19,099 [DEBUG] Command mkdir output: 2018-01-09 
23:08:19,100 [DEBUG] Running command touch 2018-01-09 23:08:19,100 [DEBUG] No test for command touch 2018-01-09 23:08:19,102 [INFO] Command touch succeeded 2018-01-09 23:08:19,102 [DEBUG] Command touch output: 2018-01-09 23:08:19,102 [DEBUG] No services specified 2018-01-09 23:08:19,104 [INFO] Running config getCookbooks 2018-01-09 23:08:19,105 [DEBUG] No packages specified 2018-01-09 23:08:19,105 [DEBUG] No groups specified 2018-01-09 23:08:19,105 [DEBUG] No users specified 2018-01-09 23:08:19,105 [DEBUG] No sources specified 2018-01-09 23:08:19,105 [DEBUG] No files specified 2018-01-09 23:08:19,105 [DEBUG] Running command berk 2018-01-09 23:08:19,105 [DEBUG] No test for command berk 2018-01-09 23:08:52,045 [INFO] Command berk succeeded 2018-01-09 23:08:52,045 [DEBUG] Command berk output: Resolving cookbook dependencies... Fetching 'cfncluster' from source at . Fetching cookbook index from https://supermarket.getchef.com... Installing apt (6.1.4) from https://supermarket.getchef.com/ ([opscode] https://supermarket.chef.io:443/api/v1) Installing build-essential (8.0.4) from https://supermarket.getchef.com/ ([opscode] https://supermarket.chef.io:443/api/v1) Using cfncluster (1.4.0) from source at . 
Installing compat_resource (12.19.0) from https://supermarket.getchef.com/ ([opscode] https://supermarket.chef.io:443/api/v1) Installing hostname (0.4.2) from https://supermarket.getchef.com/ ([opscode] https://supermarket.chef.io:443/api/v1) Installing hostsfile (3.0.1) from https://supermarket.getchef.com/ ([opscode] https://supermarket.chef.io:443/api/v1) Installing iptables (4.3.1) from https://supermarket.getchef.com/ ([opscode] https://supermarket.chef.io:443/api/v1) Installing line (0.6.3) from https://supermarket.getchef.com/ ([opscode] https://supermarket.chef.io:443/api/v1) Installing mingw (2.0.1) from https://supermarket.getchef.com/ ([opscode] https://supermarket.chef.io:443/api/v1) Installing ohai (5.2.0) from https://supermarket.getchef.com/ ([opscode] https://supermarket.chef.io:443/api/v1) Installing openssh (2.4.1) from https://supermarket.getchef.com/ ([opscode] https://supermarket.chef.io:443/api/v1) Installing poise (2.8.1) from https://supermarket.getchef.com/ ([opscode] https://supermarket.chef.io:443/api/v1) Installing poise-archive (1.5.0) from https://supermarket.getchef.com/ ([opscode] https://supermarket.chef.io:443/api/v1) Installing poise-languages (2.1.1) from https://supermarket.getchef.com/ ([opscode] https://supermarket.chef.io:443/api/v1) Installing poise-python (1.6.0) from https://supermarket.getchef.com/ ([opscode] https://supermarket.chef.io:443/api/v1) Installing seven_zip (2.0.2) from https://supermarket.getchef.com/ ([opscode] https://supermarket.chef.io:443/api/v1) Installing sysctl (0.10.2) from https://supermarket.getchef.com/ ([opscode] https://supermarket.chef.io:443/api/v1) Installing tar (2.0.0) from https://supermarket.getchef.com/ ([opscode] https://supermarket.chef.io:443/api/v1) Installing windows (3.4.3) from https://supermarket.getchef.com/ ([opscode] https://supermarket.chef.io:443/api/v1) Installing yum (5.0.1) from https://supermarket.getchef.com/ ([opscode] https://supermarket.chef.io:443/api/v1) Installing 
yum-epel (2.1.2) from https://supermarket.getchef.com/ ([opscode] https://supermarket.chef.io:443/api/v1) Vendoring apt (6.1.4) to /etc/chef/cookbooks/apt Vendoring build-essential (8.0.4) to /etc/chef/cookbooks/build-essential Vendoring cfncluster (1.4.0) to /etc/chef/cookbooks/cfncluster Vendoring compat_resource (12.19.0) to /etc/chef/cookbooks/compat_resource Vendoring hostname (0.4.2) to /etc/chef/cookbooks/hostname Vendoring hostsfile (3.0.1) to /etc/chef/cookbooks/hostsfile Vendoring iptables (4.3.1) to /etc/chef/cookbooks/iptables Vendoring line (0.6.3) to /etc/chef/cookbooks/line Vendoring mingw (2.0.1) to /etc/chef/cookbooks/mingw Vendoring ohai (5.2.0) to /etc/chef/cookbooks/ohai Vendoring openssh (2.4.1) to /etc/chef/cookbooks/openssh Vendoring poise (2.8.1) to /etc/chef/cookbooks/poise Vendoring poise-archive (1.5.0) to /etc/chef/cookbooks/poise-archive Vendoring poise-languages (2.1.1) to /etc/chef/cookbooks/poise-languages Vendoring poise-python (1.6.0) to /etc/chef/cookbooks/poise-python Vendoring seven_zip (2.0.2) to /etc/chef/cookbooks/seven_zip Vendoring sysctl (0.10.2) to /etc/chef/cookbooks/sysctl Vendoring tar (2.0.0) to /etc/chef/cookbooks/tar Vendoring windows (3.4.3) to /etc/chef/cookbooks/windows Vendoring yum (5.0.1) to /etc/chef/cookbooks/yum Vendoring yum-epel (2.1.2) to /etc/chef/cookbooks/yum-epel

2018-01-09 23:08:52,046 [DEBUG] No services specified 2018-01-09 23:08:52,048 [INFO] Running config chefPrepEnv 2018-01-09 23:08:52,048 [DEBUG] No packages specified 2018-01-09 23:08:52,048 [DEBUG] No groups specified 2018-01-09 23:08:52,048 [DEBUG] No users specified 2018-01-09 23:08:52,048 [DEBUG] No sources specified 2018-01-09 23:08:52,048 [DEBUG] No files specified 2018-01-09 23:08:52,048 [DEBUG] Running command chef 2018-01-09 23:08:52,048 [DEBUG] No test for command chef 2018-01-09 23:08:58,022 [INFO] Command chef succeeded 2018-01-09 23:08:58,022 [DEBUG] Command chef output: [2018-01-09T23:08:53+00:00] INFO: Forking chef instance to converge... Starting Chef Client, version 12.19.36 [2018-01-09T23:08:53+00:00] INFO: Chef 12.19.36 [2018-01-09T23:08:53+00:00] INFO: Platform: x86_64-linux [2018-01-09T23:08:53+00:00] INFO: Chef-client pid: 2046 [2018-01-09T23:08:55+00:00] INFO: HTTP Request Returned 404 Not Found: Object not found: chefzero://localhost:8889/nodes/ip-10-0-0-68.eu-west-1.compute.internal [2018-01-09T23:08:55+00:00] INFO: Setting the run_list to recipe[cfncluster::sge_config] from CLI options [2018-01-09T23:08:55+00:00] WARN: Run List override has been provided. [2018-01-09T23:08:55+00:00] WARN: Original Run List: [recipe[cfncluster::sge_config]] [2018-01-09T23:08:55+00:00] WARN: Overridden Run List: [recipe[cfncluster::_prep_env]] [2018-01-09T23:08:55+00:00] INFO: Run List is [recipe[cfncluster::_prep_env]] [2018-01-09T23:08:55+00:00] INFO: Run List expands to [cfncluster::_prep_env] [2018-01-09T23:08:55+00:00] INFO: Starting Chef Run for ip-10-0-0-68.eu-west-1.compute.internal [2018-01-09T23:08:55+00:00] INFO: Running start handlers [2018-01-09T23:08:55+00:00] INFO: Start handlers complete. 
[2018-01-09T23:08:55+00:00] INFO: HTTP Request Returned 404 Not Found: Object not found: resolving cookbooks for run list: ["cfncluster::_prep_env"] [2018-01-09T23:08:56+00:00] INFO: Loading cookbooks [cfncluster@1.4.0, build-essential@8.0.4, poise-python@1.6.0, tar@2.0.0, selinux@2.0.3, nfs@2.4.1, sysctl@0.10.2, yum@5.0.1, yum-epel@2.1.2, openssh@2.4.1, apt@6.1.4, hostname@0.4.2, line@0.6.3, seven_zip@2.0.2, mingw@2.0.1, poise@2.8.1, poise-languages@2.1.1, ohai@5.2.0, compat_resource@12.19.0, iptables@4.3.1, hostsfile@3.0.1, windows@3.4.3, poise-archive@1.5.0] [2018-01-09T23:08:56+00:00] INFO: Skipping removal of obsoleted cookbooks from the cache Synchronizing Cookbooks: [2018-01-09T23:08:56+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/_compute_base_config.rb in the cache. [2018-01-09T23:08:56+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/_compute_custom_config.rb in the cache. [2018-01-09T23:08:56+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/_compute_sge_config.rb in the cache. [2018-01-09T23:08:56+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/_compute_slurm_config.rb in the cache. [2018-01-09T23:08:56+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/_compute_torque_config.rb in the cache. [2018-01-09T23:08:56+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/_ganglia_install.rb in the cache. [2018-01-09T23:08:56+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/_master_base_config.rb in the cache. [2018-01-09T23:08:56+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/_master_custom_config.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/_master_slurm_config.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/_master_torque_config.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/_nvidia_install.rb in the cache. 
[2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/_setup_python.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/_undo_base_config.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/_undo_master_base_config.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/_update_packages.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/base_config.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/_ec2_udev_rules.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/base_install.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/_master_sge_config.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/custom_install.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/default.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/image_prep.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/sge_config.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/sge_install.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/_prep_env.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/slurm_install.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/torque_config.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/libraries/helpers.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/custom_config.rb in the cache. 
[2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/amazon/supervisord-init in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/centos-7/ganglia-webfrontend.conf in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/ami_cleanup.sh in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/slurm_config.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/attachVolume.py in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/munge_install.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/cfncluster-ebsnvme-id in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/recipes/torque_install.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/compute_ready in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/CfnCluster-License-README.txt in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/ec2-volid.rules in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/ec2_dev_2_volid.py in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/blacklist-nouveau.conf in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/setup-ephemeral-drives.sh in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/sge_inst.conf in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/slurm-init in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/munge-init in the cache. 
[2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/slurmctld.service in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/supervisord-init in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/ganglia-webfrontend.conf in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/supervisord.conf in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/torque.setup in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/slurmd.service in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/attributes/default.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/ubuntu-14.04/ec2blkdev-init in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/ubuntu-14.04/slurm-init in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/fetch_and_run in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/configure-pat.sh in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/ubuntu-16.04/supervisord-init in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/slurm.sh in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/templates/default/99-cfncluster-user-tty.erb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/templates/default/cfncluster_supervisord.conf.erb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/ec2blkdev-init in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/ubuntu-16.04/ec2blkdev-init in the cache. 
[2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/templates/default/gmond.conf.erb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/jq-1.4 in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/slurm.csh in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/templates/default/publish_pending.sge.erb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/templates/default/nodewatcher.cfg.erb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/default/torque.sh in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/templates/default/publish_pending.torque.erb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/templates/default/munge.key.erb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/templates/default/slurm.conf.erb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/templates/amazon/cfncluster_supervisord.conf.erb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/templates/default/gmetad.conf.erb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/templates/default/torque.conf.erb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/templates/default/publish_pending.pbspro.erb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/templates/default/torque.config.erb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/files/ubuntu-14.04/supervisord-init in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/templates/default/sqswatcher.cfg.erb in the cache. 
[2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/templates/default/torque.setup.erb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/templates/ubuntu/gmond.conf.erb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/packer_update_centos_base.json in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/templates/default/lsb.hosts.erb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/templates/ubuntu/cfncluster_supervisord.conf.erb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/packer_variables.json in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/templates/default/publish_pending.slurm.erb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/templates/default/torque.server_name.erb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/LICENSE.txt in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/templates/default/cfnconfig.erb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/README.md in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/build_env_setup.sh in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/centos-upgrade-second-stage.sh in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/Gemfile in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/centos6.elrepo.repo in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/packer_centos7.json in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/chefignore in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/metadata.json in the cache. 
[2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/packer_ubuntu1604.json in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/centos-upgrade-first-stage.sh in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/packer_ubuntu1404.json in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/build_ami.sh in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/.rubocop.yml in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/packer_centos6.json in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/NOTICE.txt in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/.kitchen.yml in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/packer_alinux.json in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/Rakefile in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/build-essential/resources/build_essential.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/build-essential/resources/xcode_command_line_tools.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/.kitchen.cloud.yml in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/build-essential/recipes/default.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/build-essential/README.md in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/build-essential/CONTRIBUTING.md in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/cfncluster/CHANGELOG.md in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/build-essential/metadata.json in the cache.

[2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/build-essential/CHANGELOG.md in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/poise-python/recipes/default.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/poise-python/libraries/default.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/poise-python/attributes/default.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/poise-python/files/halite_gem/poise_python/cheftie.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/poise-python/files/halite_gem/poise_python/error.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/build-essential/MAINTAINERS.md in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/build-essential/.foodcritic in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/poise-python/files/halite_gem/poise_python/python_command_mixin.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/poise-python/files/halite_gem/poise_python/python_providers/dummy.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/poise-python/files/halite_gem/poise_python/python_providers/msi.rb in the cache. [2018-01-09T23:08:57+00:00] INFO: Storing updated cookbooks/poise-python/files/halite_gem/poise_python/python_providers/portable_pypy.rb in the cache.

Running handlers: [2018-01-09T23:08:57+00:00] INFO: Running report handlers Running handlers complete [2018-01-09T23:08:57+00:00] INFO: Report handlers complete Chef Client finished, 4/7 resources updated in 04 seconds

2018-01-09 23:08:58,024 [DEBUG] No services specified
2018-01-09 23:08:58,025 [INFO] Running config shellRunPreInstall
2018-01-09 23:08:58,026 [DEBUG] No packages specified
2018-01-09 23:08:58,026 [DEBUG] No groups specified
2018-01-09 23:08:58,026 [DEBUG] No users specified
2018-01-09 23:08:58,026 [DEBUG] No sources specified
2018-01-09 23:08:58,026 [DEBUG] No files specified
2018-01-09 23:08:58,026 [DEBUG] Running command runpreinstall
2018-01-09 23:08:58,026 [DEBUG] No test for command runpreinstall
2018-01-09 23:08:58,047 [INFO] Command runpreinstall succeeded
2018-01-09 23:08:58,047 [DEBUG] Command runpreinstall output:
2018-01-09 23:08:58,047 [DEBUG] No services specified
2018-01-09 23:08:58,048 [INFO] Running config chefConfig
2018-01-09 23:08:58,049 [DEBUG] No packages specified
2018-01-09 23:08:58,049 [DEBUG] No groups specified
2018-01-09 23:08:58,049 [DEBUG] No users specified
2018-01-09 23:08:58,049 [DEBUG] No sources specified
2018-01-09 23:08:58,049 [DEBUG] No files specified
2018-01-09 23:08:58,049 [DEBUG] Running command chef
2018-01-09 23:08:58,049 [DEBUG] No test for command chef
2018-01-09 23:09:37,678 [INFO] Command chef succeeded
2018-01-09 23:09:37,679 [DEBUG] Command chef output: [2018-01-09T23:08:59+00:00] INFO: Forking chef instance to converge...
Starting Chef Client, version 12.19.36
[2018-01-09T23:08:59+00:00] INFO: Chef 12.19.36
[2018-01-09T23:08:59+00:00] INFO: Platform: x86_64-linux
[2018-01-09T23:08:59+00:00] INFO: Chef-client pid: 2372
[2018-01-09T23:09:00+00:00] INFO: Setting the run_list to recipe[cfncluster::sge_config] from CLI options
[2018-01-09T23:09:00+00:00] INFO: Run List is [recipe[cfncluster::sge_config]]
[2018-01-09T23:09:00+00:00] INFO: Run List expands to [cfncluster::sge_config]
[2018-01-09T23:09:00+00:00] INFO: Starting Chef Run for ip-10-0-0-68.eu-west-1.compute.internal
[2018-01-09T23:09:00+00:00] INFO: Running start handlers
[2018-01-09T23:09:00+00:00] INFO: Start handlers complete.
[2018-01-09T23:09:00+00:00] INFO: HTTP Request Returned 404 Not Found: Object not found:
resolving cookbooks for run list: ["cfncluster::sge_config"]
[2018-01-09T23:09:01+00:00] INFO: Loading cookbooks [cfncluster@1.4.0, build-essential@8.0.4, poise-python@1.6.0, tar@2.0.0, selinux@2.0.3, nfs@2.4.1, sysctl@0.10.2, yum@5.0.1, yum-epel@2.1.2, openssh@2.4.1, apt@6.1.4, hostname@0.4.2, line@0.6.3, seven_zip@2.0.2, mingw@2.0.1, poise@2.8.1, poise-languages@2.1.1, ohai@5.2.0, compat_resource@12.19.0, iptables@4.3.1, hostsfile@3.0.1, windows@3.4.3, poise-archive@1.5.0]
Synchronizing Cookbooks:

[2018-01-09T23:09:26+00:00] INFO: append_if_no_line[export /home/ebs] sending run action to execute[exportfs] (immediate)

[2018-01-09T23:09:26+00:00] INFO: execute[exportfs] ran successfully

[2018-01-09T23:09:26+00:00] INFO: append_if_no_line[export /home] sending run action to execute[exportfs] (immediate)

[2018-01-09T23:09:26+00:00] INFO: execute[exportfs] ran successfully

[2018-01-09T23:09:30+00:00] INFO: append_if_no_line[export /opt/sge] sending run action to execute[exportfs] (immediate)

[2018-01-09T23:09:30+00:00] INFO: execute[exportfs] ran successfully

Running handlers:
[2018-01-09T23:09:37+00:00] INFO: Running report handlers
Running handlers complete
[2018-01-09T23:09:37+00:00] INFO: Report handlers complete

Deprecated features used!
Cloning resource attributes for directory[/home/ebs] from prior resource
Previous directory[/home/ebs]: /etc/chef/local-mode-cache/cache/cookbooks/cfncluster/recipes/_master_base_config.rb:54:in `from_file'
Current directory[/home/ebs]: /etc/chef/local-mode-cache/cache/cookbooks/cfncluster/recipes/_master_base_config.rb:72:in `from_file' at 1 location:

Chef Client finished, 62/190 resources updated in 38 seconds

2018-01-09 23:09:37,681 [DEBUG] No services specified
2018-01-09 23:09:37,682 [INFO] Running config shellRunPostInstall
2018-01-09 23:09:37,682 [DEBUG] No packages specified
2018-01-09 23:09:37,682 [DEBUG] No groups specified
2018-01-09 23:09:37,682 [DEBUG] No users specified
2018-01-09 23:09:37,682 [DEBUG] No sources specified
2018-01-09 23:09:37,683 [DEBUG] No files specified
2018-01-09 23:09:37,683 [DEBUG] Running command runpostinstall
2018-01-09 23:09:37,683 [DEBUG] No test for command runpostinstall
2018-01-09 23:09:37,688 [INFO] Command runpostinstall succeeded
2018-01-09 23:09:37,688 [DEBUG] Command runpostinstall output:
2018-01-09 23:09:37,688 [DEBUG] No services specified
2018-01-09 23:09:37,689 [INFO] Running config shellForkClusterReadyInstall
2018-01-09 23:09:37,690 [DEBUG] No packages specified
2018-01-09 23:09:37,690 [DEBUG] No groups specified
2018-01-09 23:09:37,690 [DEBUG] No users specified
2018-01-09 23:09:37,690 [DEBUG] No sources specified
2018-01-09 23:09:37,690 [DEBUG] No files specified
2018-01-09 23:09:37,690 [DEBUG] Running command clusterreadyinstall
2018-01-09 23:09:37,690 [DEBUG] No test for command clusterreadyinstall
2018-01-09 23:09:37,695 [INFO] Command clusterreadyinstall succeeded
2018-01-09 23:09:37,695 [DEBUG] Command clusterreadyinstall output: Unknown action. Exit gracefully

2018-01-09 23:09:37,696 [DEBUG] No services specified
2018-01-09 23:09:37,696 [INFO] ConfigSets completed
2018-01-09 23:09:37,696 [DEBUG] Not clearing reboot trigger as scheduling support is not available
2018-01-09 23:09:37,696 [INFO] -----------------------Build complete-----------------------
2018-01-09 23:09:37,872 [DEBUG] CloudFormation client initialized with endpoint https://cloudformation.eu-west-1.amazonaws.com
2018-01-09 23:09:37,872 [DEBUG] Signaling resource MasterServer in stack cfncluster-mycluster2 with unique ID i-0383682101f9a0ed3 and status SUCCESS
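A long cfn-init log like the one above is easier to check when reduced to the commands that succeeded and the NFS exports that Chef added. The following is a minimal sketch, not part of cfncluster: the function name `summarize_cfn_init` is hypothetical, and the two patterns assume only the line wording visible in this log (`Command <name> succeeded` and `append_if_no_line[export <path>]`). Failure lines are not matched because none appear here.

```python
import re

# "[INFO] Command <name> succeeded" is the only outcome wording visible in the log above.
SUCCEEDED_RE = re.compile(r"\[INFO\] Command (\S+) succeeded")
# Chef lines such as "append_if_no_line[export /home]" record an added NFS export.
EXPORT_RE = re.compile(r"append_if_no_line\[export ([^\]]+)\]")


def summarize_cfn_init(log_text):
    """Return the commands that succeeded and the exported paths, in log order."""
    return {
        "succeeded": SUCCEEDED_RE.findall(log_text),
        "exports": EXPORT_RE.findall(log_text),
    }


# Two entries copied from the log above, joined the way the pasted log runs together.
sample = (
    "2018-01-09 23:09:37,678 [INFO] Command chef succeeded "
    "[2018-01-09T23:09:26+00:00] INFO: append_if_no_line[export /home] "
    "sending run action to execute[exportfs] (immediate)"
)
print(summarize_cfn_init(sample))
```

Run against the portion of the log shown here, the export list would contain /home/ebs, /home and /opt/sge, which suggests the NFS exports were set up on the master.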

miker2746 commented 6 years ago

Also, every time I set shared_dir = /home in the config file, I can't ssh to the master node from my computer. I get this failure:

totoro@TOTORO:~$ ssh -i ~/aws/key-pair/cfncluster-keypair1.pem ubuntu@34.243.79.255
The authenticity of host '34.243.79.255 (34.243.79.255)' can't be established.
ECDSA key fingerprint is SHA256:ZRZJSLAX39zWddllC9mqW+gN5sDXfQD66eWlTCGswKM.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '34.243.79.255' (ECDSA) to the list of known hosts.
Permission denied (publickey,gssapi-keyex,gssapi-with-mic).

The config file looked like this:

[cluster testcluster1]
# Name of an existing EC2 KeyPair to enable SSH access to the instances.
key_name = cfncluster-keypair1
# Override path to cloudformation in S3
# (defaults to https://s3.amazonaws.com/cfncluster-<aws_region_name>/templates/cfncluster-<version>.cfn.json)
#template_url = https://s3.amazonaws.com/cfncluster-us-east-1/templates/cfncluster.cfn.json
# Cluster Server EC2 instance type
# (defaults to t2.micro for default template)
#compute_instance_type = t2.micro
# Master Server EC2 instance type
# (defaults to t2.micro for default template)
#master_instance_type = t2.micro
# Initial number of EC2 instances to launch as compute nodes in the cluster.
# (defaults to 2 for default template)
initial_queue_size = 1
# Maximum number of EC2 instances that can be launched in the cluster.
# (defaults to 10 for the default template)
max_queue_size = 2
# Boolean flag to set autoscaling group to maintain initial size and scale back
# (defaults to false for the default template)
#maintain_initial_size = false
# Cluster scheduler
# (defaults to sge for the default template)
#scheduler = sge
# Type of cluster to launch i.e. ondemand or spot
# (defaults to ondemand for the default template)
#cluster_type = ondemand
# Spot price for the ComputeFleet
#spot_price = 0.00

# ID of a Custom AMI, to use instead of published AMI's
# must find the available AMI
# AMI Name: cfncluster-1.3.0-ubuntu-1604-lts-hvm-201608251414
#custom_ami = ami-406e1f33
#custom_ami = ami-ff8d1886
#custom_ami = ami-96b025ef
#custom_ami = ami-62fa6e1b

# cfncluster fds-image, no NFS
custom_ami = ami-898b1ff0

# cfncluster default ubuntu1604 image in eu-west-1
#custom_ami = ami-9802b1e1

# Specify S3 resource which cfncluster nodes will be granted read-only access
# (defaults to NONE for the default template)
#s3_read_resource = arn:aws:s3:::cfncluster1-s3
# Specify S3 resource which cfncluster nodes will be granted read-write access
# (defaults to NONE for the default template)
#s3_read_write_resource = arn:aws:s3:::cfncluster1-s3
# URL to a preinstall script. This is executed before any of the boot_as_* scripts are run
# (defaults to NONE for the default template)
#pre_install = NONE
# Arguments to be passed to preinstall script
# (defaults to NONE for the default template)
#pre_install_args = NONE
# URL to a postinstall script. This is executed after any of the boot_as_* scripts are run
# (defaults to NONE for the default template)
#post_install = NONE
# Arguments to be passed to postinstall script
# (defaults to NONE for the default template)
#post_install_args = NONE
# HTTP(S) proxy server, typically http://x.x.x.x:8080
# (defaults to NONE for the default template)
#proxy_server = NONE
# Cluster placement group. This placement group must already exist.
# (defaults to NONE for the default template)
#placement_group = NONE
# Cluster placement logic. This enables the whole cluster or only compute to use the placement group
# (defaults to cluster in the default template)
#placement = cluster
# Path/mountpoint for ephemeral drives
# (defaults to /scratch in the default template)
#ephemeral_dir = /scratch

# Path/mountpoint for shared EBS volume
# (defaults to /shared in the default template)

#### If I set this to /home, the home directories of all nodes in the compute fleet
#### will be shared over NFS. This is plain NFS, not AWS EFS.
#shared_dir = /home/ubuntu/ebs
shared_dir = /home

# Encrypted ephemeral drives. In-memory keys, non-recoverable.
# (defaults to false in default template)
#encrypted_ephemeral = false
# MasterServer root volume size in GB. (AMI must support growroot)
# (defaults to 10 in default template)
#master_root_volume_size = 10
# ComputeFleet root volume size in GB. (AMI must support growroot)
# (defaults to 10 in default template)
#compute_root_volume_size = 10

# OS type used in the cluster
# (defaults to alinux in the default template)
#base_os = Ubuntu

# CloudWatch Logs region
# (defaults to NONE in the default template)
#cwl_region = NONE
# CloudWatch Logs Log Group name
# (defaults to NONE in the default template)
#cwl_log_group = NONE
# Existing EC2 IAM role to be associated with the EC2 instances
# (defaults to NONE in the default template)
#ec2_iam_role = NONE
# Extra JSON to be merged with the dna.json used by Chef
# (defaults to {} in the default template)
#extra_json = {}
# Additional CloudFormation template to launch with the cluster
#additional_cfn_template = NONE
# Settings section relating to VPC to be used
#vpc_settings = cfncluster-vpc-test1
#vpc_settings = mycluster1-vpc

#test vpc_settings
vpc_settings = mycluster2-vpc

# Settings section relating to EBS volume
#ebs_settings = fds-test-volume-2
# Settings section relating to scaling
#scaling_settings = custom

I have no idea what went wrong.

Hope you could help me out.

Best regards, Michael

rajachan commented 6 years ago

Michael - Sorry I did not respond to this sooner. This is related to the issue in https://github.com/awslabs/cfncluster/issues/322. By mounting the shared directory on /home via the shared_dir config, you hide the contents of the master node's default /home, which lives on the Master Server's primary EBS volume. As a result, the public key of the keypair you intended to use to SSH into the master, which is stored under the default user's home directory, is no longer available for SSH authentication. I am going to close this issue because it is covered by https://github.com/awslabs/cfncluster/issues/322, and to avoid having two separate threads for the same discussion.
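The condition described above can be checked before launching a cluster. Below is a minimal sketch, not part of cfncluster: the function name `shared_dir_shadows_home` is hypothetical, the default login home `/home/ubuntu` is assumed from the `ubuntu@` login used earlier in this thread, and the `/shared` fallback is taken from the comment in the config file. It reports whether `shared_dir` is the login user's home directory or one of its parent directories, which is the case where the mount would hide `~/.ssh`.

```python
import configparser
import posixpath


def shared_dir_shadows_home(config_text, cluster_section, user_home="/home/ubuntu"):
    """Return True if shared_dir is user_home or one of its parent directories."""
    # Interpolation is disabled so a literal "%" in the config cannot raise an error.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    # "/shared" is the documented default when shared_dir is not set.
    shared_dir = parser.get(cluster_section, "shared_dir", fallback="/shared")
    shared = posixpath.normpath(shared_dir)
    home = posixpath.normpath(user_home)
    # A mount on /home (or on /) covers /home/ubuntu; a mount below it does not.
    return home == shared or home.startswith(shared.rstrip("/") + "/")


example = """
[cluster testcluster1]
key_name = cfncluster-keypair1
#shared_dir = /home/ubuntu/ebs
shared_dir = /home
"""
print(shared_dir_shadows_home(example, "cluster testcluster1"))
```

With the commented-out value `/home/ubuntu/ebs` instead, the check returns False, because that mount point sits below the home directory and leaves `~/.ssh` visible.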