aws-samples / aws-parallelcluster-monitoring

Monitoring Dashboard for AWS ParallelCluster
MIT No Attribution

Modifications for docker installation #7

Closed coderodyhpc closed 11 months ago

coderodyhpc commented 3 years ago

The Docker installation is modified to accommodate CentOS 8.
- It does not differentiate between CentOS 7 and CentOS 8, as it uses cfn_cluster_user.
- I cannot test from my forked version, so I'm unsure whether the Docker installation needs further refining.

coderodyhpc commented 3 years ago

Nicola, I couldn't test the PR itself because trying to launch the cluster with the post-script from a cloned version produces other errors:

Cluster creation failed.  Failed events:
  - AWS::CloudFormation::Stack MasterServerSubstack Embedded stack arn:aws:cloudformation:us-east-1:815706811166:stack/parallelcluster-w1cluster-MasterServerSubstack-1DPUMLMRTIPID/f0234560-77c8-11eb-bbe3-0a4789a43cff was not successfully created: The following resource(s) failed to create: [MasterServer].
    - AWS::CloudFormation::Stack parallelcluster-w1cluster-MasterServerSubstack-1DPUMLMRTIPID The following resource(s) failed to create: [MasterServer].
    - AWS::EC2::Instance MasterServer Received FAILURE signal with UniqueId i-0503c81da677abb80

It was too late and I didn't get a chance to go over the chef logs, which I'll try to do later this morning. The only testing that I could complete was installing Docker on a single instance running CentOS 8, which works fine. I doubt that it will work with CentOS 7 (I'm not sure whether DNF is available there or which features it has). A potential solution might be to use the OS rather than cfn_cluster_user as the selection criterion. I'll try to make some progress, but the first step is being able to launch with a cloned post-script. Best, Arturo
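A minimal sketch of the OS-based selection mentioned above, assuming the node exposes the standard /etc/os-release file. It is not the repository's actual post-install logic. The helper names `parse_os_release` and `pick_package_manager` are hypothetical, and mapping CentOS 7 to yum and CentOS 8 to dnf is an assumption based on each release's default package manager.

```python
def parse_os_release(text):
    """Parse the KEY=value lines of an os-release file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in double or single quotes.
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def pick_package_manager(fields):
    """Return 'dnf' or 'yum' for CentOS 8 / CentOS 7, else None.

    Hypothetical helper: the branch is chosen from the OS identity
    instead of from cfn_cluster_user.
    """
    os_id = fields.get("ID", "")
    major = fields.get("VERSION_ID", "").split(".")[0]
    if os_id == "centos" and major == "8":
        return "dnf"
    if os_id == "centos" and major == "7":
        return "yum"
    return None  # unknown OS: let the caller decide how to proceed


if __name__ == "__main__":
    with open("/etc/os-release") as handle:
        print(pick_package_manager(parse_os_release(handle.read())))
```

Returning None for an unrecognized OS leaves the fallback to the caller, so another distribution is not silently routed through the CentOS branch.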

afernandezody commented 3 years ago

The patch for CentOS 8 installs Docker, and the previous error (shown in the log) has been fixed. However, ParallelCluster keeps launching and terminating instances. Unlike earlier, the logs don't show anything that stands out, so I'll need to dig a bit deeper.

afernandezody commented 3 years ago

It should be ready.