aws-samples / aws-eda-slurm-cluster

AWS Slurm Cluster for EDA Workloads
MIT No Attribution

HeadNode fails to configure due to ansible change. on_head_node_configured.sh fails as ansible has deprecated ansible.builtin.include #238

Closed gwolski closed 4 months ago

gwolski commented 5 months ago

Using the latest version of aws-eda-slurm-cluster as of May 15.

The head node fails to configure, and cloud-init fails. To reproduce, deploy a new cluster with the latest version of Ansible.

So far I have tracked the failure down to the shell script on_head_node_configured.sh. The error message is:

<13>Jun 10 16:48:18 on_head_node_configured.sh: TASK [all : Create /var/lib/cloud/scripts/per-boot/90_mount_ssds.bash] *********
<13>Jun 10 16:48:18 on_head_node_configured.sh: ok: [local]
<13>Jun 10 16:48:18 on_head_node_configured.sh:
<13>Jun 10 16:48:18 on_head_node_configured.sh: TASK [all : Execute /var/lib/cloud/scripts/per-boot/90_mount_ssds.bash] ********
<13>Jun 10 16:48:18 on_head_node_configured.sh: ok: [local]
<13>Jun 10 16:48:18 on_head_node_configured.sh:
<13>Jun 10 16:48:18 on_head_node_configured.sh: TASK [all : Give /tmp write permissions] ***************************************
<13>Jun 10 16:48:18 on_head_node_configured.sh: ok: [local]
<13>Jun 10 16:48:18 on_head_node_configured.sh: ERROR! [DEPRECATED]: ansible.builtin.include has been removed. Use include_tasks or import_tasks instead. This feature was removed from ansible-core in a release after 2023-05-16. Please update your playbooks.
<13>Jun 10 16:48:18 on_head_node_configured.sh: + on_exit
<13>Jun 10 16:48:18 on_head_node_configured.sh: + rc=1
<13>Jun 10 16:48:18 on_head_node_configured.sh: + set +e
<13>Jun 10 16:48:18 on_head_node_configured.sh: + [[ 1 -ne 0 ]]
<13>Jun 10 16:48:18 on_head_node_configured.sh: + [[ : != \: ]]

include is used in /opt/slurm/config/ansible/playbooks/roles/ParallelClusterHeadNode/tasks/main.yml

gwolski commented 5 months ago

Changing tasks/main.yml to call ansible.builtin.import_tasks allows on_head_node_configured.sh to run. I do not know enough about the replacement actions to be 100% sure this is the best or correct choice. Here is the updated tasks/main.yml:


Will attempt to build a new cluster.

gwolski commented 4 months ago

The new cluster is working and the build completed cleanly. Please do not trust my selection of ansible import_tasks as the replacement for the original "include" command.
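
For anyone weighing the two replacements named in the error message, here is a minimal sketch of how each one looks in a task file. It is not the contents of the repository's tasks/main.yml, and the file name configure_example.yml is a hypothetical placeholder. The comments describe the general Ansible distinction between static imports and dynamic includes. Which one suits this role depends on how the included files are used, which has not been checked here.

```yaml
# Hypothetical excerpt of a role's tasks/main.yml.
# configure_example.yml is an illustrative name, not a file from this repository.

# Before: the bare "include" action that newer ansible-core rejects.
# - name: Run example tasks
#   include: configure_example.yml

# Option 1: static import, resolved when the playbook is parsed.
# The file name must be known at parse time, and loops are not supported.
- name: Run example tasks (static import)
  ansible.builtin.import_tasks: configure_example.yml

# Option 2: dynamic include, resolved when execution reaches this task.
# This form supports loops and file names built from run-time variables.
- name: Run example tasks (dynamic include)
  ansible.builtin.include_tasks: configure_example.yml
```

A real task file would use only one of the two options for each included file. The static form is the change reported to work in this thread. The dynamic form may be the closer match if an original include relied on a loop or on variables set earlier in the play.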