aws / aws-parallelcluster

AWS ParallelCluster is an AWS supported Open Source cluster management tool to deploy and manage HPC clusters in the AWS cloud.
https://github.com/aws/aws-parallelcluster
Apache License 2.0
818 stars 309 forks

Rocky 9.4 bootstrapping bug #6275

Open gbts opened 1 month ago

gbts commented 1 month ago

There's a small issue preventing the head node from booting on the new Rocky 9.4 AMIs. The failure occurs in this resource:

cookbooks/aws-parallelcluster-environment/resources/system_authentication/system_authentication_rocky8.rb

authselect select sssd with-mkhomedir fails with the following error:

[error] File [/etc/pam.d/system-auth] exists but it needs to be overwritten!
[error] File [/etc/pam.d/password-auth] exists but it needs to be overwritten!
[error] File [/etc/pam.d/fingerprint-auth] exists but it needs to be overwritten!
[error] File [/etc/pam.d/smartcard-auth] exists but it needs to be overwritten!
[error] File [/etc/pam.d/postlogin] exists but it needs to be overwritten!
[error] File [/etc/nsswitch.conf] exists but it needs to be overwritten!
[error] File that needs to be overwritten was found
[error] Refusing to activate profile unless this file is removed or overwrite is requested.

Some unexpected changes to the configuration were detected.
Use --force parameter if you want to overwrite these changes.

As the error message suggests, adding the --force parameter fixes it, although I'm not sure whether there are any side effects. I'm seeing a similar bug report on RHEL's issue tracker, so this possibly affects RHEL 9.4 too.
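
For illustration, the workaround can be sketched as a small shell wrapper that tries the plain command first and retries with --force only if that fails. This is not the cookbook's actual code: the function name select_sssd_profile and the AUTHSELECT override variable are hypothetical, introduced only so the logic can be exercised without touching a real system. Whether forcing the overwrite is safe for a given AMI is still an open question.

```shell
# Hypothetical wrapper, not taken from the ParallelCluster cookbook.
# AUTHSELECT is an assumed override hook; it defaults to the real binary.
AUTHSELECT="${AUTHSELECT:-authselect}"

select_sssd_profile() {
    # First try the unforced invocation that fails in the report above.
    if "$AUTHSELECT" select sssd with-mkhomedir; then
        return 0
    fi
    # Retry with --force, which lets authselect overwrite the existing
    # PAM and nsswitch files it complained about.
    echo "authselect refused to overwrite existing files; retrying with --force" >&2
    "$AUTHSELECT" select sssd with-mkhomedir --force
}
```

Calling select_sssd_profile as root would then run at most two authselect invocations, and the function's exit status is that of the last one.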

himani2411 commented 1 week ago

Hi @gbts,

Sorry for the late reply.

Is this the report that you mentioned? https://forums.rockylinux.org/t/changed-permissions-on-etc-in-rl9-4-genericcloud-image/14449/3

If not, can you provide the link to the report on that tracker?

Also, can we get the AMI ID that you are using so that I can replicate the issue?

Thanks