aws / aws-parallelcluster

AWS ParallelCluster is an AWS supported Open Source cluster management tool to deploy and manage HPC clusters in the AWS cloud.
https://github.com/aws/aws-parallelcluster
Apache License 2.0
816 stars 309 forks

Rocky 8 update from 8.9 to 8.10 seems to break pcluster build-image #6293

Closed zeekus closed 2 days ago

zeekus commented 3 weeks ago

Required Info:

Bug description and how to reproduce:

cat /etc/rocky-release
Rocky Linux release 8.9 (Green Obsidian)
[rocky@ip-10-148-64-97 ~]$ cat /etc/rocky-release
rocky-release           rocky-release-upstream
[rocky@ip-10-148-64-97 ~]$ cat /etc/os-release
NAME="Rocky Linux"
VERSION="8.9 (Green Obsidian)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="8.9"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Rocky Linux 8.9 (Green Obsidian)"
ANSI_COLOR="0;32"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:rocky:rocky:8:GA"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
SUPPORT_END="2029-05-31"
ROCKY_SUPPORT_PRODUCT="Rocky-Linux-8"
ROCKY_SUPPORT_PRODUCT_VERSION="8.9"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.9"
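
For anyone scripting around this, a minimal sketch of how an image pipeline could detect the problematic release from the `os-release` fields shown above. The helper names (`parse_os_release`, `version_tuple`, `is_affected`) are hypothetical, and the rule "Rocky 8 at minor release 10 or later is affected" is an assumption taken from the issue title, not something confirmed by the maintainers. Note that the version has to be compared as a tuple of integers: compared as strings, "8.10" sorts before "8.9".

```python
def parse_os_release(text: str) -> dict:
    """Parse KEY=value lines in the os-release format into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in double or single quotes.
        fields[key] = value.strip().strip("\"'")
    return fields


def version_tuple(version_id: str) -> tuple:
    """Turn '8.10' into (8, 10) so it compares correctly against (8, 9)."""
    return tuple(int(part) for part in version_id.split("."))


def is_affected(fields: dict) -> bool:
    """Assumption (from the issue title): Rocky 8.10 and later 8.x are affected."""
    if fields.get("ID") != "rocky":
        return False
    version = version_tuple(fields.get("VERSION_ID", "0"))
    return (8, 10) <= version < (9,)


# Sample trimmed from the output above, plus the same fields after the update.
before = 'ID="rocky"\nVERSION_ID="8.9"\n'
after = 'ID="rocky"\nVERSION_ID="8.10"\n'
print(is_affected(parse_os_release(before)))  # False
print(is_affected(parse_os_release(after)))   # True
```

On a real host the input text would come from reading `/etc/os-release`.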
himani2411 commented 1 week ago

Hi zeekus,

Thank you for identifying the bug.

We are working on a fix. Are you currently blocked by this? If so, we can try to find a workaround for you.

zeekus commented 1 week ago

Thanks. The workaround is simply not to run updates. It is not the most secure approach, but it works.

rmarable-flaretx commented 1 week ago

Is there an ETA on a fix?

himani2411 commented 1 week ago

Hi rmarable-flaretx and zeekus,

We don't have an ETA for the next release.

As a workaround for now:

  1. Clone the ParallelCluster v3.9.2 cookbook.
  2. Make changes in cookbooks/aws-parallelcluster-environment/resources/lustre/lustre_redhat8.rb and cookbooks/aws-parallelcluster-environment/resources/lustre/lustre_rocky8.rb as per the 2 PRs below.
  3. Push the changes to your cloned cookbook repo, or upload the cookbook to your S3 bucket.
  4. Then use one of the settings below in your image build config.

    For a cookbook repo:

        DevSettings:
          Cookbook:
            ChefCookbook: https://github.com/<UserName>/aws-parallelcluster-cookbook/tarball/<BranchName>

    For S3:

        DevSettings:
          Cookbook:
            ChefCookbook: s3:////aws-parallelcluster-cookbook.tgz
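
If the image build config is generated by a script, the `DevSettings` block from step 4 can be rendered the same way for either source. This is only a sketch: `dev_settings_fragment` is a hypothetical helper, and `example-user` and `lustre-fix` are placeholder names, not a real fork or branch. The only structure it relies on is the `DevSettings` / `Cookbook` / `ChefCookbook` nesting given above.

```python
def dev_settings_fragment(cookbook_url: str) -> str:
    """Render the DevSettings block that points build-image at a custom cookbook."""
    # The workaround above uses either a GitHub tarball URL or an S3 URL.
    if not cookbook_url.startswith(("https://", "s3://")):
        raise ValueError(f"expected an https:// or s3:// URL, got {cookbook_url!r}")
    return (
        "DevSettings:\n"
        "  Cookbook:\n"
        f"    ChefCookbook: {cookbook_url}\n"
    )


# Placeholder user and branch names, for illustration only.
print(dev_settings_fragment(
    "https://github.com/example-user/aws-parallelcluster-cookbook/tarball/lustre-fix"
))
```

The rendered text is meant to be appended to the rest of the image build config file; the check on the URL scheme is just a guard against typos, not a validation that ParallelCluster itself performs.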

hanwen-pcluste commented 2 days ago

This issue has been fixed in 3.10.0, which was released last Thursday. Thank you for reporting the issue to us!