aws / aws-parallelcluster

AWS ParallelCluster is an AWS supported Open Source cluster management tool to deploy and manage HPC clusters in the AWS cloud.
https://github.com/aws/aws-parallelcluster
Apache License 2.0

Rocky 8 update from 8.9 to 8.10 seems to break pcluster build-image #6293

Closed: zeekus closed this issue 4 months ago

zeekus commented 4 months ago

Required Info:

Bug description and how to reproduce:

cat /etc/rocky-release
Rocky Linux release 8.9 (Green Obsidian)
[rocky@ip-10-148-64-97 ~]$ cat /etc/rocky-release
rocky-release           rocky-release-upstream
[rocky@ip-10-148-64-97 ~]$ cat /etc/os-release
NAME="Rocky Linux"
VERSION="8.9 (Green Obsidian)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="8.9"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Rocky Linux 8.9 (Green Obsidian)"
ANSI_COLOR="0;32"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:rocky:rocky:8:GA"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
SUPPORT_END="2029-05-31"
ROCKY_SUPPORT_PRODUCT="Rocky-Linux-8"
ROCKY_SUPPORT_PRODUCT_VERSION="8.9"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.9"

himani2411 commented 4 months ago

Hi zeekus,

Thank you for identifying the bug.

We are working on the bug fix. Are you currently blocked by this? If so, we can try to find a workaround for you.

zeekus commented 4 months ago

Thanks. The workaround is simply not to run updates. It is not the most secure approach, but it works.

rmarable-flaretx commented 4 months ago

Is there an ETA on a fix?

himani2411 commented 4 months ago

Hi rmarable-flaretx and zeekus,

We don't have an ETA for the next release.

As a workaround for now:

  1. You can clone the ParallelCluster v3.9.2 cookbook.
  2. Make changes in cookbooks/aws-parallelcluster-environment/resources/lustre/lustre_redhat8.rb and cookbooks/aws-parallelcluster-environment/resources/lustre/lustre_rocky8.rb as per the 2 PRs below.
  3. Push the changes to your cloned cookbook repo, or upload the cookbook to your S3 bucket.
  4. Then use the setting below in your image build config.
    
    For a cookbook repo:

    DevSettings:
      Cookbook:
        ChefCookbook: https://github.com/<UserName>/aws-parallelcluster-cookbook/tarball/<BranchName>

    For S3:

    DevSettings:
      Cookbook:
        ChefCookbook: s3:////aws-parallelcluster-cookbook.tgz
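
For anyone scripting step 4, here is a minimal Python sketch that builds the GitHub tarball URL and renders the DevSettings block. It assumes the nesting is DevSettings, then Cookbook, then ChefCookbook, as the setting above suggests. The function names and the example user and branch are hypothetical and are not part of ParallelCluster.

```python
def github_tarball_url(user: str, branch: str) -> str:
    """Build the tarball URL for a fork of the cookbook repo.

    Follows the URL pattern quoted in the workaround; `user` and `branch`
    identify your own fork and the branch holding the Lustre changes.
    """
    return f"https://github.com/{user}/aws-parallelcluster-cookbook/tarball/{branch}"


def dev_settings_fragment(chef_cookbook: str) -> str:
    """Render the DevSettings block to paste into the image build config.

    `chef_cookbook` can be a GitHub tarball URL or an s3:// URL pointing
    at an uploaded cookbook archive.
    """
    return (
        "DevSettings:\n"
        "  Cookbook:\n"
        f"    ChefCookbook: {chef_cookbook}\n"
    )


if __name__ == "__main__":
    # "my-user" and "rocky8-lustre-fix" are placeholders, not real names.
    url = github_tarball_url("my-user", "rocky8-lustre-fix")
    print(dev_settings_fragment(url), end="")
```

The printed block goes at the top level of the image build config, alongside the existing sections.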

hanwen-pcluste commented 4 months ago

This issue has been fixed in 3.10.0, which was released last Thursday. Thank you for reporting the issue to us!